Using pandas to read a downloaded HTML file
I think you are on the right track by using an HTML parser like Beautiful Soup. pandas.read_html() extracts HTML tables; it does not read an HTML page as a whole.
You would want to do something like this...
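from io import StringIO
from bs4 import BeautifulSoup
import pandas as pd

# Parse the saved page and grab the first <table> element
with open('C:/age0.html', 'r') as f:
    table = BeautifulSoup(f.read(), 'html.parser').find('table')

# read_html does not accept a BeautifulSoup object, so pass str(table) as input;
# it returns a list of DataFrames, so take the first one
df = pd.read_html(StringIO(str(table)))[0]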
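To try the same idea without the original file, here is a minimal sketch that uses an in-memory HTML string as a hypothetical stand-in for the downloaded page (the sample markup and column names are made up for illustration). It assumes bs4 and one of the parsers pandas relies on (lxml, or html5lib) are installed.

```python
from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the contents of the downloaded file.
html = """
<html><body>
<p>Some text before the table.</p>
<table>
  <tr><th>name</th><th>age</th></tr>
  <tr><td>Ann</td><td>31</td></tr>
  <tr><td>Bob</td><td>45</td></tr>
</table>
</body></html>
"""

# Locate the first <table> element in the page.
table = BeautifulSoup(html, "html.parser").find("table")

# Convert the tag back to markup and wrap it in StringIO, since read_html
# expects a path, URL, or file-like object rather than a BeautifulSoup tag.
tables = pd.read_html(StringIO(str(table)))

# read_html returns a list of DataFrames, one per table it finds.
df = tables[0]
print(df)
```

If the page has several tables, find('table') only returns the first one; find_all('table') or a more specific selector would be needed to pick another.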
First, install the packages below, which are needed for parsing:
- pip install BeautifulSoup4
- pip install lxml
- pip install html5lib
Then use read_html to read the HTML tables on any HTML page. It returns a list of DataFrames, one per table found:
import pandas as pds
pds_df = pds.read_html('C:/age0.html')
pds_df[0]
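When the page holds more than one table, indexing with [0] may not give the one you want. The sketch below writes a hypothetical two-table page to a temporary file (the file name and contents are assumptions for illustration), reads it back, and uses read_html's match argument to keep only tables whose text matches a pattern.

```python
import os
import tempfile

import pandas as pd

# Hypothetical page with two tables, standing in for a downloaded file.
html = """
<html><body>
<table>
  <tr><th>name</th><th>age</th></tr>
  <tr><td>Ann</td><td>31</td></tr>
</table>
<table>
  <tr><th>item</th><th>price</th></tr>
  <tr><td>pen</td><td>1.5</td></tr>
</table>
</body></html>
"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "page.html")
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)

    # Every <table> on the page becomes one DataFrame in the list.
    all_tables = pd.read_html(path)

    # match keeps only the tables containing text that matches the regex.
    price_tables = pd.read_html(path, match="price")

print(len(all_tables))
print(price_tables[0])
```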
I hope this will help.
Good Luck!!