
Convert HTML table to CSV in Python


Using the csv module and selenium selectors would probably be more convenient here:

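    import csv

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    driver.get("http://example.com/")

    # The find_element_by_* helpers were removed in recent Selenium 4 releases;
    # find_element(By.CSS_SELECTOR, ...) is the current equivalent.
    table = driver.find_element(By.CSS_SELECTOR, "#tableid")

    with open('eggs.csv', 'w', newline='') as csvfile:
        wr = csv.writer(csvfile)
        for row in table.find_elements(By.CSS_SELECTOR, 'tr'):
            # one CSV row per <tr>, one field per <td>
            wr.writerow([d.text for d in row.find_elements(By.CSS_SELECTOR, 'td')])

    driver.quit()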


Without access to the table you're actually trying to scrape, I used this example:

    <table>
      <thead>
        <tr>
          <td>Header1</td>
          <td>Header2</td>
          <td>Header3</td>
        </tr>
      </thead>
      <tr>
        <td>Row 11</td>
        <td>Row 12</td>
        <td>Row 13</td>
      </tr>
      <tr>
        <td>Row 21</td>
        <td>Row 22</td>
        <td>Row 23</td>
      </tr>
      <tr>
        <td>Row 31</td>
        <td>Row 32</td>
        <td>Row 33</td>
      </tr>
    </table>

and scraped it using:

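    from bs4 import BeautifulSoup as BS

    content = "<table>...</table>"  # placeholder: the HTML of the table above, as a string
    soup = BS(content, 'html5lib')  # this parser requires the html5lib package
    # one list of <td> tags per <tr>
    rows = [tr.find_all('td') for tr in soup.find_all('tr')]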

This rows object is a list of lists:

    [
        [<td>Header1</td>, <td>Header2</td>, <td>Header3</td>],
        [<td>Row 11</td>, <td>Row 12</td>, <td>Row 13</td>],
        [<td>Row 21</td>, <td>Row 22</td>, <td>Row 23</td>],
        [<td>Row 31</td>, <td>Row 32</td>, <td>Row 33</td>]
    ]

...and you can write it to a file:

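    # 'a' appends, so rerunning the script adds the rows again
    with open('result.csv', 'a') as f:
        for it in rows:
            # strip the <td> tags from each cell and join the cells with ", "
            f.write(", ".join(str(e).replace('<td>', '').replace('</td>', '') for e in it) + '\n')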

which looks like this:

    Header1, Header2, Header3
    Row 11, Row 12, Row 13
    Row 21, Row 22, Row 23
    Row 31, Row 32, Row 33
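
Joining the cells with ", " works for this sample table, but it would produce a malformed row as soon as a cell contains a comma, a quote, or a newline. A minimal sketch of the same write step using csv.writer (the module the first answer suggests) might look like the following. It assumes the cell text has already been extracted into plain strings, for example with `e.get_text(strip=True)` on each BeautifulSoup cell. The `write_rows` helper, the sample rows, and the "Row 21, with comma" value are made up for illustration.

```python
import csv
import io


def write_rows(rows, fileobj):
    """Write a list of lists of strings to fileobj as CSV (hypothetical helper)."""
    writer = csv.writer(fileobj)
    writer.writerows(rows)


# Made-up cell text standing in for the text of each <td> in the table above.
rows = [
    ["Header1", "Header2", "Header3"],
    ["Row 11", "Row 12", "Row 13"],
    ["Row 21, with comma", "Row 22", "Row 23"],
]

# StringIO keeps the example self-contained; for a real file, use
# open('result.csv', 'w', newline='') instead.
buffer = io.StringIO()
write_rows(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The cell that contains a comma is quoted in the output, so a CSV reader still gets three fields back for that row.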