Extracting lxml xpath for html table


You are probably looking at the HTML in Firebug, correct? The browser inserts the implicit <tbody> tag when it is not present in the document, so the tree you inspect there differs from the source. The lxml library does not add <tbody>; it only processes the tags present in the raw HTML string.

Omit the tbody level in your XPath. For example, this works:

    >>> import lxml.html
    >>> tree = lxml.html.fromstring(raw_html)
    >>> tree.xpath('//table[@class="quotes"]/tr')
    [<Element tr at 1014206d0>, <Element tr at 101420738>, <Element tr at 1014207a0>]
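
To see the difference for yourself, here is a minimal self-contained sketch. The markup is made up for illustration (only the "quotes" class comes from the question), and it assumes the source HTML really has no <tbody>, as described above. The browser-style path with /tbody/ finds nothing, while the path without it returns the rows.

```python
import lxml.html

# Hypothetical sample markup: a table whose source contains no <tbody>.
raw_html = """
<html><body>
<table class="quotes">
  <tr><th>Symbol</th><th>Price</th></tr>
  <tr><td>AAA</td><td>10.5</td></tr>
  <tr><td>BBB</td><td>20.0</td></tr>
</table>
</body></html>
"""

tree = lxml.html.fromstring(raw_html)

# XPath as copied from a browser inspector, including the inserted <tbody>.
with_tbody = tree.xpath('//table[@class="quotes"]/tbody/tr')

# The same path with the tbody step omitted.
without_tbody = tree.xpath('//table[@class="quotes"]/tr')

# Collect the text of every header/data cell, row by row.
rows = [
    [cell.text_content().strip() for cell in tr.xpath('./th|./td')]
    for tr in without_tbody
]

print(len(with_tbody))     # 0 rows matched
print(len(without_tbody))  # 3 rows matched
print(rows)
```

If you are not sure whether the source includes <tbody>, '//table[@class="quotes"]//tr' matches the rows either way, because // selects descendants at any depth. Note that it would also pick up rows of any nested tables.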