Get text content of an HTML element using XPath?

You want to select all descendant text, not just child text:

//div[a[contains(., "Add to cart")]]/p//text()

Note the double slash between p and text() there.
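To make the child/descendant distinction concrete, here is a minimal sketch using the standard library's xml.etree.ElementTree instead of lxml (an assumption made only so it runs without extra packages; ElementTree's XPath subset has no text(), so the two selections are emulated with .text/.tail and itertext()). The first paragraph from the example markup is reused as input.

```python
import xml.etree.ElementTree as ET

p = ET.fromstring('<p><span class="abc">Monitor</span> <b>$300</b></p>')

# Roughly what p/text() selects: text nodes that are direct children of <p>,
# i.e. the text before the first child plus the text trailing each child.
child_text = [t for t in [p.text] + [child.tail for child in p] if t]

# Roughly what p//text() selects: every text node under <p>, in document order.
descendant_text = list(p.itertext())

print(child_text)       # [' ']
print(descendant_text)  # ['Monitor', ' ', '$300']
```

With a single slash, only the lone space between the two tags is returned; the product name and price live inside child elements and are reached only through the descendant axis.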

This may also pick up a lot of inter-tag whitespace, though, so you'll need to clean that up. Example using lxml:

>>> import lxml.etree as ET
>>> tree = ET.fromstring('''<div>
... <div>
...     <p>
...     <span class="abc">Monitor</span> <b>$300</b>
...     </p>
...     <a href="/add">Add to cart</a>
... </div>
... <div>
...     <p>
...     <span class="abc">Keyboard</span> $20 
...     </p>
...     <a href="/add">Add to cart</a>
... </div>
... </div>''')
>>> tree.xpath('//div[a[contains(., "Add to cart")]]/p//text()')
['\n    ', 'Monitor', ' ', '$300', '\n    ', '\n    ', 'Keyboard', ' $20 \n    ']
>>> res = _
>>> [txt for txt in (txt.strip() for txt in res) if txt]
['Monitor', '$300', 'Keyboard', '$20']
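If you would rather get one cleaned string per paragraph than a flat list of fragments, something along these lines may work. It is only a sketch: it swaps lxml for the standard library's xml.etree.ElementTree, whose limited XPath subset has no contains(), so the predicate below matches the link text exactly. The helper name paragraph_texts is made up for illustration.

```python
import xml.etree.ElementTree as ET

markup = """<div>
<div>
    <p>
    <span class="abc">Monitor</span> <b>$300</b>
    </p>
    <a href="/add">Add to cart</a>
</div>
<div>
    <p>
    <span class="abc">Keyboard</span> $20
    </p>
    <a href="/add">Add to cart</a>
</div>
</div>"""


def paragraph_texts(root):
    """Hypothetical helper: one whitespace-normalized string per matching <p>."""
    # ElementTree supports [tag='text'] (exact match), not contains().
    paragraphs = root.findall(".//div[a='Add to cart']/p")
    # Join all descendant text of each <p>, then collapse runs of whitespace.
    return [" ".join("".join(p.itertext()).split()) for p in paragraphs]


tree = ET.fromstring(markup)
items = paragraph_texts(tree)
print(items)  # ['Monitor $300', 'Keyboard $20']
```

In lxml itself, evaluating the XPath function normalize-space(.) on each p element should give a similar per-paragraph string.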
