jQuery-like HTML parsing in Python?

python jquery css-selectors html-parsing

If you are already comfortable with BeautifulSoup, you could just add soupselect to your libraries.
soupselect is a CSS selector extension for BeautifulSoup.

Usage:

    from bs4 import BeautifulSoup as Soup
    from soupselect import select
    import urllib

    soup = Soup(urllib.urlopen('http://slashdot.org/'))
    select(soup, 'div.title h3')

    [<h3><span><a href='//science.slashdot.org/'>Science</a>:</span></h3>,
     <h3><a href='//slashdot.org/articles/07/02/28/0120220.shtml'>Star Trek</h3>,
     ..]
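BeautifulSoup 4 also ships its own select() method, so a similar query may not need soupselect at all. The following is a minimal sketch under that assumption; the inline HTML string and its example URLs are made up for illustration and stand in for a fetched page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used instead of a live page, so the example runs offline.
HTML = """
<div class="title"><h3><a href="//science.example/">Science</a></h3></div>
<div class="title"><h3><a href="//tech.example/">Technology</a></h3></div>
<div class="other"><h3>Ignored</h3></div>
"""

# Use the parser from the standard library to avoid extra dependencies.
soup = BeautifulSoup(HTML, "html.parser")

# select() takes a CSS selector and returns a list of matching tags.
headings = soup.select("div.title h3")

# Extract the visible text of each matched heading.
titles = [h3.get_text(strip=True) for h3 in headings]
print(titles)
```

The selector string is the same one passed to soupselect's select() above; only the call style differs (a method on the soup object rather than a free function).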


Consider PyQuery:

http://packages.python.org/pyquery/

    >>> from pyquery import PyQuery as pq
    >>> from lxml import etree
    >>> import urllib
    >>> d = pq("<html></html>")
    >>> d = pq(etree.fromstring("<html></html>"))
    >>> d = pq(url='http://google.com/')
    >>> d = pq(url='http://google.com/', opener=lambda url: urllib.urlopen(url).read())
    >>> d = pq(filename=path_to_html_file)
    >>> d("#hello")
    [<p#hello.hello>]
    >>> p = d("#hello")
    >>> p.html()
    'Hello world !'
    >>> p.html("you know <a href='http://python.org/'>Python</a> rocks")
    [<p#hello.hello>]
    >>> p.html()
    u'you know <a href="http://python.org/">Python</a> rocks'
    >>> p.text()
    'you know Python rocks'


The lxml library supports CSS selectors through its lxml.cssselect module; elements parsed with lxml.html also expose a cssselect() method.
