jQuery-like HTML parsing in Python?
If you are already comfortable with BeautifulSoup, you can simply add soupselect to your libraries.
soupselect is a CSS selector extension for BeautifulSoup.
Usage:
    from bs4 import BeautifulSoup as Soup
    from soupselect import select
    import urllib
    soup = Soup(urllib.urlopen('http://slashdot.org/'))
    select(soup, 'div.title h3')
    [<h3><span><a href='//science.slashdot.org/'>Science</a>:</span></h3>, <h3><a href='//slashdot.org/articles/07/02/28/0120220.shtml'>Star Trek</h3>, ..]
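As a possible alternative, BeautifulSoup 4 ships its own select() and select_one() methods that accept CSS selectors, so the same kind of query may work without an extra library. The snippet above uses Python 2's urllib.urlopen; the sketch below avoids the network entirely and parses an inline string instead. The sample markup is made up for illustration and is not the real Slashdot page.

```python
from bs4 import BeautifulSoup

# Made-up sample markup, loosely modelled on the 'div.title h3' query above.
html = """
<div class="title"><h3><a href="//science.slashdot.org/">Science</a></h3></div>
<div class="title"><h3><a href="//slashdot.org/">Star Trek</a></h3></div>
<div class="other"><h3>Ignored</h3></div>
"""

# html.parser is the parser from the standard library, so no lxml is needed.
soup = BeautifulSoup(html, "html.parser")

# select() returns a list of every tag matching the CSS selector.
headings = soup.select("div.title h3")
texts = [h.get_text(strip=True) for h in headings]

# select_one() returns only the first match (or None if nothing matches).
first_link = soup.select_one("div.title h3 a")["href"]

print(texts)
print(first_link)
```

Whether this covers everything soupselect does depends on the selectors you need, so treat it as a starting point rather than a drop-in replacement.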
Consider PyQuery:
http://packages.python.org/pyquery/
    >>> from pyquery import PyQuery as pq
    >>> from lxml import etree
    >>> import urllib
    >>> d = pq("<html></html>")
    >>> d = pq(etree.fromstring("<html></html>"))
    >>> d = pq(url='http://google.com/')
    >>> d = pq(url='http://google.com/', opener=lambda url: urllib.urlopen(url).read())
    >>> d = pq(filename=path_to_html_file)
    >>> d("#hello")
    [<p#hello.hello>]
    >>> p = d("#hello")
    >>> p.html()
    'Hello world !'
    >>> p.html("you know <a href='http://python.org/'>Python</a> rocks")
    [<p#hello.hello>]
    >>> p.html()
    u'you know <a href="http://python.org/">Python</a> rocks'
    >>> p.text()
    'you know Python rocks'