How to get BeautifulSoup 4 to respect a self-closing tag?
To parse XML you pass in “xml” as the second argument to the BeautifulSoup constructor.
soup = bs4.BeautifulSoup(S, 'xml')
You’ll need to have lxml installed.
You don't need to pass selfClosingTags
anymore:
In [1]: import bs4In [2]: S = '''<foo> <bar a="3"/> </foo>'''In [3]: soup = bs4.BeautifulSoup(S, 'xml')In [4]: print soup.prettify()<?xml version="1.0" encoding="utf-8"?><foo> <bar a="3"/></foo>