Parsing XML in Python with regex
You normally don't want to use re.match here, because it only matches at the beginning of the string. Quoting from the docs:
If you want to locate a match anywhere in string, use search() instead (see also search() vs. match()).
Note:
>>> print re.match('>.*<', line)
None
>>> print re.search('>.*<', line)
<_sre.SRE_Match object at 0x10f666238>
>>> print re.search('>.*<', line).group(0)
>PLAINSBORO, NJ 08536-1906<
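For reference, here is roughly the same comparison as a small Python 3 script (the sessions in this answer use Python 2 print statements). It is only a sketch: the value of `line` is assumed to be the tag used elsewhere in this answer, and the capture group is an addition of mine to drop the surrounding angle brackets.

```python
import re

# Assumed input: the same tag used elsewhere in this answer.
line = '<City_State>PLAINSBORO, NJ 08536-1906</City_State>'

# re.match only tries at position 0, where the string starts with '<', not '>'.
at_start = re.match(r'>.*<', line)

# re.search scans the string for the first position where the pattern fits.
anywhere = re.search(r'>(.*)<', line)

print(at_start)            # None
print(anywhere.group(0))   # >PLAINSBORO, NJ 08536-1906<
print(anywhere.group(1))   # PLAINSBORO, NJ 08536-1906
```

Note that `.*` is greedy, so on a line containing several tags it would run to the last `<`, which is one more reason a regex is fragile for this job.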
Also, why parse XML with regex when you can use something like BeautifulSoup? :)
>>> from bs4 import BeautifulSoup as BS
>>> line='<City_State>PLAINSBORO, NJ 08536-1906</City_State>'
>>> soup = BS(line)
>>> print soup.find('city_state').text
PLAINSBORO, NJ 08536-1906
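If installing BeautifulSoup isn't an option, the standard library's xml.etree.ElementTree can probably do the same job for a well-formed snippet like this one. This is a swapped-in alternative rather than the approach shown above, and a minimal sketch in Python 3 that assumes `line` holds a single complete element.

```python
import xml.etree.ElementTree as ET

# Assumed input: one complete, well-formed element.
line = '<City_State>PLAINSBORO, NJ 08536-1906</City_State>'

# fromstring parses the snippet and returns its root element.
root = ET.fromstring(line)

print(root.tag)   # City_State
print(root.text)  # PLAINSBORO, NJ 08536-1906
```

Unlike the lowercase 'city_state' lookup above, tag names in ElementTree are case-sensitive, so a search such as root.find(...) needs the exact 'City_State' spelling.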