Convert XML to a Python dict
You can use the xmltodict module:
import xmltodict

message = """<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"""
print(xmltodict.parse(message)['note'])
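which, on Python 2, produces an OrderedDict (on Python 3 the u prefixes disappear, and recent xmltodict releases return a plain dict instead):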
OrderedDict([(u'to', u'Tove'), (u'from', u'Jani'), (u'heading', u'Reminder'), (u'body', u"Don't forget me this weekend!")])
which can be converted to dict if order doesn't matter:
print(dict(xmltodict.parse(message)['note']))
Prints:
{u'body': u"Don't forget me this weekend!", u'to': u'Tove', u'from': u'Jani', u'heading': u'Reminder'}
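Note that dict() only converts the outermost mapping, so nested elements would stay as OrderedDict. Here is a minimal sketch of a recursive conversion using only the standard library. The name to_plain_dict is a hypothetical helper, not part of xmltodict, and the sample data is hand-built to mimic a parsed document:

```python
from collections import OrderedDict


def to_plain_dict(value):
    """Recursively convert mappings (e.g. OrderedDict) into plain dicts."""
    if isinstance(value, dict):  # OrderedDict is a dict subclass
        return {key: to_plain_dict(item) for key, item in value.items()}
    if isinstance(value, list):  # repeated elements are stored as lists
        return [to_plain_dict(item) for item in value]
    return value


# Hand-built stand-in for a parsed document with one nested level.
parsed = OrderedDict([
    ("note", OrderedDict([("to", "Tove"), ("from", "Jani")])),
])

shallow = dict(parsed)         # only the top level becomes a dict
deep = to_plain_dict(parsed)   # every level becomes a dict

print(type(shallow["note"]).__name__)  # OrderedDict
print(type(deep["note"]).__name__)     # dict
```

If order doesn't matter at any level, the recursive version is probably what you want.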
You should check out https://github.com/martinblech/xmltodict
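I think it is one of the best XML-to-dict handlers I have seen.
However, I should warn you that XML and dict are not fully compatible data structures: XML has attributes, repeated sibling tags, and mixed text-and-element content, none of which map directly onto a dict.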
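To make the mismatch concrete, here is a small sketch that uses the standard library's xml.etree.ElementTree rather than xmltodict. The sample document and the naive_dict helper are made up for illustration: repeated tags collide on the same key, and attributes are dropped unless you handle them explicitly.

```python
import xml.etree.ElementTree as ET

# Made-up sample: one attribute and a repeated <to> tag.
xml_text = '<note id="7"><to>Tove</to><to>Ole</to><from>Jani</from></note>'
root = ET.fromstring(xml_text)


def naive_dict(element):
    """Map each child tag to its text; later duplicates overwrite earlier ones."""
    return {child.tag: child.text for child in element}


flat = naive_dict(root)
print(flat)         # {'to': 'Ole', 'from': 'Jani'}  (the first <to> is lost)
print(root.attrib)  # {'id': '7'}  (not represented in flat at all)
```

xmltodict works around this by turning repeated elements into lists and by prefixing attribute keys with @, so the result is still a dict but no longer a one-to-one mirror of the XML.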
You'd think that by now we'd have a good answer to this one, but apparently we don't. After reviewing half a dozen similar questions on Stack Overflow, here is what worked for me:
from lxml import etree
# arrow is an awesome lib for dealing with dates in python
import arrow

# converts an etree to dict, useful to convert xml to dict
def etree2dict(tree):
    root, contents = recursive_dict(tree)
    return {root: contents}

# returns a (tag, value) pair; elements marked type="array" become lists
def recursive_dict(element):
    if element.attrib and 'type' in element.attrib and element.attrib['type'] == "array":
        return element.tag, [(dict(map(recursive_dict, child)) or getElementValue(child)) for child in element]
    else:
        return element.tag, dict(map(recursive_dict, element)) or getElementValue(element)

# converts a leaf element's text according to its 'type' attribute
def getElementValue(element):
    if element.text:
        if element.attrib and 'type' in element.attrib:
            attr_type = element.attrib.get('type')
            if attr_type == 'integer':
                return int(element.text.strip())
            if attr_type == 'float':
                return float(element.text.strip())
            if attr_type == 'boolean':
                return element.text.lower().strip() == 'true'
            if attr_type == 'datetime':
                # arrow < 1.0 exposes timestamp as a property; on arrow >= 1.0 call .timestamp() instead
                return arrow.get(element.text.strip()).timestamp
        # no type attribute, or a type not handled above: return the raw text
        return element.text
    elif element.attrib:
        if 'nil' in element.attrib:
            return None
        else:
            return element.attrib
    else:
        return None
And this is how you use it:
from lxml import etree

message = """<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"""
tree = etree.fromstring(message)
print(etree2dict(tree))
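The note example does not exercise the type attributes that the functions above look for. Here is a condensed, standard-library-only sketch of the same idea. It uses xml.etree.ElementTree instead of lxml and skips the datetime case so that arrow is not needed. The sample document and the names convert and value_of are made up for illustration, and the leaf handling is simplified compared with the code above:

```python
import xml.etree.ElementTree as ET


def value_of(element):
    """Convert a leaf element's text according to its type attribute."""
    text = (element.text or "").strip()
    kind = element.attrib.get("type")
    if "nil" in element.attrib or not text:
        return None
    if kind == "integer":
        return int(text)
    if kind == "float":
        return float(text)
    if kind == "boolean":
        return text.lower() == "true"
    return text  # untyped or unrecognized type: keep the text


def convert(element):
    """Return (tag, value) where value is a list, a dict, or a scalar."""
    if element.attrib.get("type") == "array":
        return element.tag, [dict(map(convert, child)) or value_of(child)
                             for child in element]
    return element.tag, dict(map(convert, element)) or value_of(element)


sample = """<order>
  <id type="integer">42</id>
  <paid type="boolean">true</paid>
  <items type="array">
    <item><name>pen</name><price type="float">1.5</price></item>
    <item><name>ink</name><price type="float">3.25</price></item>
  </items>
</order>"""

tag, contents = convert(ET.fromstring(sample))
result = {tag: contents}
print(result)
```

Here id comes back as an int, paid as a bool, and items as a list of two dicts, which is the part a plain tag-to-text mapping cannot express.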
Hope it helps :-)