convert xml to python dict


You can use the xmltodict module:

import xmltodict

message = """<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"""
print(xmltodict.parse(message)['note'])

which produces an OrderedDict:

OrderedDict([(u'to', u'Tove'), (u'from', u'Jani'), (u'heading', u'Reminder'), (u'body', u"Don't forget me this weekend!")])

which can be converted to dict if order doesn't matter:

print(dict(xmltodict.parse(message)['note']))

Prints:

{u'body': u"Don't forget me this weekend!", u'to': u'Tove', u'from': u'Jani', u'heading': u'Reminder'}
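One thing worth keeping in mind: dict() only converts the outermost level, so any nested elements stay OrderedDict. Below is a minimal sketch of a recursive conversion via a JSON round trip, written for Python 3. The helper name to_plain_dict and the nested sample data are made up for illustration (the sample stands in for a parsed result), and the approach assumes every value is JSON-serializable (strings, lists, dicts, None).

```python
import json
from collections import OrderedDict


def to_plain_dict(parsed):
    """Hypothetical helper: turn nested OrderedDicts into plain dicts."""
    # json.dumps accepts OrderedDict; json.loads always builds plain dicts
    return json.loads(json.dumps(parsed))


# Hand-built stand-in for a parsed document with nested elements
nested = OrderedDict([
    ("note", OrderedDict([
        ("to", "Tove"),
        ("tags", OrderedDict([("tag", ["a", "b"])])),
    ])),
])

shallow = dict(nested)          # only the top level becomes a dict
plain = to_plain_dict(nested)   # every level becomes a dict

print(type(shallow["note"]).__name__)  # OrderedDict
print(type(plain["note"]).__name__)    # dict
```

The round trip is short but copies the whole structure; for very large documents a hand-written recursive walk may be a better fit.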


You should check out

https://github.com/martinblech/xmltodict

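I think it is one of the best handlers for XML-to-dict conversion I have seen.

However, I should warn you that XML and dict are not fully compatible data structures: XML has attributes, repeated sibling elements, and text mixed with child elements, none of which map onto a dict in one obvious way.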
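To make the mismatch concrete, here is a rough sketch using only the standard library (Python 3). The function element_to_dict is a hypothetical helper, not part of any library; the "@" prefix for attributes and the "#text" key are conventions chosen here, mirroring the ones xmltodict uses by default.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Hypothetical helper: map an element to a dict, a string, or None."""
    result = {}
    # Attributes have no dict equivalent, so mark them with an "@" prefix
    for name, value in element.attrib.items():
        result["@" + name] = value
    for child in element:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated sibling tags have to be collected into a list
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    text = (element.text or "").strip()
    if not result:
        # Leaf element: just its text, or None when empty
        return text or None
    if text:
        result["#text"] = text
    return result


one = ET.fromstring('<note id="1"><to>Tove</to></note>')
two = ET.fromstring('<note id="1"><to>Tove</to><to>Jani</to></note>')

print(element_to_dict(one))  # {'@id': '1', 'to': 'Tove'}
print(element_to_dict(two))  # {'@id': '1', 'to': ['Tove', 'Jani']}
```

Note how the value under 'to' is a string in one case and a list in the other, depending only on how many times the element occurs. Code consuming the dict has to cope with both shapes.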


You'd think that by now we'd have a good answer to this one, but apparently we don't. After reviewing half a dozen similar questions on Stack Overflow, here is what worked for me:

from lxml import etree
# arrow is an awesome lib for dealing with dates in python
import arrow


# converts an etree to dict, useful to convert xml to dict
def etree2dict(tree):
    root, contents = recursive_dict(tree)
    return {root: contents}


def recursive_dict(element):
    # elements marked type="array" become a list with one entry per child
    if element.attrib.get('type') == 'array':
        return element.tag, [(dict(map(recursive_dict, child)) or getElementValue(child)) for child in element]
    else:
        # elements with children become a dict; leaf elements become a value
        return element.tag, dict(map(recursive_dict, element)) or getElementValue(element)


def getElementValue(element):
    if element.text:
        attr_type = element.attrib.get('type')
        text = element.text.strip()
        if attr_type == 'integer':
            return int(text)
        if attr_type == 'float':
            return float(text)
        if attr_type == 'boolean':
            return text.lower() == 'true'
        if attr_type == 'datetime':
            # arrow >= 1.0: timestamp is a method; on older versions it is a property (drop the parentheses)
            return arrow.get(text).timestamp()
        # no type attribute, or an unrecognized one: return the raw text
        return element.text
    elif element.attrib:
        if 'nil' in element.attrib:
            return None
        else:
            # no text, only attributes: return them as a plain dict
            return dict(element.attrib)
    else:
        return None

and this is how you use it:

from lxml import etree

message = """<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"""
tree = etree.fromstring(message)
print(etree2dict(tree))

Hope it helps :-)