How to convert XML to JSON in Python? [duplicate] How to convert XML to JSON in Python? [duplicate] xml xml

How to convert XML to JSON in Python? [duplicate]


xmltodict (full disclosure: I wrote it) can help you convert your XML to a dict+list+string structure, following this "standard". It is Expat-based, so it's very fast and doesn't need to load the whole XML tree in memory.

Once you have that data structure, you can serialize it to JSON:

import xmltodict, jsono = xmltodict.parse('<e> <a>text</a> <a>text</a> </e>')json.dumps(o) # '{"e": {"a": ["text", "text"]}}'


Soviut's advice for lxml objectify is good. With a specially subclassed simplejson, you can turn an lxml objectify result into json.

import simplejson as jsonimport lxmlclass objectJSONEncoder(json.JSONEncoder):  """A specialized JSON encoder that can handle simple lxml objectify types      >>> from lxml import objectify      >>> obj = objectify.fromstring("<Book><price>1.50</price><author>W. Shakespeare</author></Book>")             >>> objectJSONEncoder().encode(obj)      '{"price": 1.5, "author": "W. Shakespeare"}'        """    def default(self,o):        if isinstance(o, lxml.objectify.IntElement):            return int(o)        if isinstance(o, lxml.objectify.NumberElement) or isinstance(o, lxml.objectify.FloatElement):            return float(o)        if isinstance(o, lxml.objectify.ObjectifiedDataElement):            return str(o)        if hasattr(o, '__dict__'):            #For objects with a __dict__, return the encoding of the __dict__            return o.__dict__        return json.JSONEncoder.default(self, o)

See the docstring for example of usage, essentially you pass the result of lxml objectify to the encode method of an instance of objectJSONEncoder

Note that Koen's point is very valid here, the solution above only works for simply nested xml and doesn't include the name of root elements. This could be fixed.

I've included this class in a gist here: http://gist.github.com/345559


I think the XML format can be so diverse that it's impossible to write a code that could do this without a very strict defined XML format. Here is what I mean:

<persons>    <person>        <name>Koen Bok</name>        <age>26</age>    </person>    <person>        <name>Plutor Heidepeen</name>        <age>33</age>    </person></persons>

Would become

{'persons': [    {'name': 'Koen Bok', 'age': 26},    {'name': 'Plutor Heidepeen', 'age': 33}]}

But what would this be:

<persons>    <person name="Koen Bok">        <locations name="defaults">            <location long=123 lat=384 />        </locations>    </person></persons>

See what I mean?

Edit: just found this article: http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html