xml2js: how is the output?


xml2js has an unenviable task: convert XML to JSON in a way that can be reversed, without knowing the schema in advance. It seems obvious at first:

<name>Fred</name> → { name: "Fred" }
<chacha /> → { chacha: null }

Easy so far, right? How about this, though?

<x><y>z</y></x>

Removing the human friendly names drives home the uncertainty facing xml2js. At first, you might think this is quite reasonable:

{ x: { y: "z" } }

Later, you trip over this XML text and realise your guessed-at schema was wrong:

<x><y>z</y><y>z2</y></x>

Uh oh. Maybe we should have used an array. At least all the members have the same tag:

{ x: [ "z", "z2" ] }

Inevitably, though, that turns out to be short-sighted:

<x><y>z</y><y>z2</y><m>n</m>happy</x>

Uh...

{ x: [ { y: "z" }, { y : "z2" }, { m: "n" }, "happy" ] }

... and then someone polishes you off with some attributes and XML namespaces.

The way to construct a more concise output schema feels obvious to you. You can infer details from the tag and attribute names. You understand it.

The library does not share that understanding.

If the library doesn't know the schema, it must either "use and abuse" arrays, extra layers of objects, special attribute names, or all three.
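To make that trade-off concrete, here is a minimal sketch of one fixed output shape that never changes with the input: every element becomes an object with a name, an attribute map, and an ordered array of children, and text is kept as plain strings. The field names `name`, `attrs`, and `children` and the helpers `childrenNamed` and `textOf` are hypothetical, chosen for illustration; this is not the shape xml2js produces.

```javascript
// Hand-written representation of <x><y>z</y><y>z2</y><m>n</m>happy</x>
// in a hypothetical fixed schema: { name, attrs, children }.
const doc = {
    name: 'x',
    attrs: {},
    children: [
        { name: 'y', attrs: {}, children: ['z'] },
        { name: 'y', attrs: {}, children: ['z2'] },
        { name: 'm', attrs: {}, children: ['n'] },
        'happy'
    ]
};

// Return every child element with the given tag name, in document order.
function childrenNamed(node, name) {
    return node.children.filter(function (child) {
        return typeof child === 'object' && child.name === name;
    });
}

// Concatenate the text that sits directly inside an element.
function textOf(node) {
    return node.children.filter(function (child) {
        return typeof child === 'string';
    }).join('');
}

const ys = childrenNamed(doc, 'y').map(textOf);
console.log(ys);          // [ 'z', 'z2' ]
console.log(textOf(doc)); // happy
```

The access code is wordier than `x.y`, but it stays the same whether there is one `y`, ten, or a mix of tags and text.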

The only alternative is to employ a variable output schema. That keeps it simple at first, as we saw above, but you'll quickly find yourself writing a great deal of conditional code. Consider what happens if children with the same tag name are collapsed into a list, but only if there are more than one:

if (Array.isArray(x.y)) {
    processTheYChildren(x.y);
} else if (typeof(x.y) === 'object') {
    // only one child; construct an array on the fly because my converter didn't
    processTheYChildren([x.y]);
} else ...
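One common way to contain that branching is to normalize at the point of access. This is a minimal sketch, assuming a converter that yields `undefined` for no child, a bare value for one child, and an array for several. `asArray` is a hypothetical helper name, not part of xml2js.

```javascript
// Wrap whatever the converter produced so callers can always iterate.
function asArray(value) {
    if (value === undefined || value === null) {
        return [];
    }
    return Array.isArray(value) ? value : [value];
}

// Hand-written stand-ins for the two shapes a variable schema can yield.
const oneChild = { y: 'z' };
const twoChildren = { y: ['z', 'z2'] };

console.log(asArray(oneChild.y));    // [ 'z' ]
console.log(asArray(twoChildren.y)); // [ 'z', 'z2' ]
console.log(asArray(oneChild.m));    // []
```

The conditional still exists, but it lives in one place instead of at every property access.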

TL;DR: it's harder than it looks. Read the Open311 JSON and XML Conversion page for details of other JSON-side representations. All of them "use and abuse" arrays, extra layers of objects, members with names that didn't appear in the original XML, or all three.


As xml2js' documentation states, you can configure the parser not to abuse arrays by setting the explicitArray option to false. (Important: it has to be the boolean value false; the string "false" will not work!)

Example:

var parser = new xml2js.Parser({explicitArray : false});

This way, a child element that appears only once comes back as a plain value instead of a one-element array, so your JSON properties are much easier to access. I hope this helps.
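The warning about the string "false" comes down to JavaScript truthiness: any non-empty string is truthy, so an option that is tested with a plain `if` would still read as enabled. Below is a minimal sketch, assuming the setting arrives as text (for example from an environment variable or a config file). `toBoolean` is a hypothetical helper, and that the parser tests the option for truthiness is an assumption here rather than something taken from its documentation.

```javascript
// Any non-empty string is truthy, including "false".
console.log(Boolean('false')); // true

// Hypothetical helper: turn textual settings into real booleans.
function toBoolean(value) {
    if (typeof value === 'boolean') {
        return value;
    }
    return String(value).trim().toLowerCase() === 'true';
}

// Options object of the kind passed to new xml2js.Parser(...).
const options = { explicitArray: toBoolean('false') };
console.log(options.explicitArray); // false
```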


The JSON that comes back isn't too JavaScript friendly. I've written a helper function that can make it easier to work with.

Be sure to read it before using it so that you understand what it does.

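// Assumes: var xml = require('xml2js'); and xmlString holds the XML text.
// cleanXML is defined before it is used, so it exists even if the
// parseString callback runs synchronously.
var cleanXML = function(xml){
    var keys = Object.keys(xml),
        o = 0, k = keys.length,
        node, value, singulars,
        l = -1, i = -1, s = -1, e = -1,
        isInt = /^-?\s*\d+$/,
        isDig = /^(-?\s*\d*\.?\d*)$/,
        radix = 10;
    for(; o < k; ++o){
        node = keys[o];
        // Unwrap one-element arrays.
        if(xml[node] instanceof Array && xml[node].length === 1){
            xml[node] = xml[node][0];
        }
        // Collapse a plural wrapper around its singular child,
        // e.g. queries.query -> queries, or types.type -> types.
        if(xml[node] instanceof Object){
            value = Object.keys(xml[node]);
            if(value.length === 1){
                l = node.length;
                singulars = [
                    node.substring(0, l - 1),
                    node.substring(0, l - 3) + 'y'
                ];
                i = singulars.indexOf(value[0]);
                if(i !== -1){
                    xml[node] = xml[node][singulars[i]];
                }
            }
        }
        // Recurse into nested objects and arrays (typeof null is 'object' too).
        if(xml[node] !== null && typeof(xml[node]) === 'object'){
            xml[node] = cleanXML(xml[node]);
        }
        // Convert numeric strings, but only when no precision is lost.
        if(typeof(xml[node]) === 'string'){
            value = xml[node].trim();
            // isDig alone also matches '', '-' and '.', so require a digit.
            if(/\d/.test(value) && value.match(isDig)){
                if(value.match(isInt)){
                    if(Math.abs(parseInt(value, radix)) <= Number.MAX_SAFE_INTEGER){
                        xml[node] = parseInt(value, radix);
                    }
                }else{
                    l = value.length;
                    if(l <= 15){
                        xml[node] = parseFloat(value);
                    }else{
                        // Measure the span between the first and last non-zero digit.
                        for(i = 0, s = -1, e = -1; i < l && e - s <= 15; ++i){
                            if(value.charAt(i) > 0){
                                if(s === -1){
                                    s = i;
                                }else{
                                    e = i;
                                }
                            }
                        }
                        if(e - s <= 15){
                            xml[node] = parseFloat(value);
                        }
                    }
                }
            }
        }
    }
    return xml;
};

xml.parseString(xmlString, function(err, results){
    if(err) throw err;
    results = cleanXML(results);
});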

Examples:

{
  queries: { query: [ {}, {}, {} ] }
}

becomes

{
  queries: [ {}, {}, {} ]
}

and

{
  types: { type: [ {}, {}, {} ] }
}

becomes

{
  types: [ {}, {}, {} ]
}

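It will also convert numeric strings to numbers, but only when that is safe: integers no larger than Number.MAX_SAFE_INTEGER, and decimals of roughly 15 significant digits or fewer. Anything else is left as a string.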
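To see why the integer guard matters, here is a minimal standalone sketch of the same idea. `toSafeInteger` is a hypothetical name, and it uses `Number.isSafeInteger` rather than the exact comparison in the helper above.

```javascript
// Convert an integer-looking string to a number only if no precision is lost;
// otherwise hand back the original string untouched.
function toSafeInteger(text) {
    const value = text.trim();
    if (!/^-?\d+$/.test(value)) {
        return text;
    }
    const n = parseInt(value, 10);
    return Number.isSafeInteger(n) ? n : text;
}

console.log(toSafeInteger('42'));                     // 42
console.log(parseInt('9007199254740993', 10));        // 9007199254740992 (off by one)
console.log(toSafeInteger('9007199254740993'));       // 9007199254740993 (still a string)
```

Long identifiers such as order numbers or account ids are the usual victims of a blind `parseInt`, which is why they are better left as strings.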

Edit: Replaced for... in with for