Best way to compare 2 XML documents in Java


Sounds like a job for XMLUnit

Example:

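    import org.custommonkey.xmlunit.XMLTestCase;
    import org.custommonkey.xmlunit.XMLUnit;

    public class SomeTest extends XMLTestCase {

        // XMLTestCase is a JUnit 3 TestCase, so the method is found by its "test" prefix
        public void test() throws Exception {
            String xml1 = ...
            String xml2 = ...

            XMLUnit.setIgnoreWhitespace(true); // ignore whitespace differences

            // can also compare XML Documents, InputSources, Readers, Diffs
            assertXMLEqual(xml1, xml2); // assertXMLEqual comes from XMLTestCase
        }
    }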


The following will check if the documents are equal using standard JDK libraries.

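    import java.io.File;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.junit.Assert;
    import org.w3c.dom.Document;

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setNamespaceAware(true);
    dbf.setCoalescing(true);                       // merge CDATA sections into text nodes
    dbf.setIgnoringElementContentWhitespace(true); // only effective when validating
    dbf.setIgnoringComments(true);
    DocumentBuilder db = dbf.newDocumentBuilder();

    Document doc1 = db.parse(new File("file1.xml"));
    doc1.normalizeDocument();

    Document doc2 = db.parse(new File("file2.xml"));
    doc2.normalizeDocument();

    Assert.assertTrue(doc1.isEqualNode(doc2));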

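normalizeDocument() is there to put both trees into a normalized form, for example by merging adjacent text nodes, so that equivalent documents end up with the same node structure (for freshly parsed documents it technically should not change anything).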

The code above still requires the whitespace inside elements to match, because the parser preserves it and isEqualNode() compares it. (setIgnoringElementContentWhitespace(true) only takes effect when the parser validates against a DTD.) The standard XML parser that comes with Java does not let you set a feature to produce a canonical version or to honour xml:space. If that is going to be a problem, you may need a replacement XML parser such as Xerces, or a library such as JDOM.
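If the only whitespace that gets in the way is indentation between elements, one JDK-only workaround is to strip whitespace-only text nodes from both trees before calling isEqualNode(). The following is a minimal sketch under that assumption; the class name XmlCompare and its helper methods are made up for illustration and are not part of any library. Whitespace inside real text content is still compared.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the JDK or of any library.
public class XmlCompare {

    /** Parses an XML string into a namespace-aware DOM document. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setCoalescing(true);       // turn CDATA sections into plain text nodes
        dbf.setIgnoringComments(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    /** Recursively removes text nodes that contain nothing but whitespace. */
    public static void stripWhitespaceOnlyText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespaceOnlyText(child);
            }
            child = next;
        }
    }

    /** Compares two XML strings, ignoring whitespace-only text nodes. */
    public static boolean equalIgnoringWhitespace(String xml1, String xml2)
            throws Exception {
        Document doc1 = parse(xml1);
        Document doc2 = parse(xml2);
        stripWhitespaceOnlyText(doc1);
        stripWhitespaceOnlyText(doc2);
        doc1.normalizeDocument();
        doc2.normalizeDocument();
        return doc1.isEqualNode(doc2);
    }

    public static void main(String[] args) throws Exception {
        String compact = "<a><b>text</b></a>";
        String indented = "<a>\n  <b>text</b>\n</a>";
        System.out.println(equalIgnoringWhitespace(compact, indented));
    }
}
```

With this approach the compact and the indented document in main should compare as equal, while "<b>text</b>" and "<b> text </b>" still differ. Note that it also drops whitespace-only text in mixed content, so it is not suitable where such whitespace is meaningful.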


XOM has a Canonicalizer utility which serializes your documents in canonical form, which you can then stringify and compare. So regardless of attribute ordering or whitespace irregularities inside tags, you can get regular, predictable comparisons of your documents (whitespace inside text content is still preserved).

This works especially well in IDEs that have dedicated visual String comparators, like Eclipse. You get a visual representation of the semantic differences between the documents.