Java: Most efficient method to iterate over all elements in a org.w3c.dom.Document? Java: Most efficient method to iterate over all elements in a org.w3c.dom.Document? xml xml

Java: Most efficient method to iterate over all elements in a org.w3c.dom.Document?


Basically you have two ways to iterate over all elements:

1. Using recursion (the most common way I think):

public static void main(String[] args) throws SAXException, IOException,        ParserConfigurationException, TransformerException {    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory        .newInstance();    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();    Document document = docBuilder.parse(new File("document.xml"));    doSomething(document.getDocumentElement());}public static void doSomething(Node node) {    // do something with the current node instead of System.out    System.out.println(node.getNodeName());    NodeList nodeList = node.getChildNodes();    for (int i = 0; i < nodeList.getLength(); i++) {        Node currentNode = nodeList.item(i);        if (currentNode.getNodeType() == Node.ELEMENT_NODE) {            //calls this method for all the children which is Element            doSomething(currentNode);        }    }}

2. Avoiding recursion using getElementsByTagName() method with * as parameter:

public static void main(String[] args) throws SAXException, IOException,        ParserConfigurationException, TransformerException {    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory            .newInstance();    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();    Document document = docBuilder.parse(new File("document.xml"));    NodeList nodeList = document.getElementsByTagName("*");    for (int i = 0; i < nodeList.getLength(); i++) {        Node node = nodeList.item(i);        if (node.getNodeType() == Node.ELEMENT_NODE) {            // do something with the current element            System.out.println(node.getNodeName());        }    }}

I think these ways are both efficient.
Hope this helps.


for (int i = 0; i < nodeList.getLength(); i++)

change to

for (int i = 0, len = nodeList.getLength(); i < len; i++)

to be more efficient.

The second way of javanna answer may be the best as it tends to use a flatter, predictable memory model.


I also stumbled over this problem recently. Here is my solution.I wanted to avoid recursion, so I used a while loop.

Because of the adds and removes in arbitrary places on the list,I went with the LinkedList implementation.

/* traverses tree starting with given node */  private static List<Node> traverse(Node n)  {    return traverse(Arrays.asList(n));  }  /* traverses tree starting with given nodes */  private static List<Node> traverse(List<Node> nodes)  {    List<Node> open = new LinkedList<Node>(nodes);    List<Node> visited = new LinkedList<Node>();    ListIterator<Node> it = open.listIterator();    while (it.hasNext() || it.hasPrevious())    {      Node unvisited;      if (it.hasNext())        unvisited = it.next();      else        unvisited = it.previous();      it.remove();      List<Node> children = getChildren(unvisited);      for (Node child : children)        it.add(child);      visited.add(unvisited);    }    return visited;  }  private static List<Node> getChildren(Node n)  {    List<Node> children = asList(n.getChildNodes());    Iterator<Node> it = children.iterator();    while (it.hasNext())      if (it.next().getNodeType() != Node.ELEMENT_NODE)        it.remove();    return children;  }  private static List<Node> asList(NodeList nodes)  {    List<Node> list = new ArrayList<Node>(nodes.getLength());    for (int i = 0, l = nodes.getLength(); i < l; i++)      list.add(nodes.item(i));    return list;  }