How to strip whitespace-only text nodes from a DOM before serialization?


You can find empty or whitespace-only text nodes using XPath, then remove them programmatically like so:

    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    // doc is the org.w3c.dom.Document you have already parsed.
    XPathFactory xpathFactory = XPathFactory.newInstance();
    // XPath to find empty text nodes.
    XPathExpression xpathExp = xpathFactory.newXPath().compile(
            "//text()[normalize-space(.) = '']");
    NodeList emptyTextNodes = (NodeList)
            xpathExp.evaluate(doc, XPathConstants.NODESET);

    // Remove each empty text node from document.
    for (int i = 0; i < emptyTextNodes.getLength(); i++) {
        Node emptyTextNode = emptyTextNodes.item(i);
        emptyTextNode.getParentNode().removeChild(emptyTextNode);
    }

This approach might be useful if you want more control over node removal than is easily achieved with an XSL template.
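For completeness, here is a minimal end-to-end sketch of the same XPath idea: parse, strip, then serialize with an identity `Transformer`. The class name `XPathWhitespaceStripper` and the method `stripAndSerialize` are made up for illustration, and wrapping the checked exceptions in an `IllegalStateException` is just one way to keep the example short.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; the class and method names are illustrative only.
final class XPathWhitespaceStripper {

    static String stripAndSerialize(String xml) {
        try {
            // Parse the XML text into a DOM.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Select every text node that is empty after whitespace normalization.
            NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//text()[normalize-space(.) = '']",
                            doc, XPathConstants.NODESET);

            // Detach each selected node from its parent.
            for (int i = 0; i < blanks.getLength(); i++) {
                Node blank = blanks.item(i);
                blank.getParentNode().removeChild(blank);
            }

            // Serialize with an identity transform, without the XML declaration.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException
                | XPathExpressionException | TransformerException e) {
            throw new IllegalStateException("Could not process XML", e);
        }
    }
}
```

Text nodes that contain anything other than whitespace are left exactly as they were, including their leading and trailing spaces.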


Try serializing your DOM with the following XSL, which uses the strip-space element:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" omit-xml-declaration="yes"/>
      <xsl:strip-space elements="*"/>
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

http://helpdesk.objects.com.au/java/how-do-i-remove-whitespace-from-an-xml-document
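In case it helps, this is roughly how the stylesheet above could be applied to an existing DOM from Java. It is a sketch that assumes the XSLT processor bundled with the JDK; the class name `XsltWhitespaceStripper`, its methods, and the choice to embed the stylesheet as a string constant are my own and not part of the linked article.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; the class and method names are illustrative only.
final class XsltWhitespaceStripper {

    // The identity stylesheet with strip-space, embedded as a string.
    private static final String STRIP_SPACE_XSL =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
            + "<xsl:strip-space elements=\"*\"/>"
            + "<xsl:template match=\"@*|node()\">"
            + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Serializes an existing DOM through the stylesheet.
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STRIP_SPACE_XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not transform document", e);
        }
    }

    // Convenience wrapper for trying it out on XML text.
    static String parseAndSerialize(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return serialize(doc);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Note that the DOM itself is not modified here; the whitespace-only text nodes are only dropped from the serialized output.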


The code below deletes comment nodes and text nodes that contain only whitespace. If a text node has other content, its value is trimmed.

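    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public static void clean(Node node) {
        NodeList childNodes = node.getChildNodes();

        // Iterate backwards so removals do not shift the indexes still to be visited.
        for (int n = childNodes.getLength() - 1; n >= 0; n--) {
            Node child = childNodes.item(n);
            short nodeType = child.getNodeType();

            if (nodeType == Node.ELEMENT_NODE) {
                // Recurse into child elements.
                clean(child);
            } else if (nodeType == Node.TEXT_NODE) {
                String trimmedNodeVal = child.getNodeValue().trim();
                if (trimmedNodeVal.length() == 0) {
                    // Whitespace-only text node: remove it.
                    node.removeChild(child);
                } else {
                    // Keep the text, but without leading/trailing whitespace.
                    child.setNodeValue(trimmedNodeVal);
                }
            } else if (nodeType == Node.COMMENT_NODE) {
                node.removeChild(child);
            }
        }
    }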

Ref: http://www.sitepoint.com/removing-useless-nodes-from-the-dom/
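Since `clean` also trims non-empty text and drops comments, it can change mixed content (for example, the space between a word and a following inline element). If you only want to remove whitespace-only text nodes, a narrower variant might look like the sketch below. This is my own variation, not code from the referenced article; `BlankTextRemover`, `removeBlankText` and `demo` are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; the class and method names are illustrative only.
final class BlankTextRemover {

    // Removes whitespace-only text nodes; other text and comments stay untouched.
    static void removeBlankText(Node node) {
        NodeList children = node.getChildNodes();
        // Walk backwards because the NodeList is live and shrinks on removal.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                removeBlankText(child);
            } else if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            }
        }
    }

    // Parses the XML text, strips blank text nodes, and serializes the result.
    static String demo(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            removeBlankText(doc);

            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException
                | TransformerException e) {
            throw new IllegalStateException("Could not process XML", e);
        }
    }
}
```

The trade-off is that significant leading and trailing spaces inside text survive, at the cost of a slightly less compact output than `clean` produces.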