Strip comments from an XML file and pretty-print it


You can use tidy:

$ tidy -quiet -asxml -xml -indent -wrap 1024 --hide-comments 1 tomcat-users.xml
<?xml version='1.0' encoding='utf-8'?>
<tomcat-users>
  <user username="qwerty" password="ytrewq" roles="manager-gui" />
</tomcat-users>


Run your XML through an identity transform XSLT, with an empty template for comments.

All of the XML content, except for the comments, will be passed through to the output.

In order to nicely format the output, set indent="yes" on the xsl:output element:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Match on attributes, elements, text nodes, and processing instructions -->
  <xsl:template match="@* | * | text() | processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template prevents comments from being copied into the output -->
  <xsl:template match="comment()"/>
</xsl:stylesheet>
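
If no XSLT processor is at hand, roughly the same effect can be sketched with Python's standard library. This is not XSLT, just a stand-in for the same idea: everything except comments passes through, then the result is indented. It assumes Python 3.9+ (for ET.indent); the function name and the sample document are made up for illustration. One difference from the stylesheet above: the default ElementTree parser also drops processing instructions, not only comments.

```python
import xml.etree.ElementTree as ET


def strip_comments_and_indent(xml_text: str) -> str:
    """Return xml_text without comments, re-indented with two spaces."""
    # The default ElementTree parser discards comments (and processing
    # instructions) while building the tree, so nothing has to be removed by hand.
    root = ET.fromstring(xml_text)
    # ET.indent (Python 3.9+) rewrites whitespace-only text/tail in place.
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample input, modelled on the tomcat-users.xml example.
sample = """<tomcat-users>
<!-- example comment -->
<user username="qwerty" password="ytrewq" roles="manager-gui"/>
</tomcat-users>"""

result = strip_comments_and_indent(sample)
print(result)
```

Note that ET.tostring with encoding="unicode" does not emit an XML declaration, so add one yourself if you need it.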


You might want to look at the xmllint tool. It has several options (one of which, --format, pretty-prints the document), but I can't figure out how to remove the comments with it.

Also, check out XMLStarlet, a set of command-line tools for doing almost anything you would want with XML. Then run:

xml c14n --without-comments # XML file canonicalization w/o comments

EDIT: OP eventually used this line:

xmlstarlet c14n --without-comments old.xml > new.xml
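
For comparison, here is a minimal sketch of comment-free canonicalization using Python's standard library (3.8+). It is only an approximation of the xmlstarlet command: ET.canonicalize implements C14N 2.0, so the output may differ in detail from what xmlstarlet produces. The function name and the sample string are hypothetical. Keep in mind that canonical XML normalizes the document (sorted attributes, expanded empty elements) but does not add indentation.

```python
import xml.etree.ElementTree as ET


def canonicalize_without_comments(xml_text: str) -> str:
    """Return the canonical form of xml_text with comments removed."""
    # with_comments=False is already the default; it is spelled out here
    # to mirror the --without-comments flag.
    return ET.canonicalize(xml_text, with_comments=False)


# Hypothetical sample input.
sample = (
    '<tomcat-users><!-- note -->'
    '<user username="qwerty" roles="manager-gui"/>'
    '</tomcat-users>'
)

canonical = canonicalize_without_comments(sample)
print(canonical)
```

To work on files instead of strings, ET.canonicalize also accepts from_file= for the input and out= for an open text file to write to.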