Grep and Sed Equivalent for XML Command Line Processing
I've found xmlstarlet to be pretty good at this sort of thing.
http://xmlstar.sourceforge.net/
It should be available in most distro repositories, too. An introductory tutorial is here:
Some promising tools:
nokogiri: parsing HTML/XML DOMs in ruby using XPath & CSS selectors
hpricot: deprecated
fxgrep: Uses its own XPath-like syntax to query documents. Written in SML, so installation may be difficult.
LT XML: XML toolkit derived from SGML tools, including sggrep, sgsort, xmlnorm and others. Uses its own query syntax. The documentation is very formal. Written in C. LT XML 2 claims support of XPath, XInclude and other W3C standards.
xmlgrep2: Simple and powerful searching with XPath. Written in Perl using XML::LibXML and libxml2.
XQSharp: Supports XQuery, the extension to XPath. Written for the .NET Framework.
xml-coreutils: Laird Breyer's toolkit equivalent to GNU coreutils. Discussed in an interesting essay on what the ideal toolkit should include.
xmldiff: Simple tool for comparing two XML files.
xmltk: Doesn't seem to have a package in Debian, Ubuntu, Fedora, or MacPorts, hasn't had a release since 2007, and uses non-portable build automation.
xml-coreutils seems the best documented and most UNIX-oriented.
To Joseph Holsten's excellent list, I would add the xpath command-line script that comes with the Perl library XML::XPath. It is a great way to extract information from XML files:
xpath -q -e '/entry[@xml:lang="fr"]' *xml