Nokogiri/Xpath namespace query


Every namespace prefix used in an XPath query needs to be registered. Nokogiri automatically registers the namespaces declared on the root node. Any namespace declared elsewhere in the document you have to register yourself, by passing a prefix-to-URI hash along with the query. This should work:

puts doc.xpath('//dc:title', 'dc' => "URI")

Alternatively, you can remove namespaces altogether. Only do this if you are certain there will be no conflicting node names.

doc.remove_namespaces!
puts doc.xpath('//title')


With the prefix opf properly registered for the 'http://www.idpf.org/2007/opf' namespace URI, and dc for 'URI', you need:

/*/opf:metadata/dc:title

Note: xmlns and xml are reserved prefixes that can't be bound to any namespace URI other than the built-in 'http://www.w3.org/2000/xmlns/' and 'http://www.w3.org/XML/1998/namespace', respectively.


As an alternative to explicitly constructing a hash of namespace URIs, you can retrieve the namespace definitions from the xml element where they're defined.

Using your example:

# First grab the metadata node, because that's where "dc" is defined.
metadata = doc.at_xpath('//xmlns:metadata')
# Pass metadata's namespaces as the resolver.
metadata.at_xpath('dc:title', metadata.namespaces)

Note that the second xpath could've also been:

doc.at_xpath('//dc:title', metadata.namespaces).to_s

But why search from the root when you have a nearer ancestor? Also, you should consider the namespace-defining element plus its children as the "scope" of the namespace. Searching a limited scope is less confusing and avoids subtle bugs.