Python ElementTree module: How to ignore the namespace of XML files to locate matching element when using the method "find", "findall"
Instead of modifying the XML document itself, it's best to parse it and then modify the tags in the result. This way you can handle multiple namespaces and namespace aliases:
    from io import StringIO  # for Python 2, import from StringIO instead
    import xml.etree.ElementTree as ET

    # instead of ET.fromstring(xml)
    it = ET.iterparse(StringIO(xml))
    for _, el in it:
        prefix, has_namespace, postfix = el.tag.partition('}')
        if has_namespace:
            el.tag = postfix  # strip all namespaces
    root = it.root
This is based on the discussion here: http://bugs.python.org/issue18304
Update: rpartition instead of partition makes sure you get the tag name in postfix even if there is no namespace. Thus you could condense it:
    for _, el in it:
        _, _, el.tag = el.tag.rpartition('}')  # strip ns
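For reference, here is a minimal end-to-end sketch of the condensed loop. The sample document, its namespace URIs, and the element names are made up for illustration; only the iterparse/rpartition technique comes from the answer above.

```python
from io import StringIO
import xml.etree.ElementTree as ET

# Hypothetical sample document with a default and a prefixed namespace.
xml = (
    '<root xmlns="http://example.com/default" xmlns:x="http://example.com/x">'
    '<parent><x:version>1.0</x:version></parent>'
    '<parent><x:version>2.0</x:version></parent>'
    '</root>'
)

it = ET.iterparse(StringIO(xml))
for _, el in it:
    # Keep only the local name; tags without a namespace are left unchanged.
    _, _, el.tag = el.tag.rpartition('}')
root = it.root

# Plain, namespace-free paths now work with find and findall.
versions = [v.text for v in root.findall('./parent/version')]
first = root.find('./parent/version').text
```

Note that the loop only rewrites element tags. Namespaced attribute names, if the document has any, keep their `{uri}` prefix.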
If you remove the xmlns attribute from the XML before parsing it, there won't be a namespace prepended to each tag in the tree.
    import re
    xmlstring = re.sub(' xmlns="[^"]+"', '', xmlstring, count=1)
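A small sketch of this approach on a made-up document (the namespace URI and element names are assumptions for illustration):

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document that uses only a default namespace.
xmlstring = (
    '<root xmlns="http://example.com/default">'
    '<parent><version>1.0</version></parent>'
    '</root>'
)

# Drop the first default-namespace declaration before parsing.
cleaned = re.sub(' xmlns="[^"]+"', '', xmlstring, count=1)
root = ET.fromstring(cleaned)

# Tags are now plain names, so a namespace-free path matches.
version = root.find('./parent/version').text
```

The pattern only matches a double-quoted default declaration (`xmlns="..."`). Prefixed declarations such as `xmlns:x="..."` and single-quoted values are left in place, so elements in those namespaces still carry their `{uri}` prefix after parsing.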
The answers so far explicitly put the namespace value in the script. For a more generic solution, I would rather extract the namespace from the XML:
    import re

    def get_namespace(element):
        # Raw string so the backslashes reach the regex engine unchanged.
        m = re.match(r'\{.*\}', element.tag)
        return m.group(0) if m else ''
And use it in the find method:
    namespace = get_namespace(tree.getroot())
    # print() with a single argument works on both Python 2 and Python 3
    print(tree.find('./{0}parent/{0}version'.format(namespace)).text)
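Put together, a self-contained sketch might look like this. The sample document and its namespace URI are invented for illustration, and the approach assumes every element in the path shares the root element's namespace.

```python
import re
import xml.etree.ElementTree as ET

def get_namespace(element):
    # Return the '{uri}' prefix of the element's tag, or '' if there is none.
    m = re.match(r'\{.*\}', element.tag)
    return m.group(0) if m else ''

# Hypothetical sample document with a default namespace.
xml = (
    '<root xmlns="http://example.com/default">'
    '<parent><version>1.0</version></parent>'
    '</root>'
)
root = ET.fromstring(xml)

namespace = get_namespace(root)
version = root.find('./{0}parent/{0}version'.format(namespace)).text

# A document without a namespace yields an empty prefix.
no_namespace = get_namespace(ET.fromstring('<root/>'))
```

Because the prefix is read from the root element only, elements that live in a different namespace than the root would need their own prefix in the path.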