How to update/modify an XML file in python? How to update/modify an XML file in python? xml xml

How to update/modify an XML file in python?


Using ElementTree:

import xml.etree.ElementTree# Open original fileet = xml.etree.ElementTree.parse('file.xml')# Append new tag: <a x='1' y='abc'>body text</a>new_tag = xml.etree.ElementTree.SubElement(et.getroot(), 'a')new_tag.text = 'body text'new_tag.attrib['x'] = '1' # must be str; cannot be an intnew_tag.attrib['y'] = 'abc'# Write back to file#et.write('file.xml')et.write('file_new.xml')

note: output written to file_new.xml for you to experiment, writing back to file.xml will replace the old content.

IMPORTANT: the ElementTree library stores attributes in a dict, as such, the order in which these attributes are listed in the xml text will NOT be preserved. Instead, they will be output in alphabetical order.(also, comments are removed. I'm finding this rather annoying)

ie: the xml input text <b y='xxx' x='2'>some body</b> will be output as <b x='2' y='xxx'>some body</b>(after alphabetising the order parameters are defined)

This means when committing the original, and changed files to a revision control system (such as SVN, CSV, ClearCase, etc), a diff between the 2 files may not look pretty.


Useful Python XML parsers:

  1. Minidom - functional but limited
  2. ElementTree - decent performance, more functionality
  3. lxml - high-performance in most cases, high functionality including real xpath support

Any of those is better than trying to update the XML file as strings of text.

What that means to you:

Open your file with an XML parser of your choice, find the node you're interested in, replace the value, serialize the file back out.


The quick and easy way, which you definitely should not do (see below), is to read the whole file into a list of strings using readlines(). I write this in case the quick and easy solution is what you're looking for.

Just open the file using open(), then call the readlines() method. What you'll get is a list of all the strings in the file. Now, you can easily add strings before the last element (just add to the list one element before the last). Finally, you can write these back to the file using writelines().

An example might help:

my_file = open(filename, "r")lines_of_file = my_file.readlines()lines_of_file.insert(-1, "This line is added one before the last line")my_file.writelines(lines_of_file)

The reason you shouldn't be doing this is because, unless you are doing something very quick n' dirty, you should be using an XML parser. This is a library that allows you to work with XML intelligently, using concepts like DOM, trees, and nodes. This is not only the proper way to work with XML, it is also the standard way, making your code both more portable, and easier for other programmers to understand.

Tim's answer mentioned checking out xml.dom.minidom for this purpose, which I think would be a great idea.