Merging XML files using Python's ElementTree
Although this is mostly a duplicate and the answer can be found here, I have already done this, so I can share this Python code:
    import os, os.path, sys
    import glob
    from xml.etree import ElementTree

    def run(files):
        xml_files = glob.glob(files + "/*.xml")
        xml_element_tree = None
        for xml_file in xml_files:
            data = ElementTree.parse(xml_file).getroot()
            # print ElementTree.tostring(data)
            for result in data.iter('results'):
                if xml_element_tree is None:
                    xml_element_tree = data
                    insertion_point = xml_element_tree.findall("./results")[0]
                else:
                    insertion_point.extend(result)
        if xml_element_tree is not None:
            print ElementTree.tostring(xml_element_tree)
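To make the key step easier to see, here is a minimal in-memory sketch of what `insertion_point.extend(result)` does. The two documents are hypothetical stand-ins shaped like the corrected sample files, not the asker's real data.

```python
from xml.etree import ElementTree

# Two hypothetical documents, assumed to share the same <sample>/<results> layout.
first = ElementTree.fromstring(
    '<sample id="1"><workflow value="x" version="1" />'
    '<results><result type="Q" /></results></sample>'
)
second = ElementTree.fromstring(
    '<sample id="1"><workflow value="x" version="1" />'
    '<results><result type="T" /></results></sample>'
)

# The <results> element of the first document receives everything else.
insertion_point = first.find("./results")
for results in second.iter("results"):
    # extend() appends each child of `results`, not the <results> element itself,
    # so the container still ends up with a single <results> node.
    insertion_point.extend(results)

merged_types = [r.get("type") for r in insertion_point]
print(merged_types)
```

Because `extend` copies in the children only, the wrapper elements of the later documents (`sample`, `workflow`) are simply dropped.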
However, this question contains another problem not present in the other post: the sample XML files are not valid XML. An element name cannot be given a value directly, so a tag like this will not parse:
<sample="1"> ...</sample>
Change it to use a named attribute, something like:
<sample id="1"> ...</sample>
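The difference between the two tags can be checked directly with the parser. This is only a small sketch; `is_well_formed` is a helper name made up for illustration.

```python
from xml.etree import ElementTree

def is_well_formed(text):
    """Return True if `text` parses as XML, False if the parser rejects it."""
    try:
        ElementTree.fromstring(text)
    except ElementTree.ParseError:
        return False
    return True

# The original tag form from the question, and the corrected form.
print(is_well_formed('<sample="1"></sample>'))     # False
print(is_well_formed('<sample id="1"></sample>'))  # True
```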
You could try this solution:
    import glob
    from xml.etree import ElementTree

    def newRunRun(folder):
        xml_files = glob.glob(folder + "/*.xml")
        node = None
        for xmlFile in xml_files:
            tree = ElementTree.parse(xmlFile)
            root = tree.getroot()
            if node is None:
                # The first document becomes the container
                node = root
            else:
                elements = root.find("./results")
                # Iterate the element directly instead of the private _children list
                for element in elements:
                    # node[1] is the <results> element of the container
                    node[1].append(element)
        print ElementTree.tostring(node)

    folder = "resources"
    newRunRun(folder)
As you can see, I'm using the first document as a container and inserting the elements of the other documents into it. This is the output generated:
    <sample id="1">
      <workflow value="x" version="1" />
      <results>
        <result type="Q">
          <result_data type="value" value="11" />
          <result_data type="value" value="21" />
          <result_data type="value" value="13" />
          <result_data type="value" value="12" />
          <result_data type="value" value="15" />
        </result>
        <result type="T">
          <result_data type="value" value="19" />
          <result_data type="value" value="15" />
          <result_data type="value" value="14" />
          <result_data type="value" value="13" />
          <result_data type="value" value="12" />
        </result>
      </results>
    </sample>
This was run with Python 2.7.15.
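For anyone on Python 3, the same idea might look roughly like the sketch below. It is not the code that produced the output above: `merge_results` is a hypothetical name, the `<results>` element is looked up by tag rather than by position, files are sorted so the container is chosen predictably, and the demo files at the bottom are invented purely so the snippet runs on its own.

```python
import glob
import os
import tempfile
from xml.etree import ElementTree

def merge_results(folder):
    """Merge the <results> children of every *.xml file in `folder`.

    The first file that has a <results> element is used as the container.
    Returns the merged document as text, or None if nothing was found.
    """
    container = None
    target = None
    for path in sorted(glob.glob(os.path.join(folder, "*.xml"))):
        root = ElementTree.parse(path).getroot()
        results = root.find("results")
        if results is None:
            continue  # skip files without a <results> element
        if container is None:
            container, target = root, results
        else:
            target.extend(results)  # append the children of <results>
    if container is None:
        return None
    # encoding="unicode" makes tostring() return str instead of bytes
    return ElementTree.tostring(container, encoding="unicode")

# Demo with two made-up files in a temporary folder (assumed layout).
demo_dir = tempfile.mkdtemp()
for name, kind in (("a.xml", "Q"), ("b.xml", "T")):
    with open(os.path.join(demo_dir, name), "w") as handle:
        handle.write(
            '<sample id="1"><workflow value="x" version="1" />'
            '<results><result type="%s" /></results></sample>' % kind
        )

merged = merge_results(demo_dir)
print(merged)
```

Returning the string instead of printing it inside the function makes it easier to write the result to a file or parse it again.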