Reading PASCAL VOC annotations in Python
Here's a fairly simple solution to your problem.
The function below returns the filename together with the box coordinates as a nested list, with one [xmin, ymin, xmax, ymax] entry per object.
I once struggled with bndbox tags that were mixed up (ymin, xmin, ...) or in other strange orders, so this code reads each coordinate by its tag name rather than by its position.
I have finally updated the code. Thanks to craq and Pritesh Gohil, you were absolutely right.
Hope it helps...
    import xml.etree.ElementTree as ET


    def read_content(xml_file: str):
        """Return the image filename and a list of [xmin, ymin, xmax, ymax] boxes."""
        tree = ET.parse(xml_file)
        root = tree.getroot()

        # Read the filename once, outside the loop, so it is defined
        # even when the annotation contains no <object> elements.
        filename = root.find("filename").text

        list_with_all_boxes = []
        for boxes in root.iter("object"):
            # Look up each coordinate by tag name, so the order of the
            # tags inside <bndbox> does not matter.
            xmin = int(boxes.find("bndbox/xmin").text)
            ymin = int(boxes.find("bndbox/ymin").text)
            xmax = int(boxes.find("bndbox/xmax").text)
            ymax = int(boxes.find("bndbox/ymax").text)

            list_with_single_boxes = [xmin, ymin, xmax, ymax]
            list_with_all_boxes.append(list_with_single_boxes)

        return filename, list_with_all_boxes


    name, boxes = read_content("file.xml")
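To see what "reading the tags, not the position" means in practice, here is a minimal sketch that parses a small in-memory annotation instead of a file. The XML content, the filename `example.jpg`, and the helper name `read_boxes_from_string` are all made up for illustration; the first object deliberately has its bndbox tags in a mixed-up order.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, not taken from any real dataset.
# The first <bndbox> lists its children out of order on purpose.
SAMPLE_XML = """
<annotation>
    <filename>example.jpg</filename>
    <object>
        <name>dog</name>
        <bndbox>
            <ymin>20</ymin>
            <xmin>10</xmin>
            <ymax>220</ymax>
            <xmax>110</xmax>
        </bndbox>
    </object>
    <object>
        <name>cat</name>
        <bndbox>
            <xmin>30</xmin>
            <ymin>40</ymin>
            <xmax>130</xmax>
            <ymax>240</ymax>
        </bndbox>
    </object>
</annotation>
"""

# Fixed output order, independent of the order in the XML.
COORD_TAGS = ("xmin", "ymin", "xmax", "ymax")


def read_boxes_from_string(xml_text):
    """Illustrative helper: same idea as read_content, but for an XML string."""
    root = ET.fromstring(xml_text.strip())
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        # Each coordinate is looked up by its tag name under <bndbox>.
        box = [int(obj.findtext(f"bndbox/{tag}")) for tag in COORD_TAGS]
        boxes.append(box)
    return filename, boxes


name, boxes = read_boxes_from_string(SAMPLE_XML)
print(name)   # example.jpg
print(boxes)  # [[10, 20, 110, 220], [30, 40, 130, 240]]
```

Both boxes come out as [xmin, ymin, xmax, ymax] even though the first one stores ymin before xmin. This sketch assumes every coordinate is an integer string; if your files contain values like "10.0", `int()` will raise a ValueError and you would need to convert via `float()` first.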