Validating with an XML schema in Python Validating with an XML schema in Python python python

Validating with an XML schema in Python


I am assuming you mean using XSD files. Surprisingly there aren't many python XML libraries that support this. lxml does however. Check Validation with lxml. The page also lists how to use lxml to validate with other schema types.


An example of a simple validator in Python3 using the popular library lxml

Installation lxml

pip install lxml

If you get an error like "Could not find function xmlCheckVersion in library libxml2. Is libxml2 installed?", try to do this first:

# Debian/Ubuntuapt-get install python-dev python3-dev libxml2-dev libxslt-dev# Fedora 23+dnf install python-devel python3-devel libxml2-devel libxslt-devel

The simplest validator

Let's create simplest validator.py

from lxml import etreedef validate(xml_path: str, xsd_path: str) -> bool:    xmlschema_doc = etree.parse(xsd_path)    xmlschema = etree.XMLSchema(xmlschema_doc)    xml_doc = etree.parse(xml_path)    result = xmlschema.validate(xml_doc)    return result

then write and run main.py

from validator import validateif validate("path/to/file.xml", "path/to/scheme.xsd"):    print('Valid! :)')else:    print('Not valid! :(')

A little bit of OOP

In order to validate more than one file, there is no need to create an XMLSchema object every time, therefore:

validator.py

from lxml import etreeclass Validator:    def __init__(self, xsd_path: str):        xmlschema_doc = etree.parse(xsd_path)        self.xmlschema = etree.XMLSchema(xmlschema_doc)    def validate(self, xml_path: str) -> bool:        xml_doc = etree.parse(xml_path)        result = self.xmlschema.validate(xml_doc)        return result

Now we can validate all files in the directory as follows:

main.py

import osfrom validator import Validatorvalidator = Validator("path/to/scheme.xsd")# The directory with XML filesXML_DIR = "path/to/directory"for file_name in os.listdir(XML_DIR):    print('{}: '.format(file_name), end='')    file_path = '{}/{}'.format(XML_DIR, file_name)    if validator.validate(file_path):        print('Valid! :)')    else:        print('Not valid! :(')

For more options read here: Validation with lxml


As for "pure python" solutions: the package index lists:

  • pyxsd, the description says it uses xml.etree.cElementTree, which is not "pure python" (but included in stdlib), but source code indicates that it falls back to xml.etree.ElementTree, so this would count as pure python. Haven't used it, but according to the docs, it does do schema validation.
  • minixsv: 'a lightweight XML schema validator written in "pure" Python'. However, the description says "currently a subset of the XML schema standard is supported", so this may not be enough.
  • XSV, which I think is used for the W3C's online xsd validator (it still seems to use the old pyxml package, which I think is no longer maintained)