lxml.etree.XML ValueError for Unicode string
data = open(module_path+'/data/ex-fire.xslt')
xslt_content = data.read()
This implicitly decodes the bytes in the file to Unicode text, using the default encoding. (This might give wrong results, if the XML file isn't in that encoding.)
xslt_root = etree.XML(xslt_content)
XML has its own handling and signalling for encodings: the <?xml encoding="..."?> prolog. If you pass a Unicode string starting with <?xml encoding="..."?> to a parser, the parser would like to reinterpret the rest of the byte string using that encoding... but can't, because you've already decoded the byte input to a Unicode string. That is why lxml raises the ValueError.
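To make the failure concrete, here is a minimal sketch that reproduces it without touching the filesystem. The document content and the names xml_bytes, xml_text and parse_error are made up for illustration; only etree.XML comes from the question.

```python
from lxml import etree

# Hypothetical document: raw bytes carrying an encoding declaration.
xml_bytes = b'<?xml version="1.0" encoding="UTF-8"?><root>caf\xc3\xa9</root>'

# Decoding first is what reading the file in text mode does implicitly.
xml_text = xml_bytes.decode("utf-8")


def parse_error(source):
    """Return the ValueError message etree.XML raises for source, or None."""
    try:
        etree.XML(source)
    except ValueError as exc:
        return str(exc)
    return None


# Decoded text plus an encoding declaration: lxml rejects it.
text_error = parse_error(xml_text)

# The same document as undecoded bytes parses normally.
bytes_error = parse_error(xml_bytes)
root = etree.XML(xml_bytes)
```

With the lxml versions I'm aware of, text_error holds a message about Unicode strings with an encoding declaration not being supported, while bytes_error stays None.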
Instead, you should either pass the undecoded byte string to the parser:
data = open(module_path+'/data/ex-fire.xslt', 'rb')
xslt_content = data.read()
xslt_root = etree.XML(xslt_content)
or, better, just have the parser read straight from the file:
xslt_root = etree.parse(module_path+'/data/ex-fire.xslt')
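Both options can be tried end to end with a throwaway file. This is only a sketch: the stylesheet content, the example.xslt file name and the variable names are invented for the demo, and the file is deliberately saved as ISO-8859-1 to show that the parser honours the declared encoding when it gets the raw bytes. Note that etree.parse returns an ElementTree, so getroot() is needed to get the same kind of object etree.XML returns.

```python
import os
import tempfile

from lxml import etree

# Hypothetical stylesheet, stored as ISO-8859-1 with a matching declaration.
xslt_bytes = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">\n'
    '  <xsl:template match="/"><out>caf\u00e9</out></xsl:template>\n'
    '</xsl:stylesheet>\n'
).encode("iso-8859-1")

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.xslt")
    with open(path, "wb") as f:
        f.write(xslt_bytes)

    # Option 1: read undecoded bytes and hand them to etree.XML.
    with open(path, "rb") as f:
        root_from_bytes = etree.XML(f.read())

    # Option 2: let the parser open and decode the file itself.
    root_from_file = etree.parse(path).getroot()

# Either root can be compiled into a transform and applied to a document.
transform = etree.XSLT(root_from_file)
result = transform(etree.XML(b"<doc/>"))
output_text = result.getroot().text
```

In both cases the parser sees the prolog before any decoding has happened, so the non-ASCII character in the template survives intact.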
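You can also decode the UTF-8 string and encode it as ASCII before passing it to etree.XML. This only works if the content has no non-ASCII characters; otherwise the encode step raises UnicodeEncodeError.
xslt_content = data.read()
xslt_content = xslt_content.decode('utf-8').encode('ascii')
xslt_root = etree.XML(xslt_content)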
I made it work by simply re-encoding with the default options (str.encode() defaults to UTF-8 in Python 3):
xslt_content = data.read().encode()