UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 434852: invalid continuation byte

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 434852: invalid continuation byte


It looks like the file contains bytes encoded as latin1 that are not valid utf8 sequences. The file utility can be useful for figuring out what encoding a file should be treated as, e.g.:

monk@monk-VirtualBox:~$ file foo.txt
foo.txt: UTF-8 Unicode text
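If the file utility is not available, a similar check can be sketched in Python: try to decode the raw bytes as UTF-8 and report where decoding first fails. This is only an illustration; the helper name first_invalid_utf8 and the sample bytes are made up for this example and are not from the original question.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, nearby bytes) for the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that could not be decoded.
        start = max(err.start - 10, 0)
        return err.start, data[start:err.start + 10]
    return None


# Hypothetical sample: text that was saved as Latin-1 rather than UTF-8.
sample = "café au lait".encode("latin-1")
result = first_invalid_utf8(sample)
print(result)
```

For a real file, the bytes would come from open(path, "rb").read(), and the reported offset should match the position given in the UnicodeDecodeError message.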

Here's what the byte 0xe2 means in latin1:

>>> b'\xe2'.decode('latin1')
'â'

Probably the easiest fix is to convert the files to utf8.
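A minimal sketch of such a conversion in Python follows. It assumes the source file really is latin1, which is worth confirming first because latin1 decoding never fails, even on bytes from some other encoding. The function name convert_to_utf8 and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="latin-1"):
    """Re-encode src (assumed to be in source_encoding) as UTF-8 at dst."""
    # Work on raw bytes so that newline characters are not translated.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


# Demo on a throwaway file containing Latin-1 bytes.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "foo.txt"
    dst = Path(tmp) / "foo.utf8.txt"
    src.write_bytes(b"na\xefve caf\xe9")
    convert_to_utf8(src, dst)
    converted = dst.read_bytes()

print(converted.decode("utf-8"))
```

Writing to a separate destination file keeps the original intact in case the assumed source encoding turns out to be wrong.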


I also had the same problem rendering Markup("""yyyyyy"""), but I solved it using an online tool which removed the 'bad' characters: https://pteo.paranoiaworks.mobi/diacriticsremover/

It is a nice tool and even works offline.