Python 3 CSV file giving UnicodeDecodeError: 'utf-8' codec can't decode byte error when I print


We know the file contains the byte b'\x96' since it is mentioned in the error message:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 7386: invalid start byte
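To look at the offending byte and its neighbours yourself, one option is to read the file in binary mode and let the exception report where decoding failed. The following is only a minimal sketch: `describe_decode_error` is a hypothetical helper name, and the sample bytes are made up for illustration.

```python
def describe_decode_error(data: bytes, encoding: str = "utf-8", context: int = 10):
    """Return None if data decodes cleanly, else details of the first failure."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        lo = max(exc.start - context, 0)
        return {
            "position": exc.start,                   # offset of the first bad byte
            "bad_bytes": data[exc.start:exc.end],    # the byte(s) that failed
            "context": data[lo:exc.end + context],   # surrounding raw bytes
            "reason": exc.reason,
        }
    return None


# Illustrative sample: "España" encoded as mac_roman, where ñ is byte 0x96.
sample = b"id,country\n1,Espa\x96a\n"
info = describe_decode_error(sample)
print(info)
```

For a real file, the bytes could come from `open('my_file.csv', 'rb').read()`.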

Now we can write a little script to find out if there are any encodings where b'\x96' decodes to ñ:

import pkgutil
import encodings
import os


def all_encodings():
    modnames = set([modname for importer, modname, ispkg in pkgutil.walk_packages(
        path=[os.path.dirname(encodings.__file__)], prefix='')])
    aliases = set(encodings.aliases.aliases.values())
    return modnames.union(aliases)


text = b'\x96'
for enc in all_encodings():
    try:
        msg = text.decode(enc)
    except Exception:
        continue
    if msg == 'ñ':
        print('Decoding {t} with {enc} is {m}'.format(t=text, enc=enc, m=msg))

which yields

Decoding b'\x96' with mac_roman is ñ
Decoding b'\x96' with mac_farsi is ñ
Decoding b'\x96' with mac_croatian is ñ
Decoding b'\x96' with mac_arabic is ñ
Decoding b'\x96' with mac_romanian is ñ
Decoding b'\x96' with mac_iceland is ñ
Decoding b'\x96' with mac_turkish is ñ

Therefore, try changing

with open('my_file.csv', 'r', newline='') as csvfile:

so that it passes one of those encodings, such as:

with open('my_file.csv', 'r', encoding='mac_roman', newline='') as csvfile:
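Put together with the csv module, reading such a file might look like the sketch below. So that it runs on its own, it first writes a small mac_roman-encoded sample to a temporary file; the sample rows and the temporary file are illustrative assumptions, not part of the original question.

```python
import csv
import os
import tempfile

# Create a small sample file encoded as mac_roman (illustrative data only).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write("name,city\nBegoña,Logroño\n".encode("mac_roman"))
    path = tmp.name

try:
    # newline='' lets the csv module handle line endings itself.
    with open(path, "r", encoding="mac_roman", newline="") as csvfile:
        rows = list(csv.reader(csvfile))
finally:
    os.remove(path)

for row in rows:
    print(row)
```

If the printed text contains unexpected symbols where accented letters should be, the chosen encoding is probably not the one the file was written with.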


with open('my_file.csv', 'r', newline='', encoding='ISO-8859-1') as csvfile:

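The error means the file is not UTF-8 encoded, so the byte that stands for ñ cannot be decoded as UTF-8. To avoid the error, you can open the file with the ISO-8859-1 encoding instead. Note that ISO-8859-1 accepts every byte value, so it never raises a decode error, but it maps 0x96 to a control character rather than ñ (ñ is 0xF1 in ISO-8859-1), so check the decoded text before relying on it. For more details about this encoding, see: https://www.ic.unicamp.br/~stolfi/EXPORT/www/ISO-8859-1-Encoding.html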
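A quick way to check whether a candidate encoding gives the character you expect is to decode the byte from the error message under each one. This is a small sketch; the three encodings compared here are just examples chosen for illustration.

```python
raw = b"\x96"  # the byte reported in the error message

# Decode the same byte under a few single-byte encodings.
decoded = {enc: raw.decode(enc) for enc in ("iso-8859-1", "cp1252", "mac_roman")}
for enc, text in decoded.items():
    print(f"{enc:>10}: {text!r} (U+{ord(text):04X})")

# For comparison: how ñ itself is encoded in UTF-8 and ISO-8859-1.
utf8_bytes = "ñ".encode("utf-8")
latin1_bytes = "ñ".encode("iso-8859-1")
print(utf8_bytes, latin1_bytes)
```

Only the encoding whose output matches the text you expect to see in the file is a plausible candidate.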


For others who hit the same error shown in the subject, watch out for the file encoding of your CSV file. It's possible it is not UTF-8. I just noticed that LibreOffice created a UTF-16 encoded file for me today without prompting me, although I could not reproduce this.

If you try to open a utf-16 encoded document using open(... encoding='utf-8'), you will get the error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

To fix this, either specify encoding='utf-16' when opening the file or convert the CSV file to UTF-8.
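The 0xff at position 0 is typically the first byte of a byte order mark (BOM). If you are unsure which case you have, one heuristic is to look at the first few bytes of the file before choosing an encoding. The sketch below is an assumption-laden illustration, not a full detector: `encoding_from_bom` is a hypothetical helper, it only recognises files that start with a BOM, and it falls back to a default otherwise.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def encoding_from_bom(head: bytes, default: str = "utf-8") -> str:
    """Guess an encoding name from the leading bytes of a file."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


# Illustrative sample: Python's "utf-16" codec writes a BOM when encoding.
sample = "a,b\n1,2\n".encode("utf-16")
enc = encoding_from_bom(sample[:4])
print(enc, repr(sample.decode(enc)))
```

For a real file, `head` could be read with `open('my_file.csv', 'rb').read(4)`, and the result passed as the `encoding` argument of the text-mode `open` call.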