Unicode Encode Error when writing pandas df to csv
You have Unicode values in your DataFrame. Files store bytes, which means all Unicode text has to be encoded into bytes before it can be stored in a file. You have to specify an encoding, such as utf-8. For example,
df.to_csv('path', header=True, index=False, encoding='utf-8')
If you don't specify an encoding, then the encoding used by df.to_csv defaults to ascii in Python 2, or utf-8 in Python 3.
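To illustrate why the encoding matters, here is a minimal sketch that uses only plain Python 3 strings (no pandas). The sample text is made up for illustration; it just shows that a non-ASCII value cannot be encoded as ascii but round-trips cleanly through utf-8.

```python
# Sample value containing non-ASCII characters (chosen for illustration only).
text = "café ☃"

# Encoding to ASCII fails for any character outside the ASCII range.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent every Unicode code point, so this succeeds.
utf8_bytes = text.encode("utf-8")

# Decoding with the same encoding recovers the original text.
round_trip = utf8_bytes.decode("utf-8")

print(ascii_ok)            # whether the ASCII encode succeeded
print(utf8_bytes)          # the bytes that would be written to the file
print(round_trip == text)  # whether the text survived the round trip
```

The same failure is what surfaces as a UnicodeEncodeError when the writer falls back to ascii.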
Adding an answer to help myself google it later:
One trick that helped me is to encode a problematic series with unicode-escape first, then decode the result as utf-8, like:
df['crumbs'] = df['crumbs'].map(lambda x: x.encode('unicode-escape').decode('utf-8'))
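This gets the DataFrame to print correctly too. Note that non-ASCII characters end up as literal escape sequences (for example \xe9) in the data rather than as the original characters.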
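As a rough sketch of what that lambda does to a single value, here it is applied to a plain string in Python 3. The helper name escape_non_ascii and the sample text are hypothetical, introduced only for this illustration.

```python
def escape_non_ascii(value):
    """Hypothetical helper: same transformation as the lambda above."""
    # unicode-escape produces ASCII-only bytes with backslash escapes,
    # so decoding them as utf-8 always succeeds.
    return value.encode("unicode-escape").decode("utf-8")


sample = "café ☃"  # illustrative value, not from the original question
escaped = escape_non_ascii(sample)

# The result contains only ASCII characters, so an ascii writer accepts it.
only_ascii = all(ord(ch) < 128 for ch in escaped)

# The escaping is reversible if the original text is needed again.
restored = escaped.encode("utf-8").decode("unicode-escape")

print(escaped)
print(only_ascii)
print(restored == sample)
```

Because the stored text now contains escape sequences, anything reading the CSV later would have to reverse the step to get the original characters back; writing with encoding='utf-8' avoids that.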