UnicodeDecodeError while using json.dumps() [duplicate]


The byte \xe1 on its own is not valid UTF-8 or UTF-16, so neither codec can decode it:

>>> '\xe1'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 0: unexpected end of data
>>> '\xe1'.decode('utf-16')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode byte 0xe1 in position 0: truncated data
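To see why the UTF-8 decode fails, it helps to look at the byte itself. The sketch below assumes Python 3 (the session above is Python 2), so a bytes literal stands in for the Python 2 str; can_decode is a hypothetical helper, not a library function.

```python
raw = b"\xe1"  # Python 3 bytes literal; stands in for the Python 2 str '\xe1'

# 0xE1 is 0b11100001. The 1110xxxx bit pattern marks the first byte of a
# three-byte UTF-8 sequence, so two continuation bytes would have to follow.
is_three_byte_lead = (raw[0] & 0xF0) == 0xE0


def can_decode(data, codec):
    """Return True if data decodes cleanly with codec, False otherwise."""
    try:
        data.decode(codec)
    except UnicodeDecodeError:
        return False
    return True


utf8_ok = can_decode(raw, "utf-8")    # lone lead byte, sequence is incomplete
utf16_ok = can_decode(raw, "utf-16")  # UTF-16 needs at least two bytes per unit

# For comparison, this is how U+00E1 actually looks when encoded as UTF-8.
valid_utf8 = "\u00e1".encode("utf-8")

print(is_three_byte_lead, utf8_ok, utf16_ok, valid_utf8)
```

So the data in the question is most likely single-byte encoded text rather than UTF-8.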

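Try the latin-1 encoding, which maps every byte value to a character (the encoding parameter of json.dumps exists only in Python 2):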

>>> import json
>>> record = (5790, 'Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo            ',
...           60, True, '40141613')
>>> json.dumps(record, encoding='latin1')
'[5790, "Vlv-Gate-Assy-Mdl-\\u00e1M1-2-\\u00e19/16-10K-BB Credit Memo            ", 60, true, "40141613"]'
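On Python 3, json.dumps no longer accepts the encoding argument and bytes values are not serializable at all, so the same idea has to be applied by hand before dumping. A minimal sketch, assuming Python 3 and the record from above with its trailing padding trimmed; decode_bytes is a hypothetical helper:

```python
import json


def decode_bytes(value, codec="latin-1"):
    """Decode bytes with the given codec; leave every other value untouched."""
    return value.decode(codec) if isinstance(value, bytes) else value


# Same record as above, written as Python 3 bytes (trailing spaces trimmed).
record = (5790, b"Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo",
          60, True, b"40141613")

decoded = [decode_bytes(item) for item in record]
encoded = json.dumps(decoded)  # ensure_ascii defaults to True, so \u00e1 is escaped
print(encoded)
```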

Or specify ensure_ascii=False so that json.dumps does not try to decode the string:

>>> json.dumps(record, ensure_ascii=False)
'[5790, "Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo            ", 60, true, "40141613"]'
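A caveat with ensure_ascii=False: in the Python 2 session above the result still contains the raw byte \xe1, which is not valid UTF-8, so whoever reads that JSON has to know the bytes are latin-1. The sketch below, assuming Python 3 and text that has already been decoded, shows the difference between the two settings and encodes the result explicitly:

```python
import json

# The record as already-decoded text (trailing spaces trimmed for brevity).
record = [5790, "Vlv-Gate-Assy-Mdl-\u00e1M1-2-\u00e19/16-10K-BB Credit Memo",
          60, True, "40141613"]

escaped = json.dumps(record)                      # non-ASCII written as \u00e1
literal = json.dumps(record, ensure_ascii=False)  # non-ASCII kept as a character

# With ensure_ascii=False the caller chooses the byte encoding explicitly.
payload = literal.encode("utf-8")
restored = json.loads(payload.decode("utf-8"))

print(escaped)
print(literal)
```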


I had a similar problem and came up with the following approach, which guarantees either unicode strings or byte strings regardless of which kind comes in. In short, define and use the following lambdas:

# guarantee unicode string
_u = lambda t: t.decode('UTF-8', 'replace') if isinstance(t, str) else t
_uu = lambda *tt: tuple(_u(t) for t in tt)
# guarantee byte string in UTF8 encoding
_u8 = lambda t: t.encode('UTF-8', 'replace') if isinstance(t, unicode) else t
_uu8 = lambda *tt: tuple(_u8(t) for t in tt)
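One thing to watch with the 'replace' error handler: a byte string that is not valid UTF-8 (such as the '\xe1' data in the question) is decoded to the replacement character U+FFFD, so the original character is silently lost instead of raising. A sketch of the same helpers as named functions, assuming Python 3, where bytes and str take the roles of Python 2 str and unicode; the function names are hypothetical:

```python
def to_text(value):
    """Guarantee text: decode bytes as UTF-8, pass other values through."""
    return value.decode("utf-8", "replace") if isinstance(value, bytes) else value


def to_utf8(value):
    """Guarantee UTF-8 bytes: encode text, pass other values through."""
    return value.encode("utf-8", "replace") if isinstance(value, str) else value


def all_text(*values):
    return tuple(to_text(v) for v in values)


def all_utf8(*values):
    return tuple(to_utf8(v) for v in values)


# A latin-1 byte is not valid UTF-8, so 'replace' swaps it for U+FFFD.
lossy = to_text(b"Mdl-\xe1M1")
# Text that is already decoded encodes cleanly to two UTF-8 bytes.
encoded = to_utf8("Mdl-\u00e1M1")
# Non-string values such as integers are returned unchanged.
mixed = all_text(5790, b"40141613", "Mdl-\u00e1M1")

print(repr(lossy), encoded, mixed)
```

If the input may be latin-1, decoding with that codec first avoids the loss.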

Applied to your question:

import json
o = (5790, u"Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo            ", 60, True, '40141613')
as_json = json.dumps(_uu8(*o))
as_obj = json.loads(as_json)
print "object\n ", o
print "json (type %s)\n %s " % (type(as_json), as_json)
print "object again\n ", as_obj

=>

object
  (5790, u'Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo            ', 60, True, '40141613')
json (type <type 'str'>)
  [5790, "Vlv-Gate-Assy-Mdl-\u00e1M1-2-\u00e19/16-10K-BB Credit Memo            ", 60, true, "40141613"]
object again
  [5790, u'Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo            ', 60, True, u'40141613']

Here's some more reasoning about this.