Serialize in JSON a base64 encoded data Serialize in JSON a base64 encoded data python-3.x python-3.x

Serialize in JSON a base64 encoded data


You must be careful about the datatypes.

If you read a binary image, you get bytes.If you encode these bytes in base64, you get ... bytes again! (see documentation on b64encode)

json can't handle raw bytes, that's why you get the error.

I have just written some example, with comments, I hope it helps:

from base64 import b64encodefrom json import dumpsENCODING = 'utf-8'IMAGE_NAME = 'spam.jpg'JSON_NAME = 'output.json'# first: reading the binary stuff# note the 'rb' flag# result: byteswith open(IMAGE_NAME, 'rb') as open_file:    byte_content = open_file.read()# second: base64 encode read data# result: bytes (again)base64_bytes = b64encode(byte_content)# third: decode these bytes to text# result: string (in utf-8)base64_string = base64_bytes.decode(ENCODING)# optional: doing stuff with the data# result here: some dictraw_data = {IMAGE_NAME: base64_string}# now: encoding the data to json# result: stringjson_data = dumps(raw_data, indent=2)# finally: writing the json string to disk# note the 'w' flag, no 'b' needed as we deal with text herewith open(JSON_NAME, 'w') as another_open_file:    another_open_file.write(json_data)


Alternative solution would be encoding stuff on the fly with a custom encoder:

import jsonfrom base64 import b64encodeclass Base64Encoder(json.JSONEncoder):    # pylint: disable=method-hidden    def default(self, o):        if isinstance(o, bytes):            return b64encode(o).decode()        return json.JSONEncoder.default(self, o)

Having that defined you can do:

m = {'key': b'\x9c\x13\xff\x00'}json.dumps(m, cls=Base64Encoder)

It will produce:

'{"key": "nBP/AA=="}'


What am I missing?

The error is yelling that a binary is not JSON serializable.

from base64 import b64encode# *binary representation* of the base64 stringassert b64encode(b"binary content")                 == b'YmluYXJ5IGNvbnRlbnQ='# base64 stringassert b64encode(b"binary content").decode('utf-8') ==  'YmluYXJ5IGNvbnRlbnQ='

The latter is definitely "JSON serializable" because is the base64 string representation of the binary b"binary content".