SQLAlchemy/MySQL binary blob is being utf-8 encoded?

python-3.x


Turns out that this was a driver issue. Apparently the default MySQL driver stumbles over Python 3 and UTF-8 support. Installing cymysql into the virtual Python environment resolved the problem and the warnings disappeared.

The fix: find out whether MySQL connects through a socket or a port (see here), and then modify the connection string accordingly. In my case, using a socket connection:

mysql+cymysql://user:pwd@localhost/database?unix_socket=/var/run/mysqld/mysqld.sock

Use the port argument otherwise.
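
For reference, a minimal sketch of how a blob round-trips once the cymysql URL is in place. The table and column names are hypothetical; LargeBinary maps to MySQL's plain BLOB, so for blobs in the multi-megabyte range you would use the dialect-specific LONGBLOB type instead:

from sqlalchemy import create_engine, MetaData, Table, Column, Integer, LargeBinary

# Socket-based connection string from above; user/pwd/database are placeholders.
engine = create_engine(
    "mysql+cymysql://user:pwd@localhost/database"
    "?unix_socket=/var/run/mysqld/mysqld.sock"
)

metadata = MetaData()
blobs = Table(
    "blobs", metadata,
    Column("id", Integer, primary_key=True),
    Column("data", LargeBinary),  # emitted as BLOB on MySQL; should round-trip raw bytes
)
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(blobs.insert().values(data=b"\x00\xff\xfe raw bytes"))
    row = conn.execute(blobs.select()).first()
    assert isinstance(row.data, bytes)  # no utf-8 decoding of binary data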

Edit: While the above fixed the encoding issue, it gave rise to another one: blob size. Due to a bug in CyMySQL, blobs larger than 8 MB fail to commit. Switching to PyMySQL fixed that problem, although it seems to have a similar issue with very large blobs.
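
A sketch of the switch, assuming the same socket setup: the only required change is the driver name in the URL. The max_allowed_packet argument is an assumption on my part; PyMySQL accepts it as a connect-time option, and the server's own max_allowed_packet setting must also be large enough for big blobs:

from sqlalchemy import create_engine

engine = create_engine(
    "mysql+pymysql://user:pwd@localhost/database"
    "?unix_socket=/var/run/mysqld/mysqld.sock",
    # Assumption: raise PyMySQL's client-side packet limit for large blobs;
    # the MySQL server's max_allowed_packet must allow this size as well.
    connect_args={"max_allowed_packet": 32 * 1024 * 1024},
)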


Not sure, but your problem might have the same roots as the one I had several years ago in Python 2.7: https://stackoverflow.com/a/9535736/68998. In short, MySQL's interface does not let you tell for certain whether you are working with a true binary string or with text in a binary collation (used because there is no case-sensitive utf8 collation). Therefore, a MySQL binding has the following options:

  • return all string fields as binary strings, and leave the decoding to you
  • decode only the fields that do not have a binary flag (so much fun when some of the fields are unicode and others are str)
  • have an option to force decoding to unicode for all string fields, even true binary ones

My guess is that in your case the third option is enabled somewhere in the underlying MySQL binding, and the first suspect is your connection string (connection params).
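
One way to check which of these options your binding is using is to fetch a blob through the raw DBAPI connection, bypassing SQLAlchemy's type layer, and look at the Python type that comes back. This is only an illustrative sketch; the URL, table, and column names are placeholders:

from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://user:pwd@localhost/database")

with engine.connect() as conn:
    raw = conn.connection  # the underlying DBAPI connection
    cur = raw.cursor()
    cur.execute("SELECT data FROM blobs LIMIT 1")
    value = cur.fetchone()[0]
    # bytes -> the driver leaves binary fields alone (options 1/2 above)
    # str   -> the driver force-decodes everything (option 3); check the
    #          charset/unicode-related parameters in your connection string
    print(type(value))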