Python and hebrew encoding/decoding error


You are passing the fabricated names as string-formatting parameters to a Unicode string. Ideally, strings passed this way should be Unicode as well.

But fabricate_hebrew_name isn't returning Unicode. It is returning a UTF-8 encoded byte string, which isn't the same thing.

So, get rid of the call to encode('utf-8') and see whether that helps.
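As a minimal sketch of the difference, the snippet below compares a Unicode string with its UTF-8 encoded form. The template string and the to_text helper are hypothetical and only serve the illustration; they are not part of the original code. On Python 2, formatting a Unicode template with the encoded bytes triggers an implicit ASCII decode, which fails with a UnicodeDecodeError for Hebrew text; on Python 3 the same operation silently inserts the bytes' repr instead.

```python
# -*- coding: utf-8 -*-
name = u'דנה'                     # Unicode text: 3 code points
encoded = name.encode('utf-8')    # byte string: each Hebrew letter takes 2 bytes

assert len(name) == 3
assert len(encoded) == 6


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return Unicode text, decoding byte strings first."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical Unicode template, standing in for the real format string.
template = u"name: %s"

# Formatting is safe once every parameter is Unicode.
good = template % to_text(encoded)
assert good == template % name
```

If fabricate_hebrew_name simply returns the u'...' literals without encoding them, no helper like this is needed at all.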

The next question is what type runsql is expecting. If it is expecting Unicode, no problem. If it is expecting an ASCII-encoded string, then you will have problems because the Hebrew is not ASCII. In the unlikely case that it is expecting a UTF-8 encoded string, then that is the time to convert it: after the substitution is done.
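The ordering can be sketched as follows, assuming a hypothetical template. What runsql really accepts is not known from the question, so this only shows the two encodings discussed above. The % substitution is used here purely to illustrate when to encode.

```python
# -*- coding: utf-8 -*-
template = u"name = '%s'"    # hypothetical Unicode template
name = u'תמי'

# 1. Substitute while everything is still Unicode.
text = template % name

# 2. Encode once, at the very end, only if the consumer wants UTF-8 bytes.
utf8_bytes = text.encode('utf-8')

# An ASCII-only consumer cannot represent the Hebrew letters at all.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

Here ascii_ok ends up False, which is the failure mode to expect if runsql insists on ASCII.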

In another answer, Ignacio Vazquez-Abrams warns against string interpolation in queries. The concept here is that instead of doing the string substitution with the % operator, you should generally use a parameterised query and pass the Hebrew strings as parameters to it. This may have some advantages in query optimisation and security against SQL injection.

Example

# -*- coding: utf-8 -*-
import random
import sqlite3

# Create the database in memory.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE personal ("
            "id INTEGER PRIMARY KEY,"
            "name VARCHAR(42) NOT NULL)")

# Return a random name as a Unicode string (note: no .encode('utf-8')).
fabricate_hebrew_name = lambda: random.choice([
    u'ירדן', u'יפה', u'תמי', u'ענת', u'רבקה', u'טלי', u'גינה', u'דנה', u'ימית',
    u'אלונה', u'אילן', u'אדם', u'חווה'])

# Insert the name through a named parameter instead of % interpolation.
cur.execute("INSERT INTO personal VALUES("
            "NULL, :name)", dict(name=fabricate_hebrew_name()))
conn.commit()

id, name = cur.execute("SELECT * FROM personal").fetchone()
print(u"%d %s" % (id, name))
# -> 1 אלונה (the name is chosen at random, so yours will probably differ)
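The injection point mentioned earlier can be illustrated with a small sketch. It uses the same table layout as the example, sqlite3's positional ? placeholder, and a made-up name containing an apostrophe; the name and the variable names are assumptions chosen for the demonstration.

```python
# -*- coding: utf-8 -*-
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE personal ("
            "id INTEGER PRIMARY KEY,"
            "name VARCHAR(42) NOT NULL)")

# Hypothetical test value: the apostrophe would end a quoted SQL literal early.
tricky_name = u"צ'רלי"

# Parameterised: the driver passes the value separately from the SQL text.
cur.execute("INSERT INTO personal VALUES(NULL, ?)", (tricky_name,))
conn.commit()
stored = cur.execute("SELECT name FROM personal").fetchone()[0]

# Interpolated: the apostrophe corrupts the statement, so SQLite rejects it.
try:
    cur.execute(u"INSERT INTO personal VALUES(NULL, '%s')" % tricky_name)
    interpolation_failed = False
except sqlite3.Error:
    interpolation_failed = True
```

With the placeholder, the name round-trips unchanged. With interpolation, the best case is a syntax error like this one, and the worst case is that attacker-controlled text gets executed as SQL.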