Best output type and encoding practices for __repr__() functions?
In Python 2, __repr__ (and __str__) must return a string object, not a unicode object. In Python 3, the situation is reversed: __repr__ and __str__ must return unicode objects, not byte (née string) objects:

    class Foo(object):
        def __repr__(self):
            return u'\N{WHITE SMILING FACE}'

    class Bar(object):
        def __repr__(self):
            return u'\N{WHITE SMILING FACE}'.encode('utf8')

    repr(Bar())
    # ☺
    repr(Foo())
    # UnicodeEncodeError: 'ascii' codec can't encode character u'\u263a' in position 0: ordinal not in range(128)

In Python 2, you don't really have a choice: you have to pick an encoding for the return value of __repr__.
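For the Python 3 half of that statement, here is a minimal sketch, assuming Python 3 and reusing the Foo and Bar names from the example above. It only shows that repr() accepts a text string and rejects bytes. The output is kept ASCII-only so it does not depend on the terminal encoding.

```python
# Assumes Python 3. Foo and Bar mirror the classes in the example above.

class Foo(object):
    def __repr__(self):
        # A text (unicode) string is what Python 3 expects here.
        return '\N{WHITE SMILING FACE}'


class Bar(object):
    def __repr__(self):
        # Bytes are rejected by repr() on Python 3.
        return '\N{WHITE SMILING FACE}'.encode('utf8')


# Compare instead of printing the character, to stay ASCII-only on stdout.
print(repr(Foo()) == '\u263a')

try:
    repr(Bar())
except TypeError as exc:
    # repr() refuses a non-str return value.
    print('TypeError:', exc)
```

So on Python 3 the roles of Foo and Bar are swapped compared to the Python 2 session: Foo works and Bar fails.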
By the way, have you read the PrintFails wiki? It may not directly answer your other questions, but I did find it helpful in illuminating why certain errors occur.
When using from __future__ import unicode_literals, the expression

    ('<{}>'.format(repr(x).decode('utf-8'))).encode('utf-8')

can be more simply written as

    str('<{}>').format(repr(x))

assuming str encodes to utf-8 on your system.
Without from __future__ import unicode_literals, the expression can be written as:

    '<{}>'.format(repr(x))

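The claim that the decode/format/encode round trip and the bytes-only form give the same result can be checked at the byte level. This is only a rough sketch: raw_repr is a made-up stand-in for the UTF-8 bytes a Python 2 __repr__ would return, and since Python 3 bytes have no .format(), plain concatenation stands in for the bytes-only formatting.

```python
# Stand-in (assumption) for the UTF-8 bytes a Python 2 __repr__ would return.
raw_repr = u'\N{WHITE SMILING FACE}'.encode('utf-8')

# Long form: decode to text, format as text, encode back to UTF-8.
long_form = u'<{}>'.format(raw_repr.decode('utf-8')).encode('utf-8')

# Short form: stay in bytes the whole time. Concatenation is used here
# because Python 3 bytes objects do not have a .format() method.
short_form = b'<' + raw_repr + b'>'

print(long_form == short_form)  # True
```

The two only agree because both sides use UTF-8. With a different encoding on one side the bytes would differ.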
I think a decorator can manage __repr__ incompatibilities in a sane way. Here's what I use:

    from __future__ import unicode_literals, print_function
    import sys

    def force_encoded_string_output(func):
        if sys.version_info.major < 3:
            # Python 2: __repr__ must return bytes, so encode the unicode result.
            def _func(*args, **kwargs):
                return func(*args, **kwargs).encode(sys.stdout.encoding or 'utf-8')
            return _func
        else:
            # Python 3: __repr__ already returns text, so leave it untouched.
            return func

    class MyDummyClass(object):
        @force_encoded_string_output
        def __repr__(self):
            return 'My Dummy Class! \N{WHITE SMILING FACE}'

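A possible variation on this decorator, offered as a sketch rather than a drop-in replacement. The two changes are my own additions, not part of the answer above: functools.wraps keeps the wrapped method's name and docstring, and the 'backslashreplace' error handler avoids a UnicodeEncodeError when stdout uses a narrow encoding such as ASCII.

```python
from __future__ import unicode_literals

import functools
import sys


def force_encoded_string_output(func):
    """On Python 2, encode func's unicode result; on Python 3, return func as-is."""
    if sys.version_info[0] >= 3:
        return func

    @functools.wraps(func)  # keep __name__ and __doc__ of the wrapped method
    def wrapper(*args, **kwargs):
        # Fall back to UTF-8 when stdout has no encoding (e.g. when piped).
        encoding = getattr(sys.stdout, 'encoding', None) or 'utf-8'
        # Assumption: escaping unencodable characters is preferable to raising.
        return func(*args, **kwargs).encode(encoding, 'backslashreplace')
    return wrapper


class MyDummyClass(object):
    @force_encoded_string_output
    def __repr__(self):
        return 'My Dummy Class! \N{WHITE SMILING FACE}'


# On either major version, repr() hands back the native str type.
print(type(repr(MyDummyClass())).__name__)
```

The trade-off with 'backslashreplace' is that the output may contain escapes like \u263a instead of the character itself, which seems acceptable for a debugging representation.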
I use a function like the following:

    import sys

    def stdout_encode(u, default='UTF8'):
        # Use the terminal's encoding when known, otherwise fall back to the default.
        if sys.stdout.encoding:
            return u.encode(sys.stdout.encoding)
        return u.encode(default)

Then my __repr__ functions look like this:

    def __repr__(self):
        return stdout_encode(u'<MyClass {0} {1}>'.format(self.abcd, self.efgh))

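As written, that helper always returns bytes, which Python 3 rejects as a __repr__ result. Below is a hedged sketch of a version-aware variant. The version check, the getattr fallback and the 'backslashreplace' handler are my assumptions, not part of the original function, and MyClass with its abcd and efgh attributes is filled in only so the example runs.

```python
import sys


def stdout_encode(u, default='utf-8'):
    """Return u as the native str type that __repr__ is expected to return."""
    if sys.version_info[0] >= 3:
        # Python 3: __repr__ must return text, so pass it through unchanged.
        return u
    # Python 2: encode for the terminal, falling back to the default encoding.
    encoding = getattr(sys.stdout, 'encoding', None) or default
    # Assumption: escape characters the terminal cannot represent.
    return u.encode(encoding, 'backslashreplace')


class MyClass(object):
    # Hypothetical class; only the attribute names come from the snippet above.
    def __init__(self, abcd, efgh):
        self.abcd = abcd
        self.efgh = efgh

    def __repr__(self):
        return stdout_encode(u'<MyClass {0} {1}>'.format(self.abcd, self.efgh))


print(repr(MyClass(1, 2)))  # <MyClass 1 2>
```

The same __repr__ body then works unchanged on both major versions, at the cost of one extra function call.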