Python "string_escape" vs "unicode_escape"
According to my interpretation of the implementation of unicode-escape
and the unicode repr
in the CPython 2.6.5 source, yes; the only difference between repr(unicode_string)
and unicode_string.encode('unicode-escape')
is the inclusion of wrapping quotes and escaping whichever quote was used.
They are both driven by the same function, unicodeescape_string
. This function takes a parameter whose sole function is to toggle the addition of the wrapping quotes and escaping of that quote.
Within the range 0 ≤ c < 128, yes the '
is the only difference for CPython 2.6.
>>> set(unichr(c).encode('unicode_escape') for c in range(128)) - set(chr(c).encode('string_escape') for c in range(128))set(["'"])
Outside of this range the two types are not exchangeable.
>>> '\x80'.encode('string_escape')'\\x80'>>> '\x80'.encode('unicode_escape')Traceback (most recent call last): File "<stdin>", line 1, in <module>UnicodeDecodeError: 'ascii' codec can’t decode byte 0x80 in position 0: ordinal not in range(128)>>> u'1'.encode('unicode_escape')'1'>>> u'1'.encode('string_escape')Traceback (most recent call last): File "<stdin>", line 1, in <module>TypeError: escape_encode() argument 1 must be str, not unicode
On Python 3.x, the string_escape
encoding no longer exists, since str
can only store Unicode.