Flask - headers are not converted to unicode?


At http://flask.pocoo.org/docs/api/#flask.request we read

The request object is an instance of a Request subclass and provides all of the attributes Werkzeug defines.

The word Request links to http://werkzeug.pocoo.org/docs/wrappers/#werkzeug.wrappers.Request where we read

The Request and Response classes subclass the BaseRequest and BaseResponse classes and implement all the mixins Werkzeug provides:

The word BaseRequest links to http://werkzeug.pocoo.org/docs/wrappers/#werkzeug.wrappers.BaseRequest where we read

headers
The headers from the WSGI environ as immutable EnvironHeaders.

The word EnvironHeaders links to http://werkzeug.pocoo.org/docs/datastructures/#werkzeug.datastructures.EnvironHeaders where we read

This provides the same interface as Headers and is constructed from a WSGI environment.

The word Headers is... no, it's not linked, but it should have been linked to http://werkzeug.pocoo.org/docs/datastructures/#werkzeug.datastructures.Headers where we read

Headers is mostly compatible with the Python wsgiref.headers.Headers class

where the phrase wsgiref.headers.Headers links to http://docs.python.org/dev/library/wsgiref.html#wsgiref.headers.Headers where we read

Create a mapping-like object wrapping headers, which must be a list of header name/value tuples as described in PEP 3333.

The phrase PEP 3333 links to http://www.python.org/dev/peps/pep-3333/ which gives no explicit definition of what type headers should be, but after searching for the word "headers" for a while we find this statement

WSGI therefore defines two kinds of "string":

"Native" strings (which are always implemented using the type named str) that are used for request/response headers and metadata

"Bytestrings" (which are implemented using the `bytes` type in Python 3, and `str` elsewhere), that are used for the bodies of requests and responses (e.g. POST/PUT input data and HTML page outputs).

In Python 2 the type named str is a byte string, and that's why in Python 2 you get headers as str, not unicode.
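The same split can be observed with nothing but the standard library. The following is a minimal sketch, assuming Python 3 (where the native string type is the text type str); the X-Request-Id header and its value are hypothetical and only there for illustration.

```python
from wsgiref.headers import Headers
from wsgiref.util import setup_testing_defaults

# Build a minimal WSGI environ with the standard library's test helper.
environ = {}
setup_testing_defaults(environ)

# Hypothetical request header, added only for illustration.
environ["HTTP_X_REQUEST_ID"] = "abc123"

header_value = environ["HTTP_X_REQUEST_ID"]  # header: "native" string
body = environ["wsgi.input"].read()          # body: bytestring

# wsgiref's Headers wraps a list of (name, value) tuples of native strings.
headers = Headers([("X-Request-Id", header_value)])

print(type(header_value).__name__, type(body).__name__)
```

The header value has the type named str and the body has the type bytes, which matches the two kinds of "string" quoted from the PEP.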

Now let's move to decoding.

Neither your .decode('utf-8') nor mensi's .decode('ascii') (nor blindly expecting any other encoding) is universally good, because "In theory, HTTP header field values can transport anything; the tricky part is to get all parties (sender, receiver, and intermediates) to agree on the encoding." Having said that, I think you should act according to Julian Reschke's advice

Thus, the safe way to do this is to stick to ASCII, and choose an encoding on top of that, such as the one defined in RFC 5987.

after checking that User Agents (browsers) you support have implemented it.

The title of RFC 5987 is "Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters".
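To give an idea of what "an encoding on top of ASCII" looks like, here is a rough sketch of the RFC 5987 extended value format (charset'language'percent-encoded-bytes) using only urllib.parse. The function names encode_ext_value and decode_ext_value are made up for this example, and the sketch does not validate the charset or language tag, so treat it as an illustration rather than a complete implementation.

```python
from urllib.parse import quote, unquote_to_bytes

# attr-char symbols from RFC 5987 beyond the ones quote() already leaves
# unescaped (letters, digits, "-", ".", "_", "~" on current Python 3).
_ATTR_CHAR_EXTRA = "!#$&+^`|"


def encode_ext_value(text, charset="UTF-8", language=""):
    """Encode text as charset'language'pct-encoded (hypothetical helper)."""
    encoded = quote(text, safe=_ATTR_CHAR_EXTRA, encoding=charset)
    return f"{charset}'{language}'{encoded}"


def decode_ext_value(ext_value):
    """Decode a charset'language'pct-encoded value back to text."""
    charset, _language, encoded = ext_value.split("'", 2)
    return unquote_to_bytes(encoded).decode(charset)


print(encode_ext_value("£ rates"))
print(decode_ext_value("iso-8859-1'en'%A3%20rates"))
```

The result is pure ASCII on the wire, so it would be sent as a parameter such as title*=UTF-8''%C2%A3%20rates, and the receiver learns the real charset from the value itself instead of guessing.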


Header values are ASCII; see the linked questions by Acorn.

What you can do here is either decode it manually like you did (although you should use uuid.decode('ascii') and not 'utf-8') or change your field to be RawStr instead of Unicode.
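A manual decode along those lines could look like the sketch below. The helper name decode_ascii_header is hypothetical; it assumes the raw value arrives as bytes (as a Python 2 str would be) and also accepts text that is already decoded, checking only that it is pure ASCII.

```python
def decode_ascii_header(raw_value):
    """Return a header value as text, rejecting anything outside ASCII."""
    if isinstance(raw_value, str):
        # Already text (as on Python 3); still verify it is pure ASCII.
        raw_value.encode("ascii")
        return raw_value
    # Strict decode: raises UnicodeDecodeError on any non-ASCII byte.
    return raw_value.decode("ascii")


uuid_text = decode_ascii_header(b"123e4567-e89b-12d3-a456-426614174000")
print(uuid_text)
```

Failing loudly on non-ASCII bytes is deliberate here: a UUID header that contains them is malformed, and silently guessing another encoding would hide that.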