Python eval: is it still dangerous if I disable builtins and attribute access?


I'm going to mention one of the new features of Python 3.6: f-strings.

They can evaluate expressions,

>>> eval('f"{().__class__.__base__}"', {'__builtins__': None}, {})"<class 'object'>"

but the attribute access won't be detected by Python's tokenizer:

0,0-0,0:            ENCODING       'utf-8'
1,0-1,1:            ERRORTOKEN     "'"
1,1-1,27:           STRING         'f"{().__class__.__base__}"'
2,0-2,0:            ENDMARKER      ''
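
As a side note (my own sketch, not part of the original answer), this is what a token-level filter actually sees: on CPython versions before 3.12 the whole f-string arrives as a single STRING token, so a scan for '.' operators or dunder names never fires (PEP 701 changed f-string tokenization in 3.12):

import io
import tokenize

# Tokenize the f-string payload the way a naive sanitizer might.
# Before Python 3.12: one STRING token, the attribute access is invisible.
# Since Python 3.12 (PEP 701): FSTRING_START/NAME/OP tokens do appear.
src = 'f"{().__class__.__base__}"'
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    print(tokenize.tok_name[tok.type], repr(tok.string))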


It is possible to construct a return value from eval that will throw an exception outside eval if you try to print, log, or repr it in any way:

eval('''((lambda f: (lambda x: x(x))(lambda y: f(lambda *args: y(y)(*args))))
        (lambda f: lambda n: (1,(1,(1,(1,f(n-1))))) if n else 1)(300))''')

This creates a nested tuple of the form (1,(1,(1,(1,...; that value cannot be printed (on Python 3), str'd or repr'd; all attempts to debug it lead to

RuntimeError: maximum recursion depth exceeded while getting the repr of a tuple

pprint and saferepr fail too:

...  File "/usr/lib/python3.4/pprint.py", line 390, in _safe_repr    orepr, oreadable, orecur = _safe_repr(o, context, maxlevels, level)  File "/usr/lib/python3.4/pprint.py", line 340, in _safe_repr    if issubclass(typ, dict) and r is dict.__repr__:RuntimeError: maximum recursion depth exceeded while calling a Python object

Thus there is no safe built-in function to stringify this; a helper such as the following could be of use:

def excsafe_repr(obj):
    try:
        return repr(obj)
    except:
        return object.__repr__(obj).replace('>', ' [exception raised]>')
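
For illustration, a hedged usage sketch: the nested value is built iteratively here instead of with the combinator above, and the depth of 10000 is an arbitrary choice that is well past the default recursion limit.

# Build a deeply nested tuple similar to the eval payload above.
value = 1
for _ in range(10000):
    value = (1, value)

# repr(value) raises RecursionError here (RuntimeError on older 3.x),
# while excsafe_repr() falls back to object.__repr__ and stays printable:
print(excsafe_repr(value))
# e.g. <tuple object at 0x7f1234567890 [exception raised]>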

And then there is the problem that the print statement in Python 2 does not actually go through str/repr, so its lack of recursion checks gives you no safety at all. That is, you cannot str or repr the return value of the lambda monster above, yet the ordinary print statement (not print_function!) prints it just fine. However, you can exploit this to generate a SIGSEGV in Python 2 if you know the value will be printed with the print statement:

print eval('(lambda i: [i for i in ((i, 1) for j in range(1000000))][-1])(1)')

crashes Python 2 with SIGSEGV. This is marked WONTFIX in the bug tracker. So never use the print statement if you want to be safe; use from __future__ import print_function instead!


This is not a crash, but

eval('(1,' * 100 + ')' * 100)

when run, outputs

s_push: parser stack overflow
Traceback (most recent call last):
  File "yyy.py", line 1, in <module>
    eval('(1,' * 100 + ')' * 100)
MemoryError

The MemoryError can be caught; it is a subclass of Exception. The parser has some really conservative limits to avoid crashes from stack overflows (pun intended). However, s_push: parser stack overflow is written to stderr by C code and cannot be suppressed.
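
A hedged sketch of what catching it looks like; the MemoryError behaviour above comes from the old pgen parser (pre-3.9 CPython), and newer interpreters with the PEG parser may simply accept this nesting depth:

# On interpreters affected as described above, the except branch runs and
# "s_push: parser stack overflow" still lands on stderr from C code.
try:
    result = eval('(1,' * 100 + ')' * 100)
except MemoryError:
    print("caught MemoryError from the parser")
else:
    print("this interpreter parsed the expression without complaint")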


And just yesterday I asked why Python 3.4 isn't being fixed for a crash caused by

% python3
Python 3.4.3 (default, Mar 26 2015, 22:03:40)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class A:
...     def f(self):
...         nonlocal __x
...
[4]    19173 segmentation fault (core dumped)  python3

and Serhiy Storchaka's answer confirmed that Python core devs do not consider SIGSEGV on seemingly well-formed code a security issue:

Only security fixes are accepted for 3.4.

Thus it can be concluded that it can never be considered safe to execute any third-party code in Python, sanitized or not.

And Nick Coghlan then added:

And as some additional background as to why segmentation faults provoked by Python code aren't currently considered a security bug: since CPython doesn't include a security sandbox, we're already relying entirely on the OS to provide process isolation. That OS level security boundary isn't affected by whether the code is running "normally", or in a modified state following a deliberately triggered segmentation fault.


Users can still DoS you by inputting an expression that evaluates to a huge number, which would fill your memory and crash the Python process, for example

'10**10**100'
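
To make the scale concrete, here is a hedged illustration with a deliberately smaller exponent so that it terminates; the point is that the restricted eval() from the question computes the value anyway:

import sys

# The same restricted environment still happily builds a million-digit
# integer; scaling the exponent toward 10**100 turns this into a pure
# CPU/memory DoS.
n = eval('10**10**6', {'__builtins__': {}}, {})
print(sys.getsizeof(n), "bytes for a single int object")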

I am definitely still curious if more traditional attacks, like recovering builtins or creating a segfault, are possible here.

EDIT:

It turns out that even Python's compiler has this issue:

lambda: 10**10**100

will hang, because the compiler's constant folding tries to precompute the value.
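
As a hedged illustration with a harmless exponent, you can watch this constant folding happen in the compiled bytecode (exact output differs between CPython versions):

import dis

# The compiler folds 10**4 into the constant 10000 inside the lambda's code
# object; 10**10**100 triggers the same precomputation, which is why it
# hangs on affected versions.
code = compile('lambda: 10**4', '<example>', 'eval')
dis.dis(code)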