conda environment to AWS Lambda conda environment to AWS Lambda python python

conda environment to AWS Lambda


You should use the serverless framework in combination with the serverless-python-requirements plugin. You just need a requirements.txt and the plugin automatically packages your code and the dependencies in a zip-file, uploads everything to s3 and deploys your function. Bonus: Since it can do this dockerized, it is also able to help you with packages that need binary dependencies.

Have a look here (https://serverless.com/blog/serverless-python-packaging/) for a how-to.

From experience I strongly recommend you look into that. Every bit of manual labour for deployment and such is something that keeps you from developing your logic.

Edit 2017-12-17:

Your comment makes sense @eelco-hoogendoorn.

However, in my mind a conda environment is just an encapsulated place where a bunch of python packages live. So, if you would put all these dependencies (from your conda env) into a requirements.txt (and use serverless + plugin) that would solve your problem, no?
IMHO it would essentially be the same as zipping all the packages you installed in your env into your deployment package. That being said, here is a snippet, which does essentially this:

conda env export --name Name_of_your_Conda_env | yq -r '.dependencies[] | .. | select(type == "string")' | sed -E "s/(^[^=]*)(=+)([0-9.]+)(=.*|$)/\1==\3/" > requirements.txt

Unfortunately conda env export only exports the environment in yaml format. The --json flag doesn't work right now, but is supposed to be fixed in the next release. That is why I had to use yq instead of jq. You can install yq using pip install yq. It is just a wrapper around jq to allow it to also work with yaml files.

KEEP IN MIND

Lambda deployment code can only be 50MB in size. Your environment shouldn't be too big.

I have not tried deploying a lambda with serverless + serverless-python-packaging and a requirements.txt created like that and I don't know if it will work.


The main reason why I use conda is an option not to compile different binary packages myself (like numpy, matplotlib, pyqt, etc.) or compile them less frequently. When you do need to compile something yourself for the specific version of python (like uwsgi), you should compile the binaries with the same gcc version that the python within your conda environment is compiled with - most probably it is not the same gcc that your OS is using, since conda is now using the latest versions of the gcc that should be installed with conda install gxx_linux-64.

This leads us to two situations:

  1. All you dependencies are in pure python and you can actually save a list of list of them using pip freeze and bundle them as it is stated for virtualenv.

  2. You have some binary extensions. In that case, the the binaries from your conda environment will not work with the python used by AWS lambda. Unfortunately, you will need to visit the page describing the execution environment (AMI: amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2), set up the environment, build the binaries for the specific version of built-in python in a separate directory (as well as pure python packages), and then bundle them into a zip-archive.

This is a general answer to your question, but the main idea is that you can not reuse your binary packages, only a list of them.


I can't think of a good reason why zipping up your conda environment wouldn't work.

I thik you can go into your anaconda2/envs/ or anaconda3/envs/ directory and copy/zip the env-directory you want to upload. Conda is just a souped-up version of a virtualenv, plus a different & somewhat optional package-manager. The big reason I think it's ok is that conda environments encapsulate all their dependencies within their particular .../anaconda[2|3]/envs/$VIRTUAL_ENV_DIR/ directories by default.

Using the normal virtualenv expression gives you a bit more freedom, in sort of the same way that cavemen had more freedom than modern people. Personally I prefer cars. With virtualenv you basically get a semi-empty $PYTHON_PATH variable that you can fill with whatever you want, rather than the more robust, pre-populated env that Conda spits out. The following is a good table for reference: https://conda.io/docs/commands.html#conda-vs-pip-vs-virtualenv-commands

Conda turns the command ~$ /path/to/$VIRTUAL_ENV_ROOT_DIR/bin/activate into ~$ source activate $VIRTUAL_ENV_NAME

Say you want to make a virtualenv the old fashioned way. You'd choose a directory (let's call it $VIRTUAL_ENV_ROOT_DIR,) & name (which we'll call $VIRTUAL_ENV_NAME.) At this point you would type:

~$ cd $VIRTUAL_ENV_ROOT_DIR && virtualenv $VIRTUAL_ENV_NAME

python then creates a copy of it's own interpreter library (plus pip and setuptools I think) & places an executable called activate in this clone's bin/ directory. The $VIRTUAL_ENV_ROOT_DIR/bin/activate script works by changing your current $PYTHONPATH environment variable, which determines what python interpreter gets called when you type ~$ python into the shell, & also the list of directories containing all modules which the interpreter will see when it is told to import something. This is the primary reason you'll see #!/usr/bin/env python in people's code instead of /usr/bin/python.