
Pandas in AWS lambda gives numpy error


EDIT: I finally figured out how to run pandas and numpy in an AWS Lambda Python 3.6 runtime environment.

I have uploaded my deployment package to the following repo:

git clone https://github.com/pbegle/aws-lambda-py3.6-pandas-numpy.git

Simply add your lambda_function.py to the zip file by running:

zip -ur lambda.zip lambda_function.py

Upload the zip to S3 and point your Lambda function at it.
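For a quick check that the package actually works once deployed, a minimal lambda_function.py along these lines could be used. This is only a sketch: the handler name lambda_handler is the Lambda console default, and the shape of the returned report is my own assumption, not something from the repo.

```python
import importlib


def lambda_handler(event, context):
    """Report which of the bundled libraries import cleanly."""
    report = {}
    for name in ("numpy", "pandas"):
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:
            # A missing package or a binary built for the wrong
            # platform usually surfaces here at import time.
            report[name] = "import failed: {}".format(exc)
    return report
```

Running a test event against it should return a version string for each library, or the import error text if the binaries do not match the runtime.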

ORIGINAL:

The only way I have gotten pandas to work in a Lambda function is by compiling the pandas (and numpy) libraries on an Amazon Linux EC2 instance, following the steps from this blog post, and then using the Python 2.7 runtime for my Lambda function.


After doing a lot of research I was able to make it work with Lambda layers.

Create or open a clean directory and follow the steps below:

Prerequisites: Make sure you have Docker up and running

  1. Create a requirements.txt file with the following:

pandas==0.23.4
pytz==2018.7

  2. Create a get_layer_packages.sh file with the following:

#!/bin/bash
export PKG_DIR="python"
rm -rf ${PKG_DIR} && mkdir -p ${PKG_DIR}
docker run --rm -v $(pwd):/foo -w /foo lambci/lambda:build-python3.6 \
    pip install -r requirements.txt --no-deps -t ${PKG_DIR}

  3. Run the following commands in the same directory:

chmod +x get_layer_packages.sh
./get_layer_packages.sh
zip -r pandas.zip .

  4. Upload the layer to an S3 bucket.

  5. Publish the layer to AWS by running the command below:

aws lambda publish-layer-version --layer-name pandas-layer --description "Description of your layer" --content S3Bucket=<bucket name>,S3Key=<layer-name>.zip --compatible-runtimes python3.6 python3.7

  6. Go to the Lambda console and upload your code as a zip file or use the inline editor.

  7. Click on Layers > Add a layer > search for the layer (pandas-layer) under Compatible layers and select the version.

  8. Also add the AWSLambda-Python36-SciPy1x layer, which is available by default, for importing numpy.

Selecting the layer from the console

  9. Test the code. It should work now!
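If you would rather build the layer zip from Python than with the zip command, something like the sketch below should do the same job. The helper name build_layer_zip is hypothetical. The one thing that matters is that the packages end up under a top-level python/ folder inside the archive, since Lambda extracts layers to /opt and adds /opt/python to the import path for Python runtimes.

```python
import os
import zipfile


def build_layer_zip(pkg_dir, zip_path):
    """Zip pkg_dir so that every entry sits under a top-level python/ folder.

    zip_path should live outside pkg_dir, otherwise the archive
    would try to include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(pkg_dir):
            for filename in files:
                full_path = os.path.join(root, filename)
                relative = os.path.relpath(full_path, pkg_dir)
                # Zip entries always use forward slashes.
                arcname = "python/" + relative.replace(os.sep, "/")
                archive.write(full_path, arcname)
    return zip_path
```

With the directory layout from the steps above, the call would be build_layer_zip("python", "pandas.zip"). Unlike zip -r pandas.zip ., this leaves requirements.txt and the shell script out of the archive.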

Thanks to this Medium article: https://medium.com/@qtangs/creating-new-aws-lambda-layer-for-python-pandas-library-348b126e9f3e


To include numpy in your Lambda zip, follow the instructions on this page in the AWS docs:

How do I add Python packages with compiled binaries to my deployment package and make the package compatible with AWS Lambda?

To paraphrase the instructions using numpy as an example:

  1. Open the module page at pypi.org: https://pypi.org/project/numpy/

  2. Choose Download files.

  3. Download:

For Python 2.7, module-name-version-cp27-cp27mu-manylinux1_x86_64.whl

e.g. numpy-1.15.2-cp27-cp27mu-manylinux1_x86_64.whl

For Python 3.6, module-name-version-cp36-cp36m-manylinux1_x86_64.whl

e.g. numpy-1.15.2-cp36-cp36m-manylinux1_x86_64.whl

  4. Uncompress the wheel file in the /path/to/project-dir folder. You can use the unzip command on the command line to do this, though there are other ways.

unzip numpy-1.15.2-cp36-cp36m-manylinux1_x86_64.whl

When the wheel file is uncompressed, your deployment package will be compatible with Lambda.
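One of those other ways is plain Python, since a wheel is just a zip archive. The sketch below checks the tags in the filename before extracting, so a wheel for the wrong Python version or platform gets rejected early. The function names wheel_tags and unpack_wheel and the default of cp36 are my own choices for illustration.

```python
import os
import zipfile


def wheel_tags(filename):
    """Split a wheel filename into its python, ABI and platform tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: {}".format(filename))
    stem = filename[:-len(".whl")]
    # The last three dash-separated fields are always the tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def unpack_wheel(wheel_path, project_dir, expected_python_tag="cp36"):
    """Extract a wheel into project_dir after checking its tags."""
    name = os.path.basename(wheel_path)
    python_tag, _abi_tag, platform_tag = wheel_tags(name)
    is_linux_x86_64 = (platform_tag.startswith("manylinux")
                       and platform_tag.endswith("x86_64"))
    if python_tag != expected_python_tag or not is_linux_x86_64:
        raise ValueError(
            "{} does not target {} on 64-bit Linux".format(
                name, expected_python_tag))
    with zipfile.ZipFile(wheel_path) as wheel:
        wheel.extractall(project_dir)
```

For the example above that would be unpack_wheel("numpy-1.15.2-cp36-cp36m-manylinux1_x86_64.whl", "/path/to/project-dir"). The .dist-info folder gets extracted too, which is harmless.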

Hope that all makes sense ;)

The end result might look something like this. Note: you should not include the .whl file in the deployment package.

What it might look like