Lambda not supporting NLTK file size


There are two things that you can do:

  1. The error looks like the NLTK data path is not being defined properly. You can add the directory to NLTK's data search path in code:

nltk.data.path.append(os.path.abspath('/var/task/nltk_data/'))

or set it through an environment variable:

    1. Run nltk.download(), then copy the downloaded data to the root folder of your AWS Lambda application. (Name the directory "nltk_data".)

    2. In the Lambda function dashboard (in the AWS console), add NLTK_DATA=./nltk_data as a key-value environment variable.
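If you would rather not rely on a relative path in the console, the same variable can be set from code. This is only a sketch: the helper names `resolve_nltk_data_dir` and `configure_nltk_data` are made up for illustration, and it assumes the data was bundled as `nltk_data` at the root of the deployment package. `LAMBDA_TASK_ROOT` is the variable the Lambda runtime sets to the code directory (normally `/var/task`).

```python
import os


def resolve_nltk_data_dir(task_root=None):
    """Return the absolute path of the bundled nltk_data directory.

    task_root is a hypothetical override for local testing; on Lambda the
    runtime provides LAMBDA_TASK_ROOT, and we fall back to the current
    working directory elsewhere.
    """
    root = task_root or os.environ.get("LAMBDA_TASK_ROOT", os.getcwd())
    return os.path.join(os.path.abspath(root), "nltk_data")


def configure_nltk_data(task_root=None):
    """Point NLTK_DATA at the bundled directory and return that path.

    NLTK reads NLTK_DATA when it is first imported, so call this
    before any `import nltk` statement runs.
    """
    data_dir = resolve_nltk_data_dir(task_root)
    os.environ["NLTK_DATA"] = data_dir
    return data_dir
```

Call `configure_nltk_data()` at the top of the handler module, before `import nltk`, so the bundled directory is on NLTK's search path when the module loads.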


  2. Reduce the size of the NLTK downloads, since you won't need all of them.

    1. Delete all the zip files and keep only the sections you need. For example, if you only need stopwords, keep nltk_data/corpora/stopwords and delete the rest.

    2. Or, if you need tokenizers, keep nltk_data/tokenizers/punkt. Most of these packages can be downloaded separately, for example with python -m nltk.downloader punkt, and the files then copied over.
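The cleanup described above can also be scripted as a build step. The following is a rough sketch, not part of NLTK: `prune_nltk_data` is a hypothetical helper, and it assumes the usual `nltk_data/<category>/<package>` layout. It deletes the zip archives and every package directory that is not explicitly listed, so run it on a copy of the data first.

```python
import shutil
from pathlib import Path


def prune_nltk_data(data_dir, keep):
    """Delete zip archives and every package directory not listed in keep.

    keep holds paths relative to data_dir, e.g. "corpora/stopwords".
    Returns the relative paths that were removed.
    """
    root = Path(data_dir)
    keep = {Path(k) for k in keep}
    removed = []
    for category in sorted(p for p in root.iterdir() if p.is_dir()):
        for entry in sorted(category.iterdir()):
            rel = entry.relative_to(root)
            if entry.is_dir() and rel in keep:
                continue  # an unpacked package we want to ship
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()  # zip archives and other loose files
            removed.append(rel.as_posix())
        # Drop the category folder itself if nothing in it was kept
        if not any(category.iterdir()):
            category.rmdir()
    return removed
```

For example, `prune_nltk_data("nltk_data", ["corpora/stopwords", "tokenizers/punkt"])` would leave only those two packages in the folder that gets copied into the Lambda bundle.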