AWS Lambda Container Running Selenium With Headless Chrome Works Locally But Not In AWS Lambda


This works well with Python 3.6. I have a bin directory containing chromedriver v2.41 (https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip) and headless-chrome v68.0.3440.84 (https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-53/stable-headless-chromium-amazonlinux-2017-03.zip).

Below is my Dockerfile, where I copy chromedriver and headless-chrome from the source bin directory to the destination bin directory. The reason for the destination bin directory is explained below.

FROM public.ecr.aws/lambda/python:3.6
COPY app.py ${LAMBDA_TASK_ROOT}
COPY requirements.txt ${LAMBDA_TASK_ROOT}
RUN --mount=type=cache,target=/root/.cache/pip python3.6 -m pip install --upgrade pip
RUN --mount=type=cache,target=/root/.cache/pip python3.6 -m pip install -r requirements.txt
RUN mkdir bin
ADD bin bin/
CMD [ "app.handler" ]

In my Python script, I copy the files from the bin directory in the Docker container to /tmp/bin and give them 775 permissions, because /tmp is the only directory we can write to when the function runs in AWS Lambda (Amazon Linux 2).

import logging
import os
import shutil

logger = logging.getLogger()
logger.setLevel(logging.INFO)

BIN_DIR = "/tmp/bin"
CURR_BIN_DIR = os.getcwd() + "/bin"


def _init_bin(executable_name):
    # Create the writable destination directory on first use.
    if not os.path.exists(BIN_DIR):
        logger.info("Creating bin folder")
        os.makedirs(BIN_DIR)
    logger.info("Copying binaries for " + executable_name + " in /tmp/bin")
    currfile = os.path.join(CURR_BIN_DIR, executable_name)
    newfile = os.path.join(BIN_DIR, executable_name)
    shutil.copy2(currfile, newfile)
    logger.info("Giving new binaries permissions for lambda")
    os.chmod(newfile, 0o775)
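The copy step above runs on every invocation. As a minimal sketch, assuming the same copy-then-chmod approach, the variant below takes the directories as parameters so it can be tried outside Lambda, and it skips the copy when an executable copy is already in place (for example on a warm invocation). The name stage_binary and its return value are illustrative and not part of the original answer.

```python
import os
import shutil


def stage_binary(src_dir, dst_dir, name, mode=0o775):
    """Copy src_dir/name to dst_dir/name and make it executable.

    Returns True if a copy was made, False if an executable copy
    was already present in dst_dir.
    """
    src = os.path.join(src_dir, name)
    dst = os.path.join(dst_dir, name)
    # Skip the work if a previous invocation already staged the file.
    if os.path.isfile(dst) and os.access(dst, os.X_OK):
        return False
    os.makedirs(dst_dir, exist_ok=True)
    shutil.copy2(src, dst)
    os.chmod(dst, mode)
    return True
```

In the handler this would be called as stage_binary(os.getcwd() + "/bin", "/tmp/bin", "chromedriver"), and likewise for headless-chromium.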

In the handler function, use the options below to avoid several exceptions raised by chromedriver.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options


def handler(event, context):
    # Stage both binaries in /tmp/bin before starting the browser.
    _init_bin("headless-chromium")
    _init_bin("chromedriver")
    options = Options()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--disable-gpu-sandbox")
    options.add_argument("--single-process")
    # Chrome expects the window size as width,height.
    options.add_argument("--window-size=1920,1080")
    options.add_argument(
        "--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36")
    options.binary_location = "/tmp/bin/headless-chromium"
    browser = webdriver.Chrome(
        "/tmp/bin/chromedriver", options=options)
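One way to keep these flags identical between a local run and the Lambda run is to build them in a single place. The sketch below is only an illustration: the function name chrome_arguments and the constant DEFAULT_USER_AGENT are made up for this example, while the flags themselves are the ones listed above. It writes the window size in the width,height form that Chrome's --window-size switch expects, and passes the user agent without embedded quote characters.

```python
# User-agent string taken from the handler above.
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/76.0.3809.100 Safari/537.36"
)


def chrome_arguments(width=1920, height=1080, user_agent=DEFAULT_USER_AGENT):
    """Return the Chrome command-line switches used by the handler."""
    return [
        "--headless",
        "--disable-gpu",
        "--no-sandbox",
        "--disable-dev-shm-usage",
        "--disable-gpu-sandbox",
        "--single-process",
        "--window-size={},{}".format(width, height),
        "--user-agent={}".format(user_agent),
    ]
```

Inside the handler, each entry would be passed to options.add_argument(...) in a loop before creating the browser.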


I was able to run it with Python 3.7 and 3.8 in AWS Lambda. You need to install specific versions of each component. I have not yet figured out how to run the latest Chrome, though. Please see my repository.

https://github.com/umihico/docker-selenium-lambda/

So far, the latest versions I have found to work are listed below.

  • Python 3.8 (you need to install the dependencies in the Docker image)
  • serverless-chrome v1.0.0-37
  • chromedriver 2.37
  • selenium 3.141.0 (latest)


The solution from Sandeep Kumar works (I tried to upvote it, but the vote does not count because I am a new user).

This is the minimal setup for running Selenium in a container-based Lambda.

  1. Download the binary files Sandeep mentioned (chromedriver v2.41 and headless-chrome v68.0.3440.84) and copy them into the bin folder.

  2. requirements.txt

selenium==3.14.0

  3. Dockerfile (note: Python 3.8 does not work)

FROM public.ecr.aws/lambda/python:3.6
COPY app.py ${LAMBDA_TASK_ROOT}
COPY requirements.txt ${LAMBDA_TASK_ROOT}
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
RUN mkdir bin
ADD bin /bin/
RUN chmod 755 /bin/chromedriver
CMD [ "app.handler" ]

  4. app.py

from selenium import webdriver


def handler(event, context):
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--disable-gpu-sandbox")
    options.add_argument("--single-process")
    # Chrome expects the window size as width,height.
    options.add_argument("--window-size=1920,1080")
    options.add_argument(
        "--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36")
    options.binary_location = "/bin/headless-chromium"
    browser = webdriver.Chrome(
        executable_path="/bin/chromedriver", options=options)
    browser.get("https://feng.lu")
    print(browser.title)
    browser.quit()

  5. The default IAM permissions in AWS Lambda are sufficient.

Note: I did not copy the contents into the /tmp/bin folder as Sandeep did. I use the /bin folder directly and set the permissions with chmod inside the Dockerfile.
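Because this variant relies on the image already containing executable binaries, a quick check at the start of the handler can turn a vague chromedriver startup error into a clear message. This is a sketch under that assumption: the helper names are hypothetical, and the default paths are the ones used in the Dockerfile and app.py above.

```python
import os


def missing_or_not_executable(paths):
    """Return the paths that are missing or lack execute permission."""
    return [
        p for p in paths
        if not (os.path.isfile(p) and os.access(p, os.X_OK))
    ]


def check_binaries(paths=("/bin/chromedriver", "/bin/headless-chromium")):
    """Raise a descriptive error if any required binary is unusable."""
    bad = missing_or_not_executable(paths)
    if bad:
        raise RuntimeError(
            "Missing or non-executable binaries: " + ", ".join(bad))
```

Calling check_binaries() as the first line of handler would fail fast if, for example, the chmod step was left out of the Dockerfile for one of the files.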