AWS Lambda Container Running Selenium With Headless Chrome Works Locally But Not In AWS Lambda
Python v3.6 works great. I have a bin directory with chromedriver v2.41 (https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip) and headless-chrome v68.0.3440.84 (https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-53/stable-headless-chromium-amazonlinux-2017-03.zip).
Below is my Dockerfile, which copies chromedriver and headless-chrome from the source bin directory to the destination bin directory in the image. The reason for having the destination bin directory is explained below.
    FROM public.ecr.aws/lambda/python:3.6
    COPY app.py ${LAMBDA_TASK_ROOT}
    COPY requirements.txt ${LAMBDA_TASK_ROOT}
    RUN --mount=type=cache,target=/root/.cache/pip python3.6 -m pip install --upgrade pip
    RUN --mount=type=cache,target=/root/.cache/pip python3.6 -m pip install -r requirements.txt
    RUN mkdir bin
    ADD bin bin/
    CMD [ "app.handler" ]
In my Python script, I copy the files from the bin directory (in the Docker container) to the /tmp/bin directory (on Amazon Linux 2) with 775 permissions, because /tmp is the only directory where files can be written on Amazon Linux 2, which is where the Lambda is executed.
    import logging
    import os
    import shutil

    logger = logging.getLogger()

    BIN_DIR = "/tmp/bin"
    CURR_BIN_DIR = os.getcwd() + "/bin"


    def _init_bin(executable_name):
        # Create /tmp/bin on first use; /tmp is the writable location in Lambda.
        if not os.path.exists(BIN_DIR):
            logger.info("Creating bin folder")
            os.makedirs(BIN_DIR)
        logger.info("Copying binaries for " + executable_name + " in /tmp/bin")
        currfile = os.path.join(CURR_BIN_DIR, executable_name)
        newfile = os.path.join(BIN_DIR, executable_name)
        shutil.copy2(currfile, newfile)
        logger.info("Giving new binaries permissions for lambda")
        os.chmod(newfile, 0o775)
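To check the copy-and-chmod step without deploying, the same logic can be exercised against temporary directories. This is only a sketch: `copy_executable` is a hypothetical stand-in for `_init_bin` that takes the source and destination directories as parameters, and the file it copies is a dummy rather than a real binary.

```python
import os
import shutil
import stat
import tempfile


def copy_executable(src_dir, dst_dir, name):
    """Copy src_dir/name to dst_dir/name and set its mode to 0o775."""
    os.makedirs(dst_dir, exist_ok=True)
    target = os.path.join(dst_dir, name)
    shutil.copy2(os.path.join(src_dir, name), target)
    os.chmod(target, 0o775)
    return target


def demo():
    """Run the copy against throwaway directories and return the resulting mode bits."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as tmp:
        # Dummy file standing in for chromedriver (assumption: any file will do here).
        with open(os.path.join(src, "chromedriver"), "w") as handle:
            handle.write("#!/bin/sh\n")
        target = copy_executable(src, os.path.join(tmp, "bin"), "chromedriver")
        return stat.S_IMODE(os.stat(target).st_mode)


if __name__ == "__main__":
    print(oct(demo()))
```

Passing the directories in as arguments, instead of reading module-level constants, is what makes the function easy to try outside Lambda.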
In the handler function, use the options below to avoid a few exceptions raised by ChromeDriver.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options


    def handler(event, context):
        # Copy both binaries to /tmp/bin before starting the browser.
        _init_bin("headless-chromium")
        _init_bin("chromedriver")
        options = Options()
        options.add_argument("--headless")
        options.add_argument("--disable-gpu")
        options.add_argument("--no-sandbox")
        options.add_argument('--disable-dev-shm-usage')
        options.add_argument('--disable-gpu-sandbox')
        options.add_argument("--single-process")
        options.add_argument('window-size=1920x1080')
        options.add_argument(
            '"user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"')
        options.binary_location = "/tmp/bin/headless-chromium"
        browser = webdriver.Chrome(
            "/tmp/bin/chromedriver", options=options)
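For reference, the switches used in the handler can be kept in one annotated list. This is a sketch, not part of the original answer: `CHROME_FLAGS` and `apply_flags` are hypothetical names, and the comments reflect my understanding of what each Chromium switch does rather than anything stated in the answer.

```python
# Switches taken from the handler, with brief notes (the notes are my reading).
CHROME_FLAGS = [
    "--headless",               # run without a display
    "--disable-gpu",            # do not use GPU hardware acceleration
    "--no-sandbox",             # skip the Chromium sandbox
    "--disable-dev-shm-usage",  # write shared memory files to /tmp, not /dev/shm
    "--disable-gpu-sandbox",    # skip the sandbox for the GPU process
    "--single-process",         # run the browser and renderer in one process
]


def apply_flags(options, flags=CHROME_FLAGS):
    """Call options.add_argument for each flag and return the same options object."""
    for flag in flags:
        options.add_argument(flag)
    return options
```

With Selenium installed, `apply_flags(Options())` would produce an options object carrying the same switches as the handler; the window size and user agent arguments are left out here and would still need to be added separately.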
I was able to run it with Python 3.7 and 3.8 in AWS Lambda. You need to install specific versions. I have not yet figured out how to run the latest Chrome, though. Please see my repository.
https://github.com/umihico/docker-selenium-lambda/
So far, the latest versions I could find are below.
- Python 3.8 (you need to install the dependencies in the Docker image)
- serverless-chrome v1.0.0-37
- chromedriver 2.37
- selenium 3.141.0 (latest)
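Since specific versions are required, it can help to log which versions actually ended up in the image. A minimal sketch, assuming both binaries print a version string when run with `--version`; `parse_version` and `binary_version` are hypothetical helper names.

```python
import re
import subprocess


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def binary_version(path):
    """Run `path --version` and parse its output (assumes the binary supports the flag)."""
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_version(result.stdout)
```

Calling `binary_version` on the chromedriver and headless-chromium paths at the start of the handler and logging the results makes a version mismatch visible in the Lambda logs.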
The solution from Sandeep Kumar works (I upvoted it, but the vote does not count because I am a new user).
This is the minimal setup for running Selenium in a container-based Lambda.
- Download the binary files Sandeep mentioned (chromedriver v2.41 and headless-chrome v68.0.3440.84) and copy them into the bin folder
- requirements.txt
    selenium==3.14.0
- Dockerfile (note: Python 3.8 does not work)
    FROM public.ecr.aws/lambda/python:3.6
    COPY app.py ${LAMBDA_TASK_ROOT}
    COPY requirements.txt ${LAMBDA_TASK_ROOT}
    RUN pip install --upgrade pip
    RUN pip install -r requirements.txt
    RUN mkdir bin
    ADD bin /bin/
    RUN chmod 755 /bin/chromedriver
    CMD [ "app.handler" ]
- app.py
    from selenium import webdriver


    def handler(event, context):
        options = webdriver.ChromeOptions()
        options.add_argument("--headless")
        options.add_argument("--disable-gpu")
        options.add_argument("--no-sandbox")
        options.add_argument('--disable-dev-shm-usage')
        options.add_argument('--disable-gpu-sandbox')
        options.add_argument("--single-process")
        options.add_argument('window-size=1920x1080')
        options.add_argument(
            '"user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"')
        options.binary_location = "/bin/headless-chromium"
        browser = webdriver.Chrome(
            executable_path="/bin/chromedriver", options=options)
        browser.get("https://feng.lu")
        print(browser.title)
        browser.quit()
- The default IAM permissions in AWS Lambda are sufficient
Note: I did not copy the files into the /tmp/bin folder as Sandeep did. I just use the /bin folder and set the permissions with chmod inside the Dockerfile.
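Since this variant relies on permissions set at build time, a small check before starting the driver can turn a vague startup failure into a clear message. A sketch under that assumption: `missing_or_not_executable` is a hypothetical helper, and the paths are the ones used in app.py.

```python
import os


def missing_or_not_executable(paths):
    """Return a list of problems for the given binary paths (empty if all are usable)."""
    problems = []
    for path in paths:
        if not os.path.isfile(path):
            problems.append(path + ": not found")
        elif not os.access(path, os.X_OK):
            problems.append(path + ": not executable")
    return problems


if __name__ == "__main__":
    # Paths as used in app.py; adjust if the binaries live elsewhere.
    for line in missing_or_not_executable(["/bin/chromedriver", "/bin/headless-chromium"]):
        print(line)
```

Running the check at the top of the handler and logging any problems it returns makes it easier to tell a missing binary apart from a permissions issue.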