
Chrome --headless for AWS Lambda?


Yes; it's possible.

Compiling a non-debug build of headless Chrome yields a binary that's ~125 MB, and just under 44 MB when gzipped. That fits within Lambda's deployment package limits of 250 MB uncompressed and 50 MB compressed.
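As a quick sanity check on that arithmetic, here is a minimal sketch that compares a pair of sizes against the two limits quoted above. The function name `fits_lambda_limits` is made up for illustration, and the limits are hardcoded from this answer, so check them against the current Lambda quotas before relying on them.

```shell
#!/usr/bin/env bash
# Lambda deployment package limits as quoted in this answer, in MB.
MAX_UNZIPPED_MB=250
MAX_ZIPPED_MB=50

# fits_lambda_limits UNZIPPED_MB ZIPPED_MB  (hypothetical helper)
# Succeeds (exit status 0) only if both sizes are within the limits.
fits_lambda_limits() {
  local unzipped_mb=$1 zipped_mb=$2
  [ "$unzipped_mb" -le "$MAX_UNZIPPED_MB" ] && [ "$zipped_mb" -le "$MAX_ZIPPED_MB" ]
}

# Sizes from this answer: ~125 MB raw, just under 44 MB gzipped.
if fits_lambda_limits 125 44; then
  echo "fits"
else
  echo "too big"
fi
```

The compressed limit is the tight one here: there is only about 6 MB of headroom left for your own function code.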

What's (currently) required is to compile Chrome so that it doesn't use shared memory at /dev/shm. There's a thread on the topic on the headless-dev Google group here.

Here are steps I've used to build a binary of headless Chrome that will work on AWS Lambda. They're based on this and this.

  1. Create a new EC2 instance using the community AMI with name amzn-ami-hvm-2016.03.3.x86_64-gp2 (us-west-2 ami-7172b611).
  2. Pick an Instance Type with at least 16 GB of memory. Compiling takes about 4-5 hours on a t2.xlarge, 2-3ish on a t2.2xlarge, or about 45 min on a c4.4xlarge.
  3. Give yourself a Root Volume that's at least 30 GB (40 GB if you want to compile a debug build, which you won't be able to upload to Lambda because it's too big).
  4. SSH into the new instance and run:
printf "LANG=en_US.utf-8\nLC_ALL=en_US.utf-8\n" | sudo tee -a /etc/environment
sudo yum install -y git redhat-lsb python bzip2 tar pkgconfig atk-devel \
  alsa-lib-devel bison binutils brlapi-devel bluez-libs-devel bzip2-devel \
  cairo-devel cups-devel dbus-devel dbus-glib-devel expat-devel \
  fontconfig-devel freetype-devel gcc-c++ GConf2-devel glib2-devel \
  glibc.i686 gperf glib2-devel gtk2-devel gtk3-devel \
  java-1.*.0-openjdk-devel libatomic libcap-devel libffi-devel libgcc.i686 \
  libgnome-keyring-devel libjpeg-devel libstdc++.i686 libX11-devel \
  libXScrnSaver-devel libXtst-devel libxkbcommon-x11-devel \
  ncurses-compat-libs nspr-devel nss-devel pam-devel pango-devel \
  pciutils-devel pulseaudio-libs-devel zlib.i686 httpd mod_ssl php php-cli \
  python-psutil wdiff --enablerepo=epel

Yum will complain about some packages not existing. Whatever. I haven't looked into them, and it didn't seem to stop me from building headless_shell. Ignore whiny little Yum and move on. Next:

git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
echo "export PATH=$PATH:$HOME/depot_tools" >> ~/.bash_profile
source ~/.bash_profile
mkdir Chromium && cd Chromium
fetch --no-history chromium
cd src

At this point we need to make a very small change to the Chrome code. By default on Linux, Chrome assumes there is a tmpfs at /dev/shm. There is no tmpfs available to a Lambda function. :-(

The file we have to change is src/base/files/file_util_posix.cc. Modify GetShmemTempDir() such that it always returns the OS's temp dir (/tmp). A simple way to do this is to just remove the entire #if defined(OS_LINUX) block in the GetShmemTempDir() function. A less drastic change is to hardcode use_dev_shm to false:

bool GetShmemTempDir(bool executable, FilePath* path) {
#if defined(OS_LINUX)
  bool use_dev_shm = true;
  if (executable) {
    static const bool s_dev_shm_executable = DetermineDevShmExecutable();
    use_dev_shm = s_dev_shm_executable;
  }

  // cuz lambda
  use_dev_shm = false; // <-- add this. Yes it's pretty hack-y

  if (use_dev_shm) {
    *path = FilePath("/dev/shm");
    return true;
  }
#endif
  return GetTempDir(path);
}
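If you'd rather script that edit than make it by hand (say, for repeat builds), something like the following might work. It's only a sketch: `patch_shmem` is a hypothetical helper, and it assumes the function still contains a line with `if (use_dev_shm)`, which may no longer be true in newer Chromium checkouts. It inserts the override just before that line and does nothing if the override is already there.

```shell
#!/usr/bin/env bash
# patch_shmem FILE  (hypothetical helper, not part of depot_tools)
# Inserts "use_dev_shm = false;" before the first "if (use_dev_shm)" line.
patch_shmem() {
  local file=$1 tmp
  # Already patched: nothing to do.
  if grep -q 'use_dev_shm = false;' "$file"; then
    return 0
  fi
  # Bail out if the expected line is missing (upstream code may have changed).
  grep -q 'if (use_dev_shm)' "$file" || return 1
  tmp=$(mktemp) || return 1
  awk '
    !patched && index($0, "if (use_dev_shm)") > 0 {
      print "  use_dev_shm = false;  // forced off for AWS Lambda"
      patched = 1
    }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  local status=$?
  rm -f "$tmp"
  return $status
}
```

From the src directory you would call it as `patch_shmem base/files/file_util_posix.cc`, then eyeball the result with `git diff` before building.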

With that change, it's time to compile. Picking things back up in the src directory, set some compile arguments and then (the last command) start the build process.

mkdir -p out/Headless
echo 'import("//build/args/headless.gn")' > out/Headless/args.gn
echo 'is_debug = false' >> out/Headless/args.gn
echo 'symbol_level = 0' >> out/Headless/args.gn
echo 'is_component_build = false' >> out/Headless/args.gn
echo 'remove_webcore_debug_symbols = true' >> out/Headless/args.gn
echo 'enable_nacl = false' >> out/Headless/args.gn
gn gen out/Headless
ninja -C out/Headless headless_shell
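The same args.gn can also be written in one go with a here-document, which is a little easier to read and re-run than a chain of echo commands. This is just a sketch: `write_headless_args` is a made-up helper name, and the build arguments are exactly the ones listed above, which may not all exist in newer Chromium versions.

```shell
#!/usr/bin/env bash
# write_headless_args OUT_DIR  (hypothetical helper)
# Creates OUT_DIR and writes the GN args used in this answer to OUT_DIR/args.gn.
write_headless_args() {
  local out_dir=$1
  mkdir -p "$out_dir" || return 1
  # Quoted 'EOF' keeps the shell from expanding anything inside the block.
  cat > "$out_dir/args.gn" <<'EOF'
import("//build/args/headless.gn")
is_debug = false
symbol_level = 0
is_component_build = false
remove_webcore_debug_symbols = true
enable_nacl = false
EOF
}
```

Run from the src directory, `write_headless_args out/Headless` would be followed by the same `gn gen out/Headless` and `ninja -C out/Headless headless_shell` commands.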

Finally we make a tarball of the relevant file(s) we'll need to run in Lambda.

mkdir out/headless-chrome && cd out
cp Headless/headless_shell Headless/libosmesa.so headless-chrome/
tar -zcvf chrome-headless-lambda-linux-x64.tar.gz headless-chrome/

Within Lambda, run headless_shell with the remote debugger interface enabled by executing:

/path/to/headless_shell --disable-gpu --no-sandbox --remote-debugging-port=9222 --user-data-dir=/tmp/user-data --single-process --data-path=/tmp/data-path --homedir=/tmp --disk-cache-dir=/tmp/cache-dir

Since /tmp is the only writable place in a Lambda function, there are a bunch of flags just telling Chrome where to dump its data. They're not strictly necessary, but they keep Chrome happy. Note also that it's been mentioned that with the --disable-gpu flag we don't need libosmesa.so; leaving it out would shave about 4 MB off our package zip.
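To keep all of those /tmp paths in one place, the flag list can be generated from a single writable root. The sketch below is an assumption-laden wrapper, not anything that ships with Chrome: `chrome_flags` and `launch_headless` are hypothetical names, the flags are the ones from the command above, and the paths are assumed to contain no spaces.

```shell
#!/usr/bin/env bash
# chrome_flags [TMP_ROOT]  (hypothetical helper)
# Prints the flags from this answer, one per line, rooted at TMP_ROOT (default /tmp).
chrome_flags() {
  local tmp_root=${1:-/tmp}
  printf '%s\n' \
    --disable-gpu \
    --no-sandbox \
    --remote-debugging-port=9222 \
    "--user-data-dir=$tmp_root/user-data" \
    --single-process \
    "--data-path=$tmp_root/data-path" \
    "--homedir=$tmp_root" \
    "--disk-cache-dir=$tmp_root/cache-dir"
}

# launch_headless BINARY [TMP_ROOT]  (hypothetical helper)
# Starts headless_shell in the background and prints its PID.
launch_headless() {
  local binary=$1 tmp_root=${2:-/tmp}
  # Word splitting is intentional here: one flag per word, no spaces in paths.
  # shellcheck disable=SC2046
  "$binary" $(chrome_flags "$tmp_root") &
  echo $!
}
```

Usage would look like `pid=$(launch_headless /path/to/headless_shell)`, after which a client can connect to the remote debugger on port 9222.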

I've started this project with the aim of making it easier to get started. It comes with a pre-built headless Chrome binary which you can get here.