How can I create a Docker image to run both Python and R?
The Dockerfile I built to run Python and R together, along with their dependencies, is:
# Ubuntu 18.04 ships Python 3.6; newer Ubuntu releases do not provide a python3.6 package
FROM ubuntu:18.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    r-base \
    r-cran-randomforest \
    python3.6 \
    python3-pip \
    python3-setuptools \
    python3-dev
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r requirements.txt
RUN Rscript -e "install.packages('data.table')"
COPY . /app
The commands to build the image, run the container (naming it SnakeR here), and execute the code are:
docker build -t my_image .
docker run -it --name SnakeR my_image
docker exec SnakeR /bin/sh -c "python3 test_call_r.py"
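The answer does not show test_call_r.py, so the following is only a guess at what such a script could look like: a minimal sketch that calls R from Python by shelling out to Rscript. The function names build_r_command and run_r and the R expression are hypothetical, and the script sticks to arguments that work on Python 3.6.

```python
import shutil
import subprocess


def build_r_command(expression):
    """Return the argv list that asks Rscript to evaluate one R expression."""
    return ["Rscript", "-e", expression]


def run_r(expression):
    """Run an R expression and return its stdout, or None if Rscript is not on PATH."""
    if shutil.which("Rscript") is None:
        return None
    result = subprocess.run(
        build_r_command(expression),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
        check=True,  # raise CalledProcessError if R exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical R expression; replace it with whatever your R code needs to do.
    output = run_r("cat(1 + 1)")
    print(output if output is not None else "Rscript not found on PATH")
```

Shelling out keeps the Python side free of extra dependencies; a binding library would be an alternative, but it would also need to be added to requirements.txt.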
I treated it like an Ubuntu OS and built the image as follows:
- suppress the interactive prompts (e.g. choosing your location) during the R install, via DEBIAN_FRONTEND=noninteractive;
- refresh the apt package lists (apt-get update);
- set the installation options:
  - the -y flag answers yes to prompts for proceeding (e.g. confirming the disk space to be used);
  - the --no-install-recommends flag installs only the required dependencies and skips the recommended (optional) packages;
- install the packages:
  - build-essential for the compilers and build tools that source installs need on Ubuntu;
  - r-base for the R software;
  - r-cran-randomforest to force the package to be available (installing randomForest the way data.table is installed below didn’t work for some reason);
  - python3.6 for Python version 3.6;
  - python3-pip to allow pip to be used to install the requirements;
  - python3-setuptools, which pip needs in order to build and install packages that ship as source distributions;
  - python3-dev for the Python 3 headers that the JayDeBeApi installation in the requirements needed (without it, the install failed, apparently mistaking the environment for Python 2 rather than 3);
- specify the active “working directory” to be the /app location;
- copy the requirements file that holds the Python dependencies (built from the virtual environment of the Python codebase, e.g., with pip freeze);
- install the Python packages from the requirements file (pip3 for Python 3);
- install the R packages (e.g. just data.table here);
- copy the directory contents to the specified working directory /app.
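One way to confirm that the steps above produced a usable image is to check, from inside the container, that the expected executables are on PATH. This is a minimal sketch, not part of the original setup; the EXPECTED_TOOLS list and the missing_tools helper are hypothetical names chosen for illustration.

```python
import shutil

# Executables the Dockerfile above is expected to put on PATH (an assumption).
EXPECTED_TOOLS = ["python3", "pip3", "Rscript"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from the list that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(EXPECTED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All expected tools found")
```

The which parameter exists only so the lookup can be swapped out when trying the helper outside the container.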
This is replicated from my blog post at https://datascienceunicorn.tumblr.com/post/182297983466/building-a-docker-to-run-python-r
Being specific about both the Python and R versions will save you future headaches. This approach, for instance, pins R to 4.0 (through the r-base:4.0.3 image tag) and asks for Python 3.8:
FROM r-base:4.0.3
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends build-essential libpq-dev python3.8 python3-pip python3-setuptools python3-dev
RUN pip3 install --upgrade pip
ENV PYTHONPATH "${PYTHONPATH}:/app"
WORKDIR /app
ADD requirements.txt .
ADD requirements.r .
# installing python libraries
RUN pip3 install -r requirements.txt
# installing r libraries
RUN Rscript requirements.r
And your requirements.r file should look like:
install.packages('data.table')
install.packages('jsonlite')
...
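One caveat with a requirements.r like this: install.packages() only emits a warning when a package fails to install, so the Rscript step can finish successfully even though a library is missing. A possible safeguard, sketched below under that assumption, is a small Python script that reads the package names out of requirements.r and tries to load each one. The helper names are hypothetical, and the regular expression only handles the one-package-per-call style shown above, not vectors such as c('a', 'b').

```python
import os
import re
import shutil
import subprocess
import sys

# Matches install.packages('name') or install.packages("name") calls.
INSTALL_CALL = re.compile(r"install\.packages\(\s*['\"]([^'\"]+)['\"]")


def packages_in(script_text):
    """Return the package names passed as string literals to install.packages()."""
    return INSTALL_CALL.findall(script_text)


def failed_to_load(packages):
    """Return the packages that R cannot load with library()."""
    failed = []
    for name in packages:
        result = subprocess.run(
            ["Rscript", "-e", "library({})".format(name)],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        if result.returncode != 0:
            failed.append(name)
    return failed


def main(path="requirements.r"):
    """Return 1 if any listed package fails to load, otherwise 0."""
    if shutil.which("Rscript") is None or not os.path.exists(path):
        print("Nothing to check: need Rscript on PATH and", path)
        return 0
    with open(path) as handle:
        failed = failed_to_load(packages_in(handle.read()))
    for name in failed:
        print("R package failed to load:", name)
    return 1 if failed else 0


if __name__ == "__main__":
    exit_code = main()
    if exit_code:
        sys.exit(exit_code)
```

Run as an extra RUN step after the R install, a non-zero exit code from this script would stop the image build at the point where a package is missing.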
I made an image for my personal projects; you can use it if you want: https://github.com/dipayan90/docker-python-r