How to persist an R workspace across multiple CMD statements in a Dockerfile?


If you want to make the data path configurable via an environment variable, I recommend reading that variable in your script with Sys.getenv(). This also lets you use Rscript instead of R -e "source(...)".
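As a minimal sketch of the Sys.getenv() approach, the snippet below reads the variable and falls back to a default when it is unset or empty. The helper name `get_setting` and the default value "data.csv" are assumptions made for illustration, not part of the original setup.

```r
# Hypothetical helper (illustration only): read an environment variable
# and fall back to a default when it is unset or empty.
get_setting <- function(name, default) {
  # unset = NA makes Sys.getenv() return NA instead of "" for missing variables
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) default else value
}

# "DATA" is the variable name used in the Dockerfile; the default is assumed.
data_path <- get_setting("DATA", "data.csv")
cat("Reading data from:", data_path, "\n")
```

With this in place, the path can be overridden at run time, for example with `docker run -e DATA=other.csv ...`, without rebuilding the image.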

Here is what worked for me:

script.R

    cat(Sys.getenv('SCRIPT'), '\n')
    cat(Sys.getenv('DATA'), '\n')

Dockerfile

    FROM r-base:latest
    ENV SCRIPT="script.R"
    ENV DATA="data.csv"
    WORKDIR /workspace
    CMD R -q -e "source('$SCRIPT')"
    # alternative: CMD Rscript $SCRIPT

Usage

    daniel@nuest /tmp/stackoverflow []$ docker build --tag stackoverflow .
    Sending build context to Docker daemon  4.608kB
    Step 1/5 : FROM r-base:latest
     ---> 46edce0e80af
    Step 2/5 : ENV SCRIPT="script.R"
     ---> Using cache
     ---> 8f26d34d9c0a
    Step 3/5 : ENV DATA="data.csv"
     ---> Using cache
     ---> 16c83c16a4c8
    Step 4/5 : WORKDIR /workspace
     ---> Running in fce8619af30b
    Removing intermediate container fce8619af30b
     ---> a8278f609d9a
    Step 5/5 : CMD R -q -e "source('$SCRIPT')"
     ---> Running in 765bafeb8681
    Removing intermediate container 765bafeb8681
     ---> ff7d7b09dffb
    Successfully built ff7d7b09dffb
    Successfully tagged stackoverflow:latest
    daniel@nuest /tmp/stackoverflow []$ docker run --rm -it -v $(pwd):/workspace stackoverflow
    > source('script.R')
    script.R 
    data.csv 
    > 
    > 

Alternatively, you could pass the data path as a command-line argument to your script; see https://swcarpentry.github.io/r-novice-inflammation/05-cmdline/
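A rough sketch of the command-line variant, using base R's commandArgs(). The function name `parse_data_path` and the fallback "data.csv" are assumptions for illustration only.

```r
# Sketch: take the data path from the first command-line argument,
# e.g. `Rscript script.R data.csv`.
# parse_data_path is a hypothetical helper; the default is an assumption.
parse_data_path <- function(args = commandArgs(trailingOnly = TRUE),
                            default = "data.csv") {
  # trailingOnly = TRUE drops the interpreter's own arguments and keeps
  # only what was passed after the script name
  if (length(args) >= 1 && nzchar(args[1])) args[1] else default
}

data_path <- parse_data_path()
cat("Data path:", data_path, "\n")
```

In the Dockerfile this would presumably be invoked with something like `CMD Rscript $SCRIPT $DATA`; the shell form of CMD expands the environment variables when the container starts.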

As an aside: It is better to pin a specific R version than to use :latest, for reproducibility.