How to persist an R workspace across multiple CMD statements in a Dockerfile?
If you want to make the data path configurable via an environment variable, I recommend reading that variable in your script with Sys.getenv(). This also allows you to use Rscript instead of R -e "source...".
Here is what worked for me:
script.R
cat(Sys.getenv('SCRIPT'), '\n')
cat(Sys.getenv('DATA'), '\n')
Dockerfile
FROM r-base:latest
ENV SCRIPT="script.R"
ENV DATA="data.csv"
WORKDIR /workspace
CMD R -q -e "source('$SCRIPT')"
# alternative: CMD Rscript $SCRIPT
Usage
daniel@nuest /tmp/stackoverflow []$ docker build --tag stackoverflow .
Sending build context to Docker daemon 4.608kB
Step 1/5 : FROM r-base:latest
 ---> 46edce0e80af
Step 2/5 : ENV SCRIPT="script.R"
 ---> Using cache
 ---> 8f26d34d9c0a
Step 3/5 : ENV DATA="data.csv"
 ---> Using cache
 ---> 16c83c16a4c8
Step 4/5 : WORKDIR /workspace
 ---> Running in fce8619af30b
Removing intermediate container fce8619af30b
 ---> a8278f609d9a
Step 5/5 : CMD R -q -e "source('$SCRIPT')"
 ---> Running in 765bafeb8681
Removing intermediate container 765bafeb8681
 ---> ff7d7b09dffb
Successfully built ff7d7b09dffb
Successfully tagged stackoverflow:latest
daniel@nuest /tmp/stackoverflow []$ docker run --rm -it -v $(pwd):/workspace stackoverflow
> source('script.R')
script.R
data.csv
>
>
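The script above only prints the variables. In a real script you would presumably use DATA to locate the input file. Here is a minimal sketch of that, assuming the data is a CSV file with a header row; the helper names resolve_data_path() and load_data() and the fallback value are made up for illustration:

```r
# Hypothetical helper: read the data path from the DATA environment variable.
# The fallback "data.csv" is an assumption, used only when DATA is not set.
resolve_data_path <- function(default = "data.csv") {
  Sys.getenv("DATA", unset = default)
}

# Hypothetical helper: load the file, failing early with a clear message.
# Assumes a comma-separated file with a header row.
load_data <- function(path = resolve_data_path()) {
  if (!file.exists(path)) {
    stop("Data file not found: ", path)
  }
  read.csv(path)
}
```

Calling load_data() in script.R would then pick up whatever DATA is set to in the container. Since the Dockerfile only sets a default with ENV, the value can be overridden at run time, e.g. docker run --rm -e DATA=other.csv -v $(pwd):/workspace stackoverflow.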
Alternatively, you could pass the data path as an argument to your script file; see https://swcarpentry.github.io/r-novice-inflammation/05-cmdline/
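A rough sketch of the argument-based variant, using base R's commandArgs(). The function name get_data_path() and the default value are assumptions for illustration, not something from the question:

```r
# Hypothetical helper: take the data path from the first command-line
# argument, falling back to a default when no argument is given.
get_data_path <- function(args = commandArgs(trailingOnly = TRUE),
                          default = "data.csv") {
  if (length(args) >= 1) args[[1]] else default
}

data_path <- get_data_path()
cat("Using data file:", data_path, "\n")
```

With this in script.R, you would run it as Rscript script.R other.csv. Unlike the environment variable approach, this only works with Rscript (or R --args), not with a plain source() call.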
As an aside: it is better to pin a specific R version than to use :latest, for reproducibility.