How can I persist agent state for a Docker-based machine?

I have installed the enterprise version using docker-compose. I am running an R script in a notebook. This R script depends on many libraries, which take around 40+ minutes to install. However, whenever I restart the docker machine or restart the host machine (an EC2 server), I lose all my installations, and I need to reinstall them. Is there a way to persist the state of the docker machine and use the same docker machine every time for this notebook?

Hello,

There’s no way to persist the state of a particular agent. Instead, you can create a custom Docker image with the necessary libraries preinstalled and add it to your agents configuration. After that, you can select this agent type for your notebooks instead of the default one.

Best regards,
Stepan Tarasevich
JetBrains

@aprilfire Thank you for the reply. I have tried to build a custom Docker image following “Custom R environment without Anaconda”. However, I tried many ways to install some R libraries in that Dockerfile, with no luck. It seems that the R kernel is not yet initialised while the Docker image is being built; it is installed only when I select the image from the notebook. As a result, none of the libraries I install in the Dockerfile work in my notebook. Here are some of the approaches I tried:

Install with Rscript:

RUN sudo mkdir -p /opt/anaconda3/envs/minimal/lib/R/library && sudo chown datalore:datalore /opt/anaconda3/envs/minimal/lib/R/library
RUN PATH=/opt/anaconda3/envs/$CUSTOM_ENV_NAME/bin/:$PATH Rscript -e "install.packages(c('pvclust', 'showtext'), dependencies = TRUE, lib = '/opt/anaconda3/envs/minimal/lib/R/library')" 

Install with conda:

RUN /opt/anaconda3/bin/conda create -n $CUSTOM_ENV_NAME r-base=4.2 r-irkernel r-ggpubr r-factoextra r-cluster  && \
    /opt/anaconda3/envs/$CUSTOM_ENV_NAME/bin/R -e "install.packages(c('showtext'), repos='https://cran.rstudio.com/')"

Run directly after installing IRkernel, but before Rscript -e "IRkernel::installspec(sys_prefix=TRUE)":

    Rscript -e "install.packages(c('repr', 'IRdisplay', 'IRkernel'), type = 'source')" && \
    Rscript -e "install.packages(c('pvclust', 'showtext'), dependencies = TRUE)"

None of them succeeded. Can you give me some guidance on how to install R libraries in a Dockerfile?

It seems that the first example (install with Rscript) contains both $CUSTOM_ENV_NAME and minimal. Maybe getting rid of minimal would help.

That does not work: the install defaults to /usr/local/lib/R/site-library, which is not writable.

[8/9] RUN PATH=/opt/anaconda3/envs/myenv/bin/:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin Rscript -e "install.packages(c('pvclust', 'showtext',), dependencies = TRUE)":
0.722 Installing packages into ‘/usr/local/lib/R/site-library’
0.722 (as ‘lib’ is unspecified)
0.723 Warning in install.packages(c("pvclust", "showtext", "parallelly", "ggpubr", :
0.724 ‘lib = "/usr/local/lib/R/site-library"’ is not writable

Here is my current Dockerfile:

FROM jetbrains/datalore-agent:2023.3
ENV CUSTOM_ENV_NAME myenv
USER root
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y -q --no-install-recommends libzmq3-dev libcurl4-openssl-dev libssl-dev r-base make g++ libharfbuzz-dev libfribidi-dev libtiff-dev apt-file cmake libfreetype6-dev && \
    Rscript -e "install.packages(c('repr', 'IRdisplay', 'IRkernel'), type = 'source')" && \
    rm -rf /var/lib/apt/lists/* && apt-get clean
RUN sudo chown -R datalore:datalore /home/datalore
USER datalore
RUN mkdir -p /opt/anaconda3/envs/$CUSTOM_ENV_NAME
RUN /opt/python/bin/python -m venv /opt/anaconda3/envs/$CUSTOM_ENV_NAME
RUN /opt/anaconda3/envs/$CUSTOM_ENV_NAME/bin/pip install ipykernel==5.5.3 ipython==7.31.1 ipython_genutils==0.2.0 jedi==0.17.2
RUN PATH=/opt/anaconda3/envs/$CUSTOM_ENV_NAME/bin/:$PATH Rscript -e "IRkernel::installspec(sys_prefix=TRUE)"
# RUN sudo mkdir -p /opt/anaconda3/envs/minimal/lib/R/library && sudo chown datalore:datalore /opt/anaconda3/envs/minimal/lib/R/library
RUN PATH=/opt/anaconda3/envs/$CUSTOM_ENV_NAME/bin/:$PATH Rscript -e "install.packages(c('pvclust', 'showtext'), dependencies = TRUE)" 

# RUN /opt/anaconda3/bin/conda create -n $CUSTOM_ENV_NAME r-base=4.2 r-irkernel r-ggpubr r-factoextra r-cluster  && \
#     /opt/anaconda3/envs/$CUSTOM_ENV_NAME/bin/R -e "install.packages(c('showtext'), repos='https://cran.rstudio.com/')"  # Install 'showtext' package

RUN /opt/datalore/build_code_insight_data.sh /opt/anaconda3/envs/$CUSTOM_ENV_NAME

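The default library directory (/usr/local/lib/R/site-library) is not writable by the datalore user, but it is writable by root. We should definitely introduce a better default configuration, but at the moment you can:
a) just add your packages to the first install.packages invocation (the one that already runs as root), or
b) switch to root for the installation and then back to datalore, with something like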

USER root
RUN Rscript -e "install.packages(c('pvclust', 'showtext'), dependencies = TRUE)"
USER datalore

c) chmod the default library directory so that it is writable by the datalore user.
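
For option (c), a minimal sketch could look like the fragment below. It is meant to replace the final install.packages step of the Dockerfile posted earlier in the thread, so it is not a complete Dockerfile on its own. The chmod mode, the explicit repos URL, and the requireNamespace check at the end are assumptions added for illustration, not something confirmed for the Datalore agent image. The check is there because install.packages reports a failed install only as a warning, so without it the build could finish even though the packages are missing.

```dockerfile
# Assumption: R was installed from apt earlier in this Dockerfile, so the
# default library is /usr/local/lib/R/site-library (as shown in the build log).
USER root
# Make the default R library writable by non-root users such as datalore.
RUN chmod -R a+rwX /usr/local/lib/R/site-library
USER datalore

# Install the packages as the datalore user, then fail the build if any
# of them cannot be loaded afterwards.
RUN Rscript -e "install.packages(c('pvclust', 'showtext'), dependencies = TRUE, repos = 'https://cloud.r-project.org')" && \
    Rscript -e "stopifnot(requireNamespace('pvclust', quietly = TRUE), requireNamespace('showtext', quietly = TRUE))"
```

If the second Rscript call fails, the image build stops at this step, which makes a silent installation failure visible at build time instead of in the notebook.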
