How to sync or copy files without the web interface?

Is there any way to sync notebooks in Datalore with a Git repository other than manually downloading every single file? Anything like rsync, scp, Git, or even a proprietary protocol with clients for different platforms would be helpful.

At first glance I thought that SSH login was supported, but the option in the settings only allows setting a private key for outbound connections, not a public key for inbound connections.

A workaround using IPython magic to push or pull files from Datalore elsewhere should be possible, right?

Hello,

Yes, you can use IPython magic to download files from a remote server; both scp and git are available by default.

Files will be cloned into the working directory and will be available via the “Attached Files” pane.
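As a rough sketch of what this can look like: in a notebook cell, IPython's `!` shell escape runs a command directly, for example `!git clone <url>`. The plain-Python equivalent below uses `subprocess` instead. The helper names, repository URL, host, and paths are placeholders made up for illustration, not real Datalore endpoints.

```python
import subprocess


def git_clone_command(repo_url, dest_dir):
    """Build the argument list for `git clone <repo_url> <dest_dir>`."""
    return ["git", "clone", repo_url, dest_dir]


def scp_command(remote_path, local_path):
    """Build the argument list for `scp <remote_path> <local_path>`."""
    return ["scp", remote_path, local_path]


def run(command):
    """Run a command and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(command, check=True, capture_output=True, text=True)


# Hypothetical values, for illustration only:
# run(git_clone_command("https://example.com/user/repo.git", "repo"))
# run(scp_command("user@example.com:/data/file.csv", "file.csv"))
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.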

There is also a built-in connector for AWS S3 buckets (Tools → Attached Datasources):

Thanks for the instructions! I was mostly curious about how to download all Datalore notebooks to my computer. I’d like to keep a local copy of all notebooks, just in case something goes wrong right before a presentation or so. So I guess I’d need to push the notebooks somewhere.

I think this is not as easy as I thought it would be. Notebooks live in separate environments, right? So I can’t use one notebook to push all notebooks to another server. I can’t even access the notebook itself via IPython magic.

A bit off-topic: There is also no way to share downloaded files between notebooks, because they are always attached to a single notebook, right?

Sorry, I misinterpreted your question.

Luckily, this case is also covered :slight_smile: It is possible to export multiple notebooks from Datalore to *.ipynb or *.datalore files:

  1. shift click on the first notebook
  2. shift click on the last notebook
  3. invoke context menu for any selected notebook
  4. select suitable “Export” option

It could be annoying to confirm each file, but it is much more convenient than exporting all the notebooks individually.

“Notebook files” are attached to a specific notebook, right, but there is also a “Workspace files” folder, which can be attached to any notebook within a workspace.

From the File System view this folder is available from the sidebar menu:

And in the editor you need to attach this folder explicitly via “Attached files” pane:

Screenshot 2021-03-04 at 14.03.32

Screenshot 2021-03-04 at 14.03.56

We’ve stumbled upon a surprising issue related to this question. Downloaded files are sometimes removed automatically when using the Datalore kernel. I guess this is intended, to make the execution reproducible. But it is still very surprising and not usable in practice if the files are large. We need to download more than 1 GB, so re-downloading this every time we have to change something in a cell above or restart the kernel is not really an option.

A second issue makes it even more difficult to use: the file browser does not show the same thing as os.path.exists(fname). I restart the kernel, clear all outputs, then execute the first two cells and refresh the file browser, but the file is still displayed:

I guess it is somehow still there from the previous run of the kernel:

This might be related to this issue:

In Datalore kernel all downloaded (or generated) files will be removed during the re-evaluation if the corresponding code is commented out or when the entire cell is deleted, for example:

#%%
import requests
#%%
url = 'https://datalore.jetbrains.com/logo.ico'
r = requests.get(url, allow_redirects=True)

# Write the downloaded bytes and close the file handle when done
with open('logo.ico', 'wb') as f:
    f.write(r.content)
#%%
print(10)

  • evaluate all cells
    (logo.ico attached to the notebook)
  • comment out the code in the second cell
  • evaluate the third cell
    (second cell re-evaluated, file removed)

It is indeed the intended behaviour, but I see your point, it is counterproductive when working with heavy files. I will create a ticket and discuss it with the team. Thank you!
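In the meantime, one possible mitigation is to guard the download so it is skipped whenever the file is already present. This is only a sketch and not an official recommendation: `download_if_missing` and its `fetch` parameter are names made up for this example, and it only helps in cases where the file survives between runs. It does not change the cleanup behaviour described above.

```python
import os
import urllib.request


def download_if_missing(url, path, fetch=urllib.request.urlretrieve):
    """Download url to path unless path already exists.

    Returns True if a download happened, False if the existing file was reused.
    """
    if os.path.exists(path):
        return False
    # Download to a temporary name first so that an interrupted transfer
    # is never mistaken for a complete file on the next run.
    tmp_path = path + ".part"
    fetch(url, tmp_path)
    os.replace(tmp_path, path)
    return True
```

A call such as `download_if_missing(url, 'logo.ico')` can then replace the unconditional download in the second cell.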

Did you try to refresh the pane using the UI button?

Screenshot 2021-04-28 at 17.51.03

Yes, exactly, this is what I meant with “refresh the file browser”.

I’m sorry, I misread your comment. Yes, this is also the expected behavior – if the file was created during the previous run, it won’t be automatically removed when the computation is stopped and re-started, and it won’t be removed even if the corresponding cell is deleted in the new run. It is definitely not very intuitive. Will add this point to the ticket.

Hm, the thing is, on the first screenshot the file is displayed in the file browser but os.path.exists() returns False. This shouldn’t be the case, right?

Unfortunately, I didn’t manage to reproduce this behaviour. Such a state is only possible within a single run: if the file is created by the third cell and checked from code in the second cell, os.path.exists will return False although the file is actually attached to the notebook.
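If the mismatch shows up again, a small diagnostic cell may help narrow it down by recording what the kernel process itself sees, so it can be compared with the file browser. This is a sketch; `describe_path` is a hypothetical helper, not part of Datalore.

```python
import os


def describe_path(fname):
    """Collect what the kernel process itself sees for fname."""
    abs_path = os.path.abspath(fname)
    parent = os.path.dirname(abs_path)
    return {
        "cwd": os.getcwd(),
        "absolute_path": abs_path,
        "exists": os.path.exists(abs_path),
        # Listing the parent directory shows whether the name is present
        # on disk at all, independent of the file browser pane.
        "parent_listing": sorted(os.listdir(parent)) if os.path.isdir(parent) else [],
    }
```

Running `describe_path('logo.ico')` right after the cell that checks for the file shows the working directory and whether the name appears in the directory listing at that point in the run.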