In what ways can I interact with a datalore-notebook programmatically?

bjuergens · November 16, 2021, 9:32am

I am new to datalore and investigating ways to integrate it into our accounting workflow. My general idea is to have one expert design a report as a datalore-notebook and then use this notebook as a template to generate different reports based on different subsets of the database.

Is there an API to download a shared datalore notebook?
Is there an API to download a published datalore notebook?
Is there an API to control a notebook from the outside?
- e.g. I want to replace an attached csv-file, then execute the notebook on new data and then export the result to PDF. Can this be done automatically, e.g. via an API?
Can a downloaded/exported .datalore-file be run headless?
- similar to jupyter nbconvert --execute for .ipynb-files

igro · November 17, 2021, 5:50pm

1,2,3. No, unfortunately there is no public API at the moment.
4. Also not possible, since there is no standalone runner for .datalore files.

At the moment we are more focused on developing various new features. For example, we’re currently working on a new type of published report, which will hopefully be widely available in upcoming releases. Please stay tuned!

bjuergens · November 18, 2021, 8:53am

Thanks. I would like to ask a follow-up question:

Can I expect the export as ipynb from datalore to be consistent over time? e.g. when I export the same datalore notebook today and a year from now, will the resulting ipynb be exactly the same, or can I expect some difference in the json structure?

I am considering a workflow where the expert uses datalore to create a notebook, then exports it as ipynb, and then uploads the ipynb-file as a template to the accounting-backend where I use nbconvert to generate the actual reports.

Does this use-case fit into how datalore is supposed to be used, or should I expect lots of unforeseen complications along the way?
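
For illustration, the backend side of that workflow might look roughly like the sketch below. It only covers the nbconvert step on an already exported .ipynb file and does not involve any Datalore API. The function names and the REPORT_SUBSET environment variable are hypothetical: the notebook itself would have to read that variable to select its data, and this assumes the kernel inherits the environment of the nbconvert process. Note that `--to pdf` needs a LaTeX installation on the machine.

```python
import os
import subprocess
from pathlib import Path
from typing import List


def build_nbconvert_command(template: Path, output_dir: Path, report_name: str) -> List[str]:
    """Build the nbconvert call that executes a notebook and exports it to PDF."""
    return [
        "jupyter", "nbconvert",
        "--to", "pdf",            # PDF export goes through LaTeX
        "--execute",              # run all cells before exporting
        "--output", report_name,  # base name of the generated file
        "--output-dir", str(output_dir),
        str(template),
    ]


def render_report(template: Path, output_dir: Path, subset: str) -> None:
    """Run the template once for a given data subset (hypothetical helper)."""
    # REPORT_SUBSET is an assumed convention; the notebook must read it itself,
    # e.g. via os.environ["REPORT_SUBSET"].
    env = {**os.environ, "REPORT_SUBSET": subset}
    cmd = build_nbconvert_command(template, output_dir, f"report_{subset}")
    subprocess.run(cmd, check=True, env=env)
```

Calling `render_report(Path("template.ipynb"), Path("reports"), "2021-Q4")` once per subset would then produce one PDF per subset, provided Jupyter and nbconvert are installed on the backend.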

igro · December 1, 2021, 6:33pm

Actually no, the resulting file may change over time, because we are going to unify and support some new meta tags in the .ipynb format.
I’m afraid some complications could arise, yes, but hopefully the new type of reports and other planned features for the next releases will cover such a workflow completely in Datalore.
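
Since the exported file may gain new meta tags, a backend that consumes it could compare only the parts it actually depends on rather than the raw JSON. The following is a minimal sketch under that assumption, using only the standard library. It reduces a notebook to its cell types and sources and ignores all metadata; the helper names are made up for this example, and which tags Datalore will add or change is not known here.

```python
import json
from typing import Any, Dict, List, Tuple


def cell_signature(notebook: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Reduce a notebook to (cell_type, source) pairs, ignoring all metadata."""
    signature = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # The notebook format allows source as one string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        signature.append((cell.get("cell_type", ""), source))
    return signature


def same_content(export_a: str, export_b: str) -> bool:
    """Check whether two exported .ipynb documents contain the same cells."""
    return cell_signature(json.loads(export_a)) == cell_signature(json.loads(export_b))
```

Two exports of the same notebook that differ only in notebook-level or cell-level metadata would compare as equal with `same_content`, while a changed cell source would not.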

zaytsev · August 30, 2022, 10:17am

So I was also about to ask about API availability, but it looks like I’ve found the answer already, and unfortunately, it’s negative.

Let me try to explain what use case I have in mind:

We are thinking of setting up hosted BI dashboards for the customers of our app using something Jupyter-like, and we would like to offload as much of the infrastructure and hosting as possible to an external out-of-the-box solution.

An ideal solution would have a team account for our company for some managed Jupyter-like software that also takes care of the computation.
There we could set up a template workspace, where our engineers would be developing notebooks with sample reports.
Pulumi would then clone this template report creating a workspace per customer, and adjusting the data sources / queries to limit the data to that of the customer.
It would also adjust sharing settings to share the corresponding workspaces with our customers, requiring them to register an account with an email address. Having them authenticate with the solution automatically makes sure they are authorised to see the workspace.
- This is to say that sharing by links only is not secure enough for our purposes.

Of course, we can roll our own solution, where we would develop sample reports locally, manage the notebooks in a GitHub repository and use GitHub Actions with custom runners that would run something like jupyter nbconvert --execute on AWS as the original poster suggested. The artefacts can then be stored in AWS and it also could take care of the authentication via CloudFront and Lambdas. But… would be nice to be able to use a better hosted solution instead of endlessly spreading our resources to rebuild the whole world.