Download data in source code


I’m testing Datalore and have been working with Colab so far. There I could simply run the following code:

stopword_list = stopwords.words('english')

nlp = spacy.load('de_core_news_sm')

This does not seem to work directly in Datalore. Is there an alternative way to do this?


First of all, you have to install nltk and spacy using “Tools -> Library Manager”. Installing spacy may take a couple of minutes.

Unfortunately, spacy.cli.* commands currently don’t work because of a bug. We’ll deploy a fix in a few days. Thank you for bringing this to our attention!

In the meantime, you can download spacy models using subprocess. Here is the full code for your example:

import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
stopword_list = stopwords.words('english')
import spacy
import subprocess
print(subprocess.getoutput("python -m spacy download de_core_news_sm"))
nlp = spacy.load('de_core_news_sm')
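
If you run the notebook repeatedly, it may be worth skipping the download when the model is already installed. The following is only a sketch of that idea, not something Datalore provides: the helper names build_download_command, is_installed and ensure_model are made up for illustration, and it assumes the model is installed as a regular Python package with the same name as the model, and that sys.executable points at the interpreter the notebook runs in.

```python
import importlib
import importlib.util
import subprocess
import sys


def build_download_command(model_name):
    # Use the interpreter running this notebook so the model lands in the
    # same environment that later calls spacy.load().
    return [sys.executable, "-m", "spacy", "download", model_name]


def is_installed(package_name):
    # Assumption: the model is importable as a top-level package, so an
    # import lookup tells us whether a download is still needed.
    return importlib.util.find_spec(package_name) is not None


def ensure_model(model_name):
    # Hypothetical helper: download only when the model package is missing.
    # Returns True if a download was triggered, False otherwise.
    if is_installed(model_name):
        return False
    # check=True raises CalledProcessError if the download fails.
    subprocess.run(build_download_command(model_name), check=True)
    # Make the freshly installed package visible to the running process.
    importlib.invalidate_caches()
    return True
```

You would then call ensure_model('de_core_news_sm') before spacy.load('de_core_news_sm'). Passing the command as a list avoids shell quoting issues, and check=True makes a failed download raise an error instead of only printing output.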