Setting up a DataCrunch.io server for fast.ai & Jupyter notebooks

This article describes how to set up your DataCrunch.io server for use with fast.ai and Jupyter notebooks.

Option 1: Using our Fastai image
If you use our Fastai image, the work is already done for you, so you can get started right away.

Option 2: Manual installation starting from a CUDA image

Step 1. Installing Conda
Important! Create a user first and do not install conda as root: environments installed as root will not be accessible to a different user. The process for creating a user is described in our documentation.

The extended installation steps can be found here. Below you will find a simplified list of steps that is sufficient on Ubuntu 18.04. We start by downloading the installer script and running it:

wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh

chmod +x Anaconda3-2020.02-Linux-x86_64.sh

./Anaconda3-2020.02-Linux-x86_64.sh

Accept the default installation path and answer yes when asked whether conda should be initialized.

When the installation is done, close your SSH session and open it again. Let’s make sure conda is up to date by running:

conda update conda
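
After re-opening the session, a quick check can confirm that the conda command is actually found on your PATH. The following is a minimal sketch; the helper name check_cmd is made up for this example and is not part of conda.

```shell
#!/usr/bin/env bash
# check_cmd is a hypothetical helper name used only for this example.
# It reports whether a command can be found by the current shell.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# If conda is missing, the installer's init step probably has not taken effect yet.
check_cmd conda || echo "Re-open your SSH session, or re-run the installer and answer yes to conda init."
```

If the check reports "conda: not found", opening a fresh SSH session is usually the first thing to try.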

Step 2. Installing fast.ai

We will get started by creating an Anaconda environment for fastai and activating it inside tmux:

conda create --name fastai

tmux

conda activate fastai

We run our commands in tmux. That way, if we disconnect our SSH session, we can still pick up our session later, which is useful when running notebooks. You can find more info about tmux below.

Next we install fastai into the environment:

conda install -c pytorch -c fastai fastai
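
To check that the installation worked and that PyTorch can see the GPU, something like the following can be run inside the activated environment. This is only a sketch: verify_fastai is a name made up for this example, and it assumes that python resolves to the interpreter of the fastai environment.

```shell
#!/usr/bin/env bash
# verify_fastai is a hypothetical helper name used only for this example.
# Assumes `conda activate fastai` has been run, so that `python` is the
# interpreter of the fastai environment.
verify_fastai() {
    python -c '
import fastai, torch
print("fastai version:", fastai.__version__)
print("CUDA available:", torch.cuda.is_available())
'
}

# Usage (inside the activated environment):
#   verify_fastai
```

If "CUDA available" prints False, the GPU driver and the installed PyTorch build are the first things worth checking.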

Step 3. Installing jupyter notebooks

Jupyter notebooks are great for exploring data and trying things out before taking them into production. Let’s add them to our fastai conda environment by executing the following commands (make sure you run ‘conda activate fastai’ first if you haven’t activated it yet).

conda install jupyter notebook

conda install -c conda-forge jupyter_contrib_nbextensions

Almost there! We can start Jupyter notebook with this command:

jupyter notebook --ip *your-IP-here* --port 8888

This requires port 8888 to be open. If you are not sure, check with 'sudo ufw status'. If the status is active and 8888 is not listed, run 'sudo ufw allow 8888'.
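
The firewall check described above can also be scripted. The sketch below reads the text printed by 'sudo ufw status' and succeeds only when the firewall is active and the port has no ALLOW rule yet. The function name needs_ufw_rule is made up for this example, and it assumes the plain (non-verbose) output format of ufw.

```shell
#!/usr/bin/env bash
# needs_ufw_rule is a hypothetical helper name used only for this example.
# It reads the output of `ufw status` on stdin and returns success (0) only
# if the firewall is active AND the given port has no ALLOW rule.
needs_ufw_rule() {
    awk -v port="$1" '
        /^Status: active/ { active = 1 }
        ($1 == port || $1 == (port "/tcp")) && $2 == "ALLOW" { listed = 1 }
        END { exit (active && !listed) ? 0 : 1 }
    '
}

# Usage on the server:
#   if sudo ufw status | needs_ufw_rule 8888; then sudo ufw allow 8888; fi
```

When the firewall is inactive the function reports that no rule is needed, which matches the manual check.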

So in this case we run:

jupyter notebook --ip 166.4.229.111 --port 8888

Copy the URL from the output into a browser (in PuTTY, selecting the text with your cursor copies it to your clipboard) and you should be greeted with your home screen! The URL should look like this:

“http://166.4.229.111:8888/?token=66ec77028a35bc592552729476d872ccbdf444e06adb5986”
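
The URL is made up of the IP address, the port and the token. The sketch below only illustrates that structure. The helper name notebook_url is made up for this example and the token shown is a placeholder, not a real one.

```shell
#!/usr/bin/env bash
# notebook_url is a hypothetical helper name used only for this example.
# Arguments: IP address, port, token.
notebook_url() {
    printf 'http://%s:%s/?token=%s\n' "$1" "$2" "$3"
}

# Example with the IP and port used above and a placeholder token:
notebook_url 166.4.229.111 8888 YOUR-TOKEN-HERE
```

If the startup output has scrolled out of view, running 'jupyter notebook list' on the server should print the URLs of the running notebook servers, including their tokens.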

To get started with the course materials, we need to clone the git repository. If you started the notebook server inside tmux, you can now split your window using ctrl+b followed by ". If you are not using tmux, you can start a second SSH session instead.

In your second tmux pane or second SSH session, run the following command:

git clone https://github.com/fastai/fastai.git

or for FastAI v2:

git clone https://github.com/fastai/fastai2.git

Now in jupyter notebooks you can browse the files in your copy of the repository and start following the course.

Extra info about tmux

Pro tip: before starting Jupyter notebooks, I recommend running a terminal multiplexer such as tmux on your server. If it is not installed yet, install it first and then start it:

sudo apt install tmux

tmux

Then press ctrl+b followed by " to split the window into a top and a bottom pane. You can switch between them with ctrl+b followed by the up or down arrow. After starting Jupyter notebook, you can run a command like ‘htop’ or ‘watch nvidia-smi’ in the other pane, or open a new window.

When we disconnect SSH or lose our SSH connection, we can simply pick up where we left off by running 'tmux a' to reattach to our active session.
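
If you do not want to remember whether a session is already running, tmux can attach to a named session when it exists and create it otherwise. A minimal sketch follows. The helper name start_or_attach and the session name "fastai" are arbitrary choices for this example.

```shell
#!/usr/bin/env bash
# start_or_attach is a hypothetical helper name; "fastai" is an arbitrary
# default session name chosen for this example.
start_or_attach() {
    # -A attaches to the named session if it already exists,
    # otherwise a new session with that name is created.
    tmux new-session -A -s "${1:-fastai}"
}

# Usage:
#   start_or_attach            # session named "fastai"
#   start_or_attach training   # session named "training"
```

Running the same command after a dropped SSH connection brings you back to the session where the notebook server is still running.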
