
JupyterHub on Kubernetes


Introduction

Deploying JupyterHub to a Kubernetes cluster gives multiple users an easy way to develop, run and test their notebooks. The hub spawns a Jupyter instance for each user and removes their pods when they are no longer in use. Each new user gets their own volume (through a persistent volume claim), containing all their notebooks, files and working environment. More information: https://zero-to-jupyterhub.readthedocs.io/en/latest/index.html

In this case we are deploying it on a bare-metal K3s (Rancher) cluster, which brings a few specifics: we use Traefik as the ingress controller and are not in a cloud with upstream load balancers. PersistentVolumes are created by an nfs-client-provisioner (set as the default in the cluster).

Installation

Create a directory to hold the config file and generate a random token with openssl. Copy the generated token into the config file.

$ mkdir jupyterhub
$ cd jupyterhub/
$ openssl rand -hex 32
<<token from openssl>>
$ vi config.yaml
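
The token step above can also be scripted. The following is a minimal sketch, assuming openssl and grep are available on the workstation; the variable name TOKEN is our own choice and not something the chart requires. It captures the token in a variable and checks that it has the expected shape (64 hexadecimal characters) without printing the secret.

```shell
# Generate a 32-byte random token, encoded as 64 hexadecimal characters.
TOKEN="$(openssl rand -hex 32)"

# Sanity check: the token should consist of exactly 64 lowercase hex characters.
if printf '%s' "$TOKEN" | grep -Eq '^[0-9a-f]{64}$'; then
    echo "token looks valid"
else
    echo "unexpected token format" >&2
fi
```

The value of TOKEN can then be pasted into config.yaml as the secretToken.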

We use the config file below, which contains a few settings specific to running on a bare-metal k3s cluster. First, we tell the proxy service to use ClusterIP rather than LoadBalancer; without an upstream load balancer it would never receive an external IP and would stay in the pending state.

Next, we enable and configure an ingress, giving it a DNS name that resolves to the virtual IP that floats between the nodes of the cluster.

Finally, we set the default environment to the newer /lab frontend and specify a whitelist of users. The user admin is enabled as the admin user.

proxy:
    secretToken: "<<token from openssl>>"
    service:
        type: ClusterIP
ingress:
    enabled: true
    hosts:
        - <<Domain name pointing to virtual IP>>
singleuser:
    defaultUrl: "/lab"
auth:
    admin:
        users:
            - admin
    whitelist:
        users:
            - robo
            - admin
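
Before installing, it can help to confirm that no <<...>> placeholder from the config above was left in place. The helper below is a hypothetical sketch (the function name check_placeholders is ours, not part of Helm or the chart); it only relies on grep.

```shell
# Hypothetical helper: succeeds only if the given file exists and
# contains no <<...>> placeholder.
check_placeholders() {
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 2
    fi
    # grep prints any offending lines with their line numbers.
    if grep -n '<<.*>>' "$1"; then
        echo "placeholders still present in $1" >&2
        return 1
    fi
    return 0
}
```

Running `check_placeholders config.yaml && echo "config.yaml is ready"` in the jupyterhub directory would then print the confirmation only once both the token and the domain name have been filled in.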

With the config file set up, we install JupyterHub from the Helm chart:

$ helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
$ helm repo update
$ RELEASE=jhub
$ NAMESPACE=jhub

$ helm upgrade --cleanup-on-fail \
    --install $RELEASE jupyterhub/jupyterhub \
    --namespace $NAMESPACE \
    --create-namespace \
    --version=0.10.2 \
    --values config.yaml

After a while, all pods reach the Running state. Each node runs an image puller, and there is one proxy pod and one hub pod. There are three services, and an ingress makes the proxy-public service available.

$ kubectl get pods -n jhub                                                                

NAME                              READY   STATUS    RESTARTS   AGE
continuous-image-puller-4dvd7     1/1     Running   0          86m
continuous-image-puller-kcbh7     1/1     Running   0          86m
continuous-image-puller-q6gxr     1/1     Running   0          86m
hub-649b66ff7b-hcqsn              1/1     Running   0          28m
proxy-67b4566548-q94j5            1/1     Running   0          28m
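
Instead of re-running kubectl get pods until everything is up, one could wait for the rollouts explicitly. This is a sketch under assumptions: the deployment names hub and proxy are inferred from the pod names in the listing above, the namespace defaults to jhub as in the install step, and the function name wait_for_jhub and the 300-second timeout are our own choices.

```shell
# Hypothetical helper: block until the hub and proxy deployments have rolled out.
# The first argument is the namespace; it defaults to "jhub".
wait_for_jhub() {
    namespace="${1:-jhub}"
    for deployment in hub proxy; do
        kubectl rollout status "deployment/${deployment}" \
            --namespace "$namespace" --timeout=300s || return 1
    done
}
```

Calling `wait_for_jhub jhub` returns once both deployments report a successful rollout, or fails if one of them does not become ready within the timeout.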

$ kubectl get svc -n jhub
NAME           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
hub            ClusterIP   10.43.26.69    <none>        8081/TCP   90m
proxy-api      ClusterIP   10.43.95.98    <none>        8001/TCP   90m
proxy-public   ClusterIP   10.43.104.12   <none>        80/TCP     90m

$ kubectl get ingress -n jhub
NAME         CLASS    HOSTS               ADDRESS     PORTS   AGE
jupyterhub   <none>   <<dns name>>        <<crntnod>> 80      90m
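
To check that the ingress actually routes traffic to the proxy, one option is to request the hub login page through the configured DNS name. The helper below is a hypothetical sketch (the function name hub_login_status is ours); it assumes plain HTTP on port 80, as shown in the ingress listing, and that curl is installed.

```shell
# Hypothetical helper: print the HTTP status code returned for the hub login page.
# The argument is the DNS name configured under ingress.hosts.
hub_login_status() {
    curl --silent --output /dev/null --write-out '%{http_code}\n' \
        --max-time 10 "http://$1/hub/login"
}
```

For example, `hub_login_status jupyter.example.org` (a made-up hostname) prints the status code; a 200 or a redirect suggests the proxy is reachable, while 000 usually means the name did not resolve or the connection failed.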

Further steps

With this working, we can further configure the multi-user JupyterHub and set up our environments. For more information, see the official documentation: https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/index.html

 
