JupyterHub on Kubernetes
Deploying JupyterHub to a Kubernetes cluster gives multiple users an easy way to develop, run, and test their notebooks. The hub takes care of spawning a Jupyter instance for each user and pruning their pods when they are no longer in use. Each new user gets their own volume (through a PersistentVolumeClaim) containing all their notebooks, files, and working environment. More information: https://zero-to-jupyterhub.readthedocs.io/en/latest/index.html
In this case we are deploying it on a bare-metal Rancher K3s cluster, so there are a few specifics: we're using Traefik as the ingress controller, and there are no upstream cloud load balancers. PersistentVolumes are created by an nfs-client-provisioner, which is set as the default StorageClass in the cluster.
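The chart's PersistentVolumeClaims rely on the cluster having a default StorageClass. With an nfs-client-provisioner that StorageClass looks roughly like the sketch below; the name and provisioner string are illustrative and depend on how the provisioner was installed in your cluster.

```yaml
# Illustrative sketch: a default StorageClass backed by an
# nfs-client-provisioner. The "is-default-class" annotation is what
# lets the JupyterHub chart's PVCs bind without naming a class.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client                                  # name is an assumption
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: cluster.local/nfs-client-provisioner   # depends on your install
```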
Create a directory to keep the config file and generate a random hex token with openssl. Copy the generated token into the config file.
$ mkdir jupyterhub
$ cd jupyterhub/
$ openssl rand -hex 32
<<token from openssl>>
$ vi config.yaml
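Instead of copy-pasting the token by hand, the generation and substitution can be scripted. This is a sketch that writes only the proxy stanza; the full config file we use follows below.

```shell
# Sketch: generate the hub secret token and write it straight into
# config.yaml, avoiding a manual copy-paste step.
TOKEN=$(openssl rand -hex 32)
cat > config.yaml <<EOF
proxy:
  secretToken: "$TOKEN"
EOF
grep secretToken config.yaml
```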
We use the config file below, which contains a few settings specific to our bare-metal k3s cluster. First, we set the proxy service to ClusterIP rather than LoadBalancer: without upstream load balancers, a LoadBalancer service would never get an external IP and would stay stuck in the Pending state.
Next, we enable and configure an ingress, passing it a DNS name that resolves to the virtual IP floating between the nodes of the cluster.
Finally, we set the default user environment to the newer /lab frontend and specify a whitelist of allowed users. The user admin is enabled as an admin.
proxy:
  secretToken: "<<token from openssl>>"
  service:
    type: ClusterIP
ingress:
  enabled: true
  hosts:
    - <<Domain name pointing to virtual IP>>
singleuser:
  defaultUrl: "/lab"
auth:
  admin:
    users:
      - admin
  whitelist:
    users:
      - robo
      - admin
With the config file set up, we install JupyterHub from the Helm chart:
$ helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
$ helm repo update
$ RELEASE=jhub
$ NAMESPACE=jhub
$ helm upgrade --cleanup-on-fail \
    --install $RELEASE jupyterhub/jupyterhub \
    --namespace $NAMESPACE \
    --create-namespace \
    --version=0.10.2 \
    --values config.yaml
After a while, all pods reach the Running state: a continuous-image-puller on each node, a hub pod, and a proxy pod. There are three services, and an ingress makes proxy-public available.
$ kubectl get pods -n jhub
NAME                            READY   STATUS    RESTARTS   AGE
continuous-image-puller-4dvd7   1/1     Running   0          86m
continuous-image-puller-kcbh7   1/1     Running   0          86m
continuous-image-puller-q6gxr   1/1     Running   0          86m
hub-649b66ff7b-hcqsn            1/1     Running   0          28m
proxy-67b4566548-q94j5          1/1     Running   0          28m

$ kubectl get svc -n jhub
NAME           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
hub            ClusterIP   10.43.26.69    <none>        8081/TCP   90m
proxy-api      ClusterIP   10.43.95.98    <none>        8001/TCP   90m
proxy-public   ClusterIP   10.43.104.12   <none>        80/TCP     90m

$ kubectl get ingress -n jhub
NAME         CLASS    HOSTS          ADDRESS        PORTS   AGE
jupyterhub   <none>   <<dns name>>   <<crntnod>>    80      90m
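The eyeball check above can also be scripted. The sketch below counts pods whose STATUS column is not "Running"; in practice you would feed it live data with `kubectl get pods -n jhub --no-headers | check_pods`, but here it runs against captured sample output so the snippet is self-contained.

```shell
# Sketch of a scripted readiness check: prints the number of pods
# whose third column (STATUS) is anything other than "Running".
check_pods() {
  awk '$3 != "Running" { bad++ } END { print bad + 0 }'
}

# Run against captured sample output (normally piped from kubectl):
check_pods <<'EOF'
continuous-image-puller-4dvd7   1/1   Running   0   86m
hub-649b66ff7b-hcqsn            1/1   Running   0   28m
proxy-67b4566548-q94j5          1/1   Running   0   28m
EOF
# prints 0
```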
With this working, we can now further configure the multi-user JupyterHub and set up our environments. For more information, see the official documentation: https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/index.html