Quick Jupyterhub Setup: Docker + Nginx + https + Letsencrypt + AWS cloud

leangaurav
10 min read · Jul 18, 2022

I was using Jupyter Notebooks for a long time. But often I’d need to access my notebooks from different machines. Running jupyter on a server and doing reverse ssh was one option, but that doesn’t work well due to ssh key whitelisting.

Later, I came across this project: JupyterHub and it looked like the perfect solution. But it wasn’t so easy to get it working for my purpose. After spending many days experimenting and reading docs, I was finally able to get things working the way I wanted.

I wanted the setup to have:

  1. A custom domain/subdomain
  2. Some way to authenticate and login to the server
  3. Allow creation of additional users, with their own notebook servers
  4. Run things via docker to allow less privilege for other user notebooks
  5. Bonus: minimal setup and easy cleanup 😀

The final setup looked like this:

  1. Host/Server: AWS EC2 instance using Ubuntu
  2. Letsencrypt for SSL certificates to enable https (not mandatory)
  3. Docker for running Jupyterhub, Nginx
  4. Nginx reverse proxy
  5. Jupyterhub
  6. Jupyter notebook
  7. Github OAuth for authentication and user creation

and nothing to be installed directly on the server, apart from docker 🙌🏻. So easy cleanup.

Some improvements to this post might come in future like: using docker spawner for running individual user notebooks instead of the default local process spawner in JupyterHub.

A big thanks to the JupyterHub team and all the contributors for creating such a wonderful thing.

Do take a look at the project here https://jupyter.org/hub

This guide goes step by step, explaining each part of the JupyterHub setup. It should be doable in around 10 minutes if everything goes well 😉.

We’ll break down the entire thing into two parts:

  1. Running JupyterHub with Github OAuth without HTTPS
  2. Setting up traffic to be served over HTTPS if you wish to do that (Recommended)

Let's get some stuff done ⚒👨🏻‍🏭💪🏻

Part-1: Setting up JupyterHub

Before going further into the setup, ensure you have the following:

  1. A server instance: ability to SSH and run commands with sudo access
  2. Public IP: note the public IP Address of the server above
  3. Open ports 443, 80 for incoming traffic:
    - 443: if HTTPS is required
    - 80: used while installing SSL certificates
    Instead of 80, other ports can also work for Part-1, but if you plan to do the HTTPS setup, opening ports 80 and 443 is important.
    To check whether a port is really open, see the guide at the end.
  4. Install docker on the server/host: follow the guides on docker’s website. The guide for Ubuntu can be found here.

SSH to your server, clone this Github repo and go inside the cloned folder

git clone https://github.com/leangaurav/jupyterhub_docker.git
cd jupyterhub_docker

The folder structure looks like this:

├── .env
├── README.md
├── config
│ ├── jupyterhub_config.py
│ └── requirements.txt
├── docker
│ └── jupyterhub.Dockerfile
└── docker-compose.yaml

To view the files from the terminal, you can use the vi editor, e.g. vi docker-compose.yaml.

Now let's look at each of the files step by step and make the required changes:

Step-1: docker-compose.yaml

❌ No modifications required in this file.

version: "3.3"
services:

  jupyterhub:
    container_name: 'jupyterhub'
    build:
      context: .
      dockerfile: docker/jupyterhub.Dockerfile
    environment:
      PYTHONUNBUFFERED: 1
      JUPYTERHUB_IP: ${JUPYTERHUB_IP}
      JUPYTERHUB_PORT: ${JUPYTERHUB_PORT}
      JUPYTERHUB_CONTAINER_NAME: ${JUPYTERHUB_CONTAINER_NAME}
      JUPYTERHUB_DATA: ${JUPYTERHUB_DATA}
      ADMIN_USERS:
      OAUTH_CALLBACK_URL:
      OAUTH_CLIENT_ID:
      OAUTH_CLIENT_SECRET:
    volumes:
      - "${JUPYTERHUB_DATA}:${JUPYTERHUB_DATA}"
      - ${JUPYTERHUB_USERS_HOME}:/home
    network_mode: host
    restart: always

This file contains the configuration for the container that docker will start. All the environment variables are specified here, and every value written as ${NAME} comes from the .env file.

The file contains:
- volume bind mount: mounts a host directory to a directory inside container
- environment variables mapping
- network_mode: host → use host network
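To make the ${NAME} substitution a bit more concrete, here is a minimal Python sketch of the idea. It is not how Docker Compose is implemented: parse_env and interpolate are hypothetical helpers written for this post, and the parser only understands simple KEY=VALUE lines with an optional trailing comment.

```python
import re
from string import Template


def parse_env(text):
    """Parse simple KEY=VALUE lines; skips blank lines and full-line comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment that is separated from the value by whitespace.
        value = re.split(r"\s+#", value, maxsplit=1)[0].strip()
        values[key.strip()] = value
    return values


def interpolate(compose_text, env):
    """Replace ${NAME} placeholders with values from env; unknown names stay as-is."""
    return Template(compose_text).safe_substitute(env)


env = parse_env(
    "JUPYTERHUB_PORT=80\n"
    "JUPYTERHUB_DATA=/home/ubuntu/jupyter_hub/data # update folder path\n"
)
print(interpolate("JUPYTERHUB_PORT: ${JUPYTERHUB_PORT}", env))
```

The point is only that the compose file stays generic while every machine-specific value lives in .env.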

Step-2: .env

✅This needs to be updated.

JUPYTERHUB_IP=127.0.0.1
JUPYTERHUB_PORT=80
JUPYTERHUB_DATA=/home/ubuntu/jupyter_hub/data # update folder path
JUPYTERHUB_USERS_HOME=/home/ubuntu/jupyter_hub/home # update folder path
ADMIN_USERS=leangaurav,leangaurav-me # update this to comma separated list of usernames
OAUTH_CALLBACK_URL=http://<public IP>:80/hub/oauth_callback
OAUTH_CLIENT_ID=
OAUTH_CLIENT_SECRET=

Below is a description of each field that needs editing, with the most important ones marked with **.

  1. JUPYTERHUB_PORT
    Change this to the open port. The same port needs to be used in the Github OAuth app's Authorization callback URL.
  2. JUPYTERHUB_DATA
    The location where session data etc. will get stored. This should be a directory on the host.
    The default value uses /home/ubuntu, which is the default user location on most Ubuntu based VM instances on AWS and GCP. The jupyter_hub/data folders will be created by docker.
    Note: keep the above directory separate from github repo clone
  3. JUPYTERHUB_USERS_HOME
    This is where each user’s home directory will be created for their respective jupyter notebook servers. Update based on your preference.
  4. ADMIN_USERS**
    Make sure to change this comma separated list of admin usernames. Since we will be using Github OAuth, I have it set to my github usernames by default.
  5. OAUTH_CALLBACK_URL
    Update the url with <public IP>:<port> of your server. Port must match JUPYTERHUB_PORT. We’ll need it while creating Github OAuth App.
  6. OAUTH_CLIENT_ID
    Set this to the Github OAuth App's Client ID. Follow this guide to create an OAuth app on Github. The following values need to be set:
    - Callback URL: from above, e.g. http://1.2.3.4:80/hub/oauth_callback
    - Home page: use the base URL, e.g. http://44.206.69.223:80
    After saving the App, copy the generated client secret and store it safely.
  7. OAUTH_CLIENT_SECRET
    Use value copied in above step.

Save the .env file after making above changes.
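If you want a quick sanity check of these values before starting anything, a small Python sketch like the one below can help. check_env is a hypothetical helper, not part of the repo; it assumes the .env values are already loaded into a dict and it encodes the Part-1 rule that the callback URL port must match JUPYTERHUB_PORT.

```python
from urllib.parse import urlsplit

REQUIRED = (
    "JUPYTERHUB_PORT",
    "ADMIN_USERS",
    "OAUTH_CALLBACK_URL",
    "OAUTH_CLIENT_ID",
    "OAUTH_CLIENT_SECRET",
)


def check_env(env):
    """Return a list of human-readable problems found in the .env values."""
    problems = [f"{key} is empty" for key in REQUIRED if not env.get(key)]
    url = urlsplit(env.get("OAUTH_CALLBACK_URL", ""))
    if url.path != "/hub/oauth_callback":
        problems.append("callback URL should end with /hub/oauth_callback")
    # An omitted port means the scheme default (80 for http, 443 for https).
    port = url.port or {"http": 80, "https": 443}.get(url.scheme)
    if env.get("JUPYTERHUB_PORT") and str(port) != env["JUPYTERHUB_PORT"]:
        problems.append("callback URL port does not match JUPYTERHUB_PORT")
    return problems


sample = {
    "JUPYTERHUB_PORT": "80",
    "ADMIN_USERS": "leangaurav",
    "OAUTH_CALLBACK_URL": "http://1.2.3.4:80/hub/oauth_callback",
    "OAUTH_CLIENT_ID": "",
    "OAUTH_CLIENT_SECRET": "",
}
for problem in check_env(sample):
    print(problem)
```

Note that the port rule only holds for Part-1. Once nginx sits in front in Part-2, the callback URL uses the domain over https while JupyterHub itself listens on a different local port.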

Also check this JupyterHub guide if you wish to configure some other OAuth options.

Step-3: requirements.txt

🆗This doesn’t require modifications.

If you wish to have your jupyter notebook servers pre-configured with some specific libraries/packages, add those packages here.

Step-4: jupyterhub_config.py

❌No modifications required here as everything is configured via .env file
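The repo's jupyterhub_config.py is not reproduced in this post, so the following is only a sketch of how a config file could turn those environment variables into JupyterHub settings. settings_from_env is a hypothetical helper and the fallback defaults are my assumptions; the option names in the comments (JupyterHub.bind_url, Authenticator.admin_users, and the GitHubOAuthenticator callback/client options) are standard JupyterHub and OAuthenticator settings.

```python
import os


def settings_from_env(env=os.environ):
    """Collect hub settings from environment variables (names taken from .env)."""
    ip = env.get("JUPYTERHUB_IP", "127.0.0.1")      # assumed fallback
    port = env.get("JUPYTERHUB_PORT", "8000")       # assumed fallback
    admins = {
        name.strip()
        for name in env.get("ADMIN_USERS", "").split(",")
        if name.strip()
    }
    return {
        "bind_url": f"http://{ip}:{port}",
        "admin_users": admins,
        "oauth_callback_url": env.get("OAUTH_CALLBACK_URL", ""),
        "client_id": env.get("OAUTH_CLIENT_ID", ""),
        "client_secret": env.get("OAUTH_CLIENT_SECRET", ""),
    }


# Inside a real jupyterhub_config.py, where JupyterHub provides get_config(),
# the values could be applied roughly like this (assumed, not copied from the repo):
#   c = get_config()
#   s = settings_from_env()
#   c.JupyterHub.bind_url = s["bind_url"]
#   c.Authenticator.admin_users = s["admin_users"]
#   c.GitHubOAuthenticator.oauth_callback_url = s["oauth_callback_url"]
#   c.GitHubOAuthenticator.client_id = s["client_id"]
#   c.GitHubOAuthenticator.client_secret = s["client_secret"]

demo = settings_from_env({"JUPYTERHUB_PORT": "80", "ADMIN_USERS": "leangaurav,leangaurav-me"})
print(demo["bind_url"], sorted(demo["admin_users"]))
```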

🚥Configuration is over and we can jump onto running and testing things.

In the root of the cloned folder, run
docker compose up --build -d

If things go well, you’ll see logs similar to this

[+] Building 0.4s (11/11) FINISHED
=> [internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from jupyterhub.Dockerfile 0.0s
=> => transferring dockerfile: 43B 0.0s
=> [internal] load metadata for docker.io/jupyterhub/jupyterhub:2.3.1 0.3s
=> [1/6] FROM docker.io/jupyterhub/jupyterhub:2.3.1@sha256:426e58ee0c520c25c26843a7f485613dea57aa150a21ffe5aa39ac18c6b0748d 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 118B 0.0s
=> CACHED [2/6] WORKDIR jupyter 0.0s
=> CACHED [3/6] COPY config/jupyterhub_config.py /srv/ 0.0s
=> CACHED [4/6] COPY config/requirements.txt . 0.0s
=> CACHED [5/6] RUN python3 -m pip install --upgrade pip 0.0s
=> CACHED [6/6] RUN pip install -r ./requirements.txt 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:fc12741ccad01b03429a7ceb023b5f6efd54b8f8723f869eaa11df363927fd25 0.0s
=> => naming to docker.io/library/jupyterhub_docker_jupyterhub 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
[+] Running 1/1
⠿ Container jupyterhub Started

Now head over to your browser and open <publicIP>:<port>. This should take you to the JupyterHub home page, with an https warning since the site is not served over https yet.

Hit Sign In and you’ll see below page:

If things didn’t go well, below options might help:

  1. If there’s an error during Auth, verify the OAuth URL, creds and related config in both .env file and on the Github OAuth App.
  2. Looking at the docker container logs can help: docker logs jupyterhub -f
  3. Try removing the container and restarting things. The following command does everything in one step: docker kill jupyterhub && docker container rm jupyterhub && docker compose up --build -d

Hit Launch Server, and you should see the familiar Jupyter notebooks home

Try to create a notebook using New button and run some 🐍 code !!

Cheers 🎉🎊.

Part-2: Moving to HTTPS

Here, I will refer to my handy guide. I’d recommend you to read the entire guide once before doing anything.

❗ Before doing anything else, kill your jupyterhub container if it is running on port 80. If it is running on any other port, there is no need to do anything. Also, if you opened ports other than 443 and 80 on the machine in Part-1 (like 8000), now is the time to close them.

You have 2 options:

  1. Follow all the steps mentioned in the repo’s Readme
  2. Follow the blog post/guide till Step-4. Don't do Step-4.1 as we'll do it now, and remember to do Step-5 at the end.

If you have set up the domain correctly and see the nginx 404 page, open the config/nginx.conf file again and paste this on top (remember to change the port 8000 to the port set in .env).

upstream jupyterhub_service {
    server 127.0.0.1:8000;
}

In the second server block for port 443 in the config file, paste the following under the ssl certificates part (no need to change anything)

ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_prefer_server_ciphers on;
ssl_ciphers 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA';
ssl_session_timeout 1d;
ssl_session_cache shared:SSL:50m;
ssl_stapling on;
ssl_stapling_verify on;
add_header Strict-Transport-Security max-age=15768000;
location / {
    proxy_pass http://jupyterhub_service;
    proxy_redirect off;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto $scheme;

    # websocket headers
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header X-Scheme $scheme;

    proxy_buffering off;
}

So your final nginx.conf will look something like this:

upstream jupyterhub_service {
    server 127.0.0.1:8000;
}

server {
    listen 80;
    listen [::]:80;
    server_name <your domain name>;
    location / {
        rewrite ^ https://$host$request_uri? permanent;
    }
    location ~ /.well-known/acme-challenge {
        allow all;
        root /tmp/acme_challenge;
    }
}

server {
    listen 443 ssl;
    listen [::]:443 ssl http2;
    server_name <your domain name>;
    ssl_certificate /etc/letsencrypt/live/<your domain name>/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/<your domain name>/privkey.pem;

    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;
    ssl_ciphers 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA';
    ssl_session_timeout 1d;
    ssl_session_cache shared:SSL:50m;
    ssl_stapling on;
    ssl_stapling_verify on;
    add_header Strict-Transport-Security max-age=15768000;
    location / {
        proxy_pass http://jupyterhub_service;
        proxy_redirect off;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;

        # websocket headers
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header X-Scheme $scheme;

        proxy_buffering off;
    }
}

Now we need a few more changes in our jupyterhub service:

  1. Make sure to update jupyterhub port in .env
    JUPYTERHUB_PORT=8000
  2. Update OAuth callback URL in .env. Remove both IP and port and add domain name. Notice we changed http to https.
    OAUTH_CALLBACK_URL=https://<your domain name>/hub/oauth_callback
  3. Go to Github, and change the OAuth Callback URL in your OAuth App as well.
  4. Go to docker-compose.yaml in the jupyterhub repo and uncomment the network section at the end.
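A mismatch between the port in the nginx upstream block and JUPYTERHUB_PORT in .env is an easy mistake to make here. The sketch below is one way to catch it; upstream_port is a hypothetical helper that uses a simple regular expression, not a full nginx config parser, and the config text is inlined only to keep the example self-contained.

```python
import re


def upstream_port(nginx_conf, name="jupyterhub_service"):
    """Return the port of the first server entry in the named upstream block, or None."""
    block = re.search(
        r"upstream\s+" + re.escape(name) + r"\s*\{(.*?)\}", nginx_conf, re.S
    )
    if not block:
        return None
    server = re.search(r"server\s+[^\s;:]+:(\d+)\s*;", block.group(1))
    return int(server.group(1)) if server else None


conf = """
upstream jupyterhub_service {
    server 127.0.0.1:8000;
}
"""
env_port = 8000  # the value of JUPYTERHUB_PORT in .env
print("ports match:", upstream_port(conf) == env_port)
```

On the server you would read config/nginx.conf from disk instead of using the inlined string.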

Now restart the two services:

  1. The jupyterhub server
    docker compose up --build -d jupyterhub
  2. The nginx container
    docker compose up --build -d nginx

Monitor the logs for both the services if required

docker logs jupyterhub -f
docker logs nginx-service -f

Now head over to your domain and you should see things working just like they did over http, this time without the https warning.
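If you would like to confirm from a script that the certificate is being served and see how long it is still valid, here is a small sketch using only Python's standard library. days_left and fetch_not_after are hypothetical helper names; the network call is left in a comment because it needs your real domain.

```python
import socket
import ssl


def days_left(not_after, now):
    """Whole days between `now` (epoch seconds) and a certificate's notAfter string."""
    return int((ssl.cert_time_to_seconds(not_after) - now) // 86400)


def fetch_not_after(host, port=443, timeout=5.0):
    """Open a verified TLS connection and return the certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Usage on a machine with network access (replace the placeholder domain):
#   import time
#   print(days_left(fetch_not_after("<your domain name>"), time.time()))
```

Because the default context verifies the certificate chain and hostname, a failing handshake here is itself a hint that the certificate setup needs another look.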

Congratulations 😀.

Learnt something new? Share it further.

Guides

Locating the notebook files on server

Since we have mounted a host volume inside the jupyterhub docker container, the notebooks live on the host. Head over to the folder set for the variable JUPYTERHUB_USERS_HOME in .env and you should see a folder for each user having a notebook server.
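The same lookup can be scripted. This is a minimal sketch; user_folders is a hypothetical helper, and the path is the default JUPYTERHUB_USERS_HOME from .env, so adjust it to your own value.

```python
from pathlib import Path


def user_folders(users_home):
    """Return the names of the per-user directories under the mounted home folder."""
    root = Path(users_home)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


# Default JUPYTERHUB_USERS_HOME from .env; prints [] if the folder does not exist.
print(user_folders("/home/ubuntu/jupyter_hub/home"))
```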

Checking if port is open using python

Run below command:
sudo python3 -m http.server 80
Now open http://<server public ip>:80 in browser. It should show a list of files. Make sure to not leave it running and stop it by pressing Ctrl + C.
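If you prefer to test from a script instead of the browser, the sketch below tries a plain TCP connection from your own machine. port_is_open is a hypothetical helper; it only returns True while something is actually listening on that port on the server, for example the http.server command above.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage from your own machine (replace the placeholder with the server's public IP):
#   print(port_is_open("<server public ip>", 80))
```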

Adding swap space

The steps for adding swap can vary by OS, vendor etc.
For Linux based systems see this Stack Overflow answer.

Adding more users to the JupyterHub Server

If you are admin, then click on Control Panel on the top right.

Then jump to Admin

And here you will see options to manage users

If using github OAuth, add github usernames to the new users list.

Troubleshooting

After sign-in, server fails to start

Look for errors in jupyterhub docker container logs.

If you see folder permission errors for the username in jupyterhub logs like this:

PermissionError: [Errno 13] Permission denied: '/home/leangaurav/.local'

Then consider deleting the old jupyter container and getting the container up again.

The server freezes and requires restart

This might be happening on small server instances with a low amount of RAM. To identify if this is a RAM issue, open a terminal on the server and run the htop command to view a live display of memory usage.

Now repeat the operation which causes the server to freeze. If you see RAM usage climbing before the server freezes, then consider adding Swap Space to your instance.
I always find this guide useful for Debian based systems https://www.cloudbooklet.com/how-to-add-swap-space-on-ubuntu-18-04-google-cloud/
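As a rough alternative to watching htop, the numbers can also be read from /proc/meminfo on Linux. The sketch below is only an illustration: parse_meminfo and needs_swap are hypothetical helpers, the sample text is made up, and the 200000 kB threshold is an arbitrary assumption rather than a recommendation.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo style content into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def needs_swap(info, min_available_kb=200_000):
    """True when no swap is configured and available memory is below the threshold."""
    return info.get("SwapTotal", 0) == 0 and info.get("MemAvailable", 0) < min_available_kb


# Made-up sample; on the server use: parse_meminfo(open("/proc/meminfo").read())
sample = "MemTotal: 987654 kB\nMemAvailable: 123456 kB\nSwapTotal: 0 kB\n"
print(needs_swap(parse_meminfo(sample)))
```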

After opening notebook: Unable to connect
Make sure you have added the websocket config in nginx config.
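A quick way to double check this is to search the config text for the three websocket related directives used in Part-2. missing_websocket_directives is a hypothetical helper based on plain regular expressions, so treat it as a rough check rather than a real nginx validator.

```python
import re

# Directives taken from the "websocket headers" part of the nginx config in Part-2.
WEBSOCKET_DIRECTIVES = {
    "proxy_http_version 1.1": r"proxy_http_version\s+1\.1\s*;",
    "Upgrade header": r"proxy_set_header\s+Upgrade\s+\$http_upgrade\s*;",
    "Connection header": r"proxy_set_header\s+Connection\s+[\"']?upgrade[\"']?\s*;",
}


def missing_websocket_directives(nginx_conf):
    """Return the names of websocket directives that are absent from the config text."""
    return [
        name
        for name, pattern in WEBSOCKET_DIRECTIVES.items()
        if not re.search(pattern, nginx_conf)
    ]


conf = """
location / {
    proxy_pass http://jupyterhub_service;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
}
"""
print(missing_websocket_directives(conf))
```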
