Quick Jupyterhub Setup: Docker + Nginx + https + Letsencrypt + AWS cloud
I had been using Jupyter Notebooks for a long time, but I often needed to access my notebooks from different machines. Running Jupyter on a server and doing reverse SSH was one option, but that doesn’t work well because every machine’s SSH key has to be whitelisted.
Later, I came across this project: JupyterHub and it looked like the perfect solution. But it wasn’t so easy to get it working for my purpose. After spending many days experimenting and reading docs, I was finally able to get things working the way I wanted.
I wanted the setup to have:
- A custom domain/subdomain
- Some way to authenticate and login to the server
- Allow creation of additional users, with their own notebook servers
- Run things via docker to allow less privilege for other user notebooks
- Bonus: minimal setup and easy cleanup 😀
The final setup looked like this:
- Host/Server: AWS EC2 instance using Ubuntu
- Letsencrypt for SSL certificates to enable https (not mandatory)
- Docker for running Jupyterhub, Nginx
- Nginx reverse proxy
- Jupyterhub
- Jupyter notebook
- Github OAuth for authentication and user creation
and nothing to be installed directly on the server, apart from docker 🙌🏻. So easy cleanup.
Some improvements to this post might come in future like: using docker spawner for running individual user notebooks instead of the default local process spawner in JupyterHub.
A big thanks to the JupyterHub team and all the contributors for creating such a wonderful thing.
Do take a look at the project here https://jupyter.org/hub
This guide goes step by step through each part of the JupyterHub setup. It should be doable in around 10 minutes if everything goes well 😉.
We’ll break down the entire thing into two parts:
- Running JupyterHub with Github OAuth without HTTPS
- Setting up traffic to be served over HTTPS if you wish to do that (Recommended)
Let’s get some stuff done ⚒👨🏻‍🏭💪🏻
Part-1: Setting up JupyterHub
Before going further into the setup, ensure you have the following:
- A server instance: the ability to SSH in and run commands with sudo access
- Public IP: note the public IP address of the server above
- Open ports 443 and 80 for incoming traffic:
  - 443: if HTTPS is required
  - 80: used while installing SSL certificates
  Other ports can work instead of 80, but if you plan to do the HTTPS setup, opening ports 80 and 443 is important.
  To check whether a port is really open, see the guide at the end.
- Docker installed on the server/host: follow the guides on Docker’s website. The guide for Ubuntu can be found here.
SSH to your server, clone this Github repo and go inside the cloned folder
git clone https://github.com/leangaurav/jupyterhub_docker.git
cd jupyterhub_docker
The folder structure looks like this:
├── .env
├── README.md
├── config
│   ├── jupyterhub_config.py
│   └── requirements.txt
├── docker
│   └── jupyterhub.Dockerfile
└── docker-compose.yaml
To view the files from the terminal, you can use the vi editor, for example:
vi docker-compose.yaml
Now let’s look at each of the files step by step and make the required changes:
Step-1: docker-compose.yaml
❌ No modifications required in this file.
version: "3.3"
services:
  jupyterhub:
    container_name: 'jupyterhub'
    build:
      context: .
      dockerfile: docker/jupyterhub.Dockerfile
    environment:
      PYTHONUNBUFFERED: 1
      JUPYTERHUB_IP: ${JUPYTERHUB_IP}
      JUPYTERHUB_PORT: ${JUPYTERHUB_PORT}
      JUPYTERHUB_CONTAINER_NAME: ${JUPYTERHUB_CONTAINER_NAME}
      JUPYTERHUB_DATA: ${JUPYTERHUB_DATA}
      ADMIN_USERS:
      OAUTH_CALLBACK_URL:
      OAUTH_CLIENT_ID:
      OAUTH_CLIENT_SECRET:
    volumes:
      - "${JUPYTERHUB_DATA}:${JUPYTERHUB_DATA}"
      - ${JUPYTERHUB_USERS_HOME}:/home
    network_mode: host
    restart: always
This file contains the configuration for the container that Docker will start. All the environment variables are specified here. All values prefixed with $ come from the .env file.
The file contains:
- volume bind mount: mounts a host directory to a directory inside container
- environment variables mapping
- network_mode: host → use host network
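To make the $ substitution concrete, here is a minimal Python sketch of the idea. It is only a simplified model of what Docker Compose does with the values from .env, not Compose’s actual implementation, and the helper name interpolate is made up for this illustration.

```python
from string import Template


def interpolate(value: str, env: dict) -> str:
    """Replace ${NAME} placeholders with values from env.

    Simplified model of Compose variable substitution (hypothetical helper).
    """
    return Template(value).substitute(env)


# Values shaped like the defaults in .env
env = {
    "JUPYTERHUB_DATA": "/home/ubuntu/jupyter_hub/data",
    "JUPYTERHUB_USERS_HOME": "/home/ubuntu/jupyter_hub/home",
}

# The two volume entries from docker-compose.yaml
data_mount = interpolate("${JUPYTERHUB_DATA}:${JUPYTERHUB_DATA}", env)
home_mount = interpolate("${JUPYTERHUB_USERS_HOME}:/home", env)

print(data_mount)  # host path and container path are the same
print(home_mount)  # host folder is mounted at /home inside the container
```

So the data folder appears at the same path inside the container, while the users’ home folder on the host becomes /home in the container.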
Step-2: .env
✅This needs to be updated.
JUPYTERHUB_IP=127.0.0.1
JUPYTERHUB_PORT=80
JUPYTERHUB_DATA=/home/ubuntu/jupyter_hub/data # update folder path
JUPYTERHUB_USERS_HOME=/home/ubuntu/jupyter_hub/home # update folder path
ADMIN_USERS=leangaurav,leangaurav-me # update this to comma separated list of usernames
OAUTH_CALLBACK_URL=http://<public IP>:80/hub/oauth_callback
OAUTH_CLIENT_ID=
OAUTH_CLIENT_SECRET=
Fields to be edited are highlighted in bold. Below is a description of each field, with a few important ones marked with **.
- JUPYTERHUB_PORT
  Change this to the open port. The same port needs to be used in the Github OAuth app’s Authorization callback URL.
- JUPYTERHUB_DATA
  The location where session data etc. will get stored. This should be a directory on the host.
  The default value uses /home/ubuntu, which is the default user location on most Ubuntu-based VM instances on AWS and GCP. The folders jupyter_hub/data will be created by Docker.
  Note: keep the above directory separate from the Github repo clone.
- JUPYTERHUB_USERS_HOME
  This is where each user’s home directory will be created for their respective Jupyter notebook servers. Update based on your preference.
- ADMIN_USERS**
  Make sure to change this comma-separated list of admin usernames. Since we will be using Github OAuth, I have it set to my Github username by default.
- OAUTH_CALLBACK_URL
  Update the URL with the <public IP>:<port> of your server. The port must match JUPYTERHUB_PORT. We’ll need it while creating the Github OAuth App.
- OAUTH_CLIENT_ID
  Set this to the Github OAuth App’s Client ID. Follow this guide to create an OAuth app on Github. The following values need to be set:
  - Callback URL: from above, e.g. http://1.2.3.4:80/hub/oauth_callback
  - Home page: use the base URL, e.g. http://44.206.69.223:80
  After saving the App, copy the generated secret and store it safely.
- OAUTH_CLIENT_SECRET
  Use the value copied in the above step.
Save the .env file after making the above changes.
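Since most Part-1 failures come from a mismatch between these fields, a quick sanity check can save a debugging round. Below is a minimal sketch in Python. The helpers parse_env and check_env are hypothetical (they are not part of the repo), and the parser only handles simple KEY=VALUE lines with optional trailing " # comment" text.

```python
from urllib.parse import urlparse

REQUIRED_KEYS = ("ADMIN_USERS", "OAUTH_CLIENT_ID", "OAUTH_CLIENT_SECRET")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blanks and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing " # comment", as used in the sample .env
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def check_env(values: dict) -> list:
    """Return a list of human-readable problems (empty list means OK)."""
    problems = []
    port = values.get("JUPYTERHUB_PORT", "")
    if not port.isdigit():
        problems.append("JUPYTERHUB_PORT is not a number")
    url = urlparse(values.get("OAUTH_CALLBACK_URL", ""))
    if url.path != "/hub/oauth_callback":
        problems.append("OAUTH_CALLBACK_URL should end with /hub/oauth_callback")
    # Only the plain-http setup serves the hub directly on JUPYTERHUB_PORT
    if url.scheme == "http" and port.isdigit():
        try:
            callback_port = url.port or 80
        except ValueError:
            callback_port = None
        if callback_port != int(port):
            problems.append("port in OAUTH_CALLBACK_URL does not match JUPYTERHUB_PORT")
    for key in REQUIRED_KEYS:
        if not values.get(key):
            problems.append(f"{key} is empty")
    return problems


# Made-up sample with three deliberate mistakes
sample = """
JUPYTERHUB_IP=127.0.0.1
JUPYTERHUB_PORT=8000
ADMIN_USERS=alice,bob # hypothetical usernames
OAUTH_CALLBACK_URL=http://1.2.3.4:80/hub/oauth_callback
OAUTH_CLIENT_ID=
OAUTH_CLIENT_SECRET=
"""
problems = check_env(parse_env(sample))
for problem in problems:
    print(problem)
```

On a real server you would pass the contents of your own .env file instead of the sample string.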
Also check this JupyterHub guide if you wish to configure some other OAuth options.
Step-3: requirements.txt
🆗This doesn’t require modifications.
If you wish to have your jupyter notebook servers pre-configured with some specific libraries/packages, add those packages here.
Step-4: jupyterhub_config.py
❌No modifications required here, as everything is configured via the .env file.
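If you are curious how a config file can pick everything up from the environment, here is a rough sketch. It is not the repo’s actual jupyterhub_config.py, which may be organised differently. The function hub_settings is a made-up helper that only collects the values. In a real config file they would be assigned on the c object that JupyterHub provides, for example c.JupyterHub.port or c.Authenticator.admin_users.

```python
import os


def hub_settings(env=os.environ) -> dict:
    """Collect JupyterHub settings from environment variables.

    Keys are named after the config traits they would be assigned to
    (e.g. "JupyterHub.port" would become c.JupyterHub.port).
    """
    admins = {
        name.strip()
        for name in env.get("ADMIN_USERS", "").split(",")
        if name.strip()
    }
    return {
        "JupyterHub.ip": env.get("JUPYTERHUB_IP", "127.0.0.1"),
        "JupyterHub.port": int(env.get("JUPYTERHUB_PORT", "8000")),
        "Authenticator.admin_users": admins,
        "GitHubOAuthenticator.oauth_callback_url": env.get("OAUTH_CALLBACK_URL", ""),
        "GitHubOAuthenticator.client_id": env.get("OAUTH_CLIENT_ID", ""),
        "GitHubOAuthenticator.client_secret": env.get("OAUTH_CLIENT_SECRET", ""),
    }


# Made-up environment, shaped like the variables passed in by docker-compose
settings = hub_settings({
    "JUPYTERHUB_IP": "127.0.0.1",
    "JUPYTERHUB_PORT": "80",
    "ADMIN_USERS": "alice, bob",
    "OAUTH_CALLBACK_URL": "http://1.2.3.4:80/hub/oauth_callback",
})
for name, value in settings.items():
    print(name, "=", value)
```

The point of this pattern is that the image never has to be rebuilt for a settings change: edit .env, restart the container, done.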
🚥Configuration is over and we can jump onto running and testing things.
In the root of the cloned folder, run docker compose up --build -d
If things go well, you’ll see logs similar to this
[+] Building 0.4s (11/11) FINISHED
=> [internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from jupyterhub.Dockerfile 0.0s
=> => transferring dockerfile: 43B 0.0s
=> [internal] load metadata for docker.io/jupyterhub/jupyterhub:2.3.1 0.3s
=> [1/6] FROM docker.io/jupyterhub/jupyterhub:2.3.1@sha256:426e58ee0c520c25c26843a7f485613dea57aa150a21ffe5aa39ac18c6b0748d 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 118B 0.0s
=> CACHED [2/6] WORKDIR jupyter 0.0s
=> CACHED [3/6] COPY config/jupyterhub_config.py /srv/ 0.0s
=> CACHED [4/6] COPY config/requirements.txt . 0.0s
=> CACHED [5/6] RUN python3 -m pip install --upgrade pip 0.0s
=> CACHED [6/6] RUN pip install -r ./requirements.txt 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:fc12741ccad01b03429a7ceb023b5f6efd54b8f8723f869eaa11df363927fd25 0.0s
=> => naming to docker.io/library/jupyterhub_docker_jupyterhub 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
[+] Running 1/1
⠿ Container jupyterhub Started
Now head over to your browser and open <publicIP>:<port>. This should take you to the JupyterHub home page with an https warning.
Hit Sign In and you’ll see below page:
If things didn’t go well, the options below might help:
- If there’s an error during Auth, verify the OAuth URL, creds and related config in both the .env file and the Github OAuth App.
- Looking up the docker container logs can help:
  docker logs jupyterhub -f
- Try removing the container and restarting things. The following command does everything in one step:
  docker kill jupyterhub && docker container rm jupyterhub && docker compose up --build -d
Hit Launch Server, and you should see the familiar Jupyter notebooks home
Try to create a notebook using New button and run some 🐍 code !!
Cheers 🎉🎊.
Part-2: Moving to HTTPS
Here, I will refer to my handy guide. I’d recommend you to read the entire guide once before doing anything.
❗ Before doing anything else, first kill your jupyterhub container. No need to do anything if it is running on a port other than 80. Also, if in Part-1 you opened up ports other than 443 and 80 on the machine (like 8000), now is the time to close them.
You have 2 options:
- Follow all the steps mentioned in the repo’s Readme
- Follow the blog post/guide till Step-4. Don’t do Step-4.1, as we’ll do it now, and remember to do Step-5 at the end.
If you have set up the domain correctly and see the nginx 404 page, open the config/nginx.conf file again and paste this at the top (remember to change the port 8000 to the port set in .env).
upstream jupyterhub_service {
    server 127.0.0.1:8000;
}
In the second server block for port 443 in the config file, paste the following under the ssl certificates part (no need to change anything)
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_prefer_server_ciphers on;
ssl_ciphers 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA';
ssl_session_timeout 1d;
ssl_session_cache shared:SSL:50m;
ssl_stapling on;
ssl_stapling_verify on;
add_header Strict-Transport-Security max-age=15768000;
location / {
    proxy_pass http://jupyterhub_service;
    proxy_redirect off;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto $scheme;
    # websocket headers
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header X-Scheme $scheme;
    proxy_buffering off;
}
So your final nginx.conf will look something like this:
upstream jupyterhub_service {
    server 127.0.0.1:8000;
}
server {
    listen 80;
    listen [::]:80;
    server_name <your domain name>;
    location / {
        rewrite ^ https://$host$request_uri? permanent;
    }
    location ~ /.well-known/acme-challenge {
        allow all;
        root /tmp/acme_challenge;
    }
}
server {
    listen 443 ssl;
    listen [::]:443 ssl http2;
    server_name <your domain name>;
    ssl_certificate /etc/letsencrypt/live/<your domain name>/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/<your domain name>/privkey.pem;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;
    ssl_ciphers 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA';
    ssl_session_timeout 1d;
    ssl_session_cache shared:SSL:50m;
    ssl_stapling on;
    ssl_stapling_verify on;
    add_header Strict-Transport-Security max-age=15768000;
    location / {
        proxy_pass http://jupyterhub_service;
        proxy_redirect off;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
        # websocket headers
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header X-Scheme $scheme;
        proxy_buffering off;
    }
}
Now we need a few more changes in our jupyterhub service:
- Make sure to update the jupyterhub port in .env:
  JUPYTERHUB_PORT=8000
- Update the OAuth callback URL in .env. Remove both the IP and port and add the domain name. Notice we changed http to https.
  OAUTH_CALLBACK_URL=https://<your domain name>/hub/oauth_callback
- Go to Github and change the OAuth callback URL in your OAuth App as well.
- Go to docker-compose.yaml in the jupyterhub repo and uncomment the network section at the end.
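The two .env edits above are small, but easy to get subtly wrong (a leftover port, http instead of https). As an illustration, here is a minimal sketch of the same change done in Python. The function switch_to_https and the domain hub.example.com are made up for this example, and the sketch assumes .env lines without trailing comments on the two keys it rewrites.

```python
def switch_to_https(env_text: str, domain: str, hub_port: int = 8000) -> str:
    """Return .env text with the hub port and OAuth callback URL updated."""
    lines = []
    for line in env_text.splitlines():
        if line.startswith("JUPYTERHUB_PORT="):
            # nginx now owns ports 80/443, so the hub moves to an internal port
            line = f"JUPYTERHUB_PORT={hub_port}"
        elif line.startswith("OAUTH_CALLBACK_URL="):
            # No IP and no port any more: domain name over https
            line = f"OAUTH_CALLBACK_URL=https://{domain}/hub/oauth_callback"
        lines.append(line)
    return "\n".join(lines) + "\n"


before = (
    "JUPYTERHUB_PORT=80\n"
    "OAUTH_CALLBACK_URL=http://1.2.3.4:80/hub/oauth_callback\n"
    "OAUTH_CLIENT_ID=abc\n"
)
after = switch_to_https(before, "hub.example.com")
print(after)
```

Whatever value ends up in OAUTH_CALLBACK_URL must be identical to the callback URL saved in the Github OAuth App.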
Now restart the two services:
- The jupyterhub server
docker compose up --build -d jupyterhub
- The nginx container
docker compose up --build -d nginx
Monitor the logs for both the services if required:
docker logs jupyterhub -f
docker logs nginx-service -f
Now head over to your domain and you should see things working just like they did over http, without the http warning.
Congratulations 😀.
Learnt something new? Share it further.
Guides
Locating the notebook files on server
Since we have mounted a host volume inside the jupyterhub docker container, just head over to the folder set for the variable JUPYTERHUB_USERS_HOME in .env. You should see a folder for each user that has a notebook server.
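If you prefer a quick overview instead of browsing the folder by hand, something like the sketch below works. The helper names are made up, and the demo builds a throwaway directory so it can run anywhere. On the server you would pass your own JUPYTERHUB_USERS_HOME path instead.

```python
import tempfile
from pathlib import Path


def list_user_homes(users_home: str) -> list:
    """Return the names of the per-user folders, sorted alphabetically."""
    root = Path(users_home)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def count_notebooks(users_home: str, user: str) -> int:
    """Count .ipynb files anywhere under one user's folder."""
    return sum(1 for _ in (Path(users_home) / user).rglob("*.ipynb"))


# Demo on a temporary folder that mimics JUPYTERHUB_USERS_HOME
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "alice").mkdir()
    (Path(tmp) / "bob").mkdir()
    (Path(tmp) / "alice" / "demo.ipynb").write_text("{}")
    users = list_user_homes(tmp)
    alice_notebooks = count_notebooks(tmp, "alice")

print(users, alice_notebooks)
```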
Checking if port is open using python
Run the command below:
sudo python3 -m http.server 80
Now open http://<server public ip>:80 in a browser. It should show a list of files. Make sure not to leave it running; stop it by pressing Ctrl + C.
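The same check can be done without a browser. Here is a minimal sketch using a plain TCP connection from your own machine. The function port_is_open is a made-up helper, and the demo runs against a local listener so it works anywhere. For the real check you would call it with your server’s public IP and port while the http.server command is running.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


# Demo: open a listener on a free local port, then close it again
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
local_port = listener.getsockname()[1]

open_result = port_is_open("127.0.0.1", local_port)
listener.close()
closed_result = port_is_open("127.0.0.1", local_port, timeout=1.0)

print(open_result, closed_result)
```

A False result against the server means either nothing is listening on that port or a firewall or security group is blocking it.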
Adding swap space
Adding swap can vary by OS, Vendor etc.
For Linux based systems see this stackoverflow answer.
Adding more users to the JupyterHub Server
If you are an admin, click on Control Panel on the top right.
Then jump to Admin.
Here you will see options to manage users.
If using Github OAuth, add Github usernames to the new users list.
Troubleshooting
After sign-in, server fails to start
Look for errors in the jupyterhub docker container logs.
If you see folder permission errors for the username in jupyterhub logs like this:
PermissionError: [Errno 13] Permission denied: '/home/leangaurav/.local'
Then consider deleting the old jupyter container and getting the container up again.
The server freezes and requires restart
This might happen on small server instances with a low amount of RAM. To identify whether this is a RAM issue, open a terminal on the server and run the htop command to view a live display of memory usage.
Now repeat the operation which causes the server to freeze. If you see RAM usage climbing before the server freezes, consider adding swap space to your instance.
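If htop is not installed, the same numbers can be read from /proc/meminfo on Linux. Here is a small sketch. The helper names are made up and the sample text is invented so the example runs anywhere. On the server you would feed it open("/proc/meminfo").read() instead.

```python
def parse_meminfo(text: str) -> dict:
    """Return /proc/meminfo values in kB, keyed by field name."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def memory_summary(info: dict) -> dict:
    """Summarise RAM usage and whether any swap is configured."""
    total = info.get("MemTotal", 0)
    available = info.get("MemAvailable", 0)
    used_percent = round(100 * (total - available) / total, 1) if total else 0.0
    swap_total = info.get("SwapTotal", 0)
    return {"used_percent": used_percent, "has_swap": swap_total > 0}


# Invented sample in the /proc/meminfo layout
sample = """MemTotal:        1000000 kB
MemFree:          100000 kB
MemAvailable:     250000 kB
SwapTotal:             0 kB
SwapFree:              0 kB
"""
summary = memory_summary(parse_meminfo(sample))
print(summary)
```

High RAM usage combined with has_swap being False is the situation where adding swap is most likely to help.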
I always find this guide useful for debian systems https://www.cloudbooklet.com/how-to-add-swap-space-on-ubuntu-18-04-google-cloud/
After opening notebook: Unable to connect
Make sure you have added the websocket config in nginx config.
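A quick way to double-check this is to scan your nginx.conf for the three websocket-related directives used in this post. The sketch below is a plain text search, not a real nginx config parser, and the helper name is made up. It also assumes the Connection header is set to the literal value "upgrade" as shown earlier, so a config that sets it through a map variable would be reported as missing.

```python
import re

# Directives from the "websocket headers" part of the location block
REQUIRED = {
    "proxy_http_version 1.1": r"proxy_http_version\s+1\.1\s*;",
    "Upgrade header": r"proxy_set_header\s+Upgrade\s+\$http_upgrade\s*;",
    "Connection header": r'proxy_set_header\s+Connection\s+"?upgrade"?\s*;',
}


def missing_websocket_directives(conf_text: str) -> list:
    """Return the names of required directives not found in the config text."""
    # Strip comments so commented-out directives do not count
    text = re.sub(r"#.*", "", conf_text)
    return [name for name, pattern in REQUIRED.items() if not re.search(pattern, text)]


good = """
location / {
    proxy_pass http://jupyterhub_service;
    # websocket headers
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
"""
bad = """
location / {
    proxy_pass http://jupyterhub_service;
    # proxy_http_version 1.1;
}
"""
missing_good = missing_websocket_directives(good)
missing_bad = missing_websocket_directives(bad)
print(missing_good)
print(missing_bad)
```

After fixing the config, remember to restart the nginx container so the change takes effect.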