Jupyter and JupyterHub
Jupyter Notebooks are executable documents containing: formatted text, formatted maths, chunks of code plus figures, tables and textual output generated by that code.
Notebooks can be used:
to develop and execute linear data analysis workflows;
to present linear workflows to others;
as runnable documentation for software packages (e.g. NetworkX examples);
as executable teaching materials (e.g. BAD days tutorials).
Jupyter itself is a web application that interprets, runs and renders Notebooks. Because you interact with it simply by connecting from your web browser, the Jupyter server software can be running on your local machine or on a remote server (which may have more memory, CPU cores and/or GPUs than your local machine).
With JupyterHub, users of a cluster can:
Log in to the JupyterHub web interface,
Specify what resources (memory, CPU cores, GPUs) they want for a Jupyter session,
Start and run a Jupyter Notebook server on a worker node in the cluster using these resources.
Using Jupyter on ShARC
- 1. Connecting to JupyterHub, requesting resources (RAM, processors, GPUs) for your Jupyter session, and starting your session
- 2. Jupyter file browser
- 3. Terminal in your browser
- 4. Programming languages, software packages and execution environments
- 5. Creating, editing and running Jupyter Notebooks
- 6. Monitoring and controlling your Jupyter session
- 7. Errors and troubleshooting
Status and maintenance of ShARC’s JupyterHub service
This service is currently experimental. If you use this service and encounter a problem, please contact email@example.com.
The server that provides the JupyterHub service is typically rebooted at 03:26 on the 2nd Tuesday of the month to install security updates.
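To check whether a long-running session might overlap with the monthly reboot, the date of the second Tuesday can be computed with the Python standard library. This is only a sketch: the function names are made up for illustration, and the reboot schedule is described above as typical rather than guaranteed.

```python
import calendar
from datetime import date, datetime, time


def second_tuesday(year, month):
    """Return the date of the second Tuesday of the given month."""
    # monthrange() returns the weekday of the 1st of the month (Monday == 0).
    first_weekday, _ = calendar.monthrange(year, month)
    # Number of days from the 1st to the first Tuesday of the month.
    offset = (calendar.TUESDAY - first_weekday) % 7
    # The second Tuesday is one week after the first.
    return date(year, month, 1 + offset + 7)


def typical_reboot_time(year, month):
    """Combine the second Tuesday with the usual 03:26 reboot time."""
    return datetime.combine(second_tuesday(year, month), time(3, 26))


print(typical_reboot_time(2020, 3))  # 2020-03-10 03:26:00
```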
JupyterHub on a Grid Engine cluster: internal workings
The hub of JupyterHub has several components:
an authenticator that allows users to log in, possibly using externally-stored identity information;
a database of user and state information;
a spawner that can start single-user Jupyter Notebook servers on demand.
There is also a web proxy. It first routes web connections from a given user to the hub for authentication and, possibly, for choosing spawner options. Then, after a single-user Jupyter server has been spawned, certain web connections are forwarded to that Jupyter Notebook server. From the user’s perspective it appears that they are interacting with a single web application, even though at times they might be talking to a single-user Jupyter server that is running on a different machine to the hub.
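The proxy's behaviour can be pictured as a routing table keyed on URL prefix, where the most specific matching prefix wins. The following is a minimal sketch of that idea and not the proxy's actual code: the `route` function, the user name `alice` and the node address `http://node001:8888` are all hypothetical.

```python
def route(path, routes):
    """Pick the backend whose URL prefix is the longest match for `path`."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        raise LookupError(f"no route for {path!r}")
    return routes[max(matches, key=len)]


# Before any single-user server exists, every request goes to the hub.
routes = {"/": "hub"}
print(route("/hub/login", routes))                     # hub

# After a server has been spawned for the (hypothetical) user "alice",
# a more specific route to her worker node is added to the table.
routes["/user/alice/"] = "http://node001:8888"         # hypothetical address
print(route("/user/alice/notebooks/a.ipynb", routes))  # http://node001:8888

# A user without a running server still falls through to the hub.
print(route("/user/bob/tree", routes))                 # hub
```

Because only the table changes when a server is spawned, the browser keeps talking to the same address throughout, which is why the whole arrangement looks like one web application.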
ShARC uses a custom spawner, sgespawner, that spawns single-user Jupyter servers on one or more worker nodes on ShARC by submitting batch jobs to the Grid Engine job scheduler.
The JupyterHub and sgespawner configuration allows the user to specify the Grid Engine resources required for the Jupyter session in advance via a web form; these resources are then requested as part of the batch job submission.
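As a rough illustration of how values from such a web form could become a batch job submission, the sketch below builds a `qsub` command line. It is not taken from sgespawner: the function name, the script name `jupyter_session.sh`, the `h_vmem` and `gpu` resource names and the `smp` parallel environment are assumptions, and the names actually available depend on how a given Grid Engine cluster is configured.

```python
def build_qsub_command(memory_gb, cores=1, gpus=0, script="jupyter_session.sh"):
    """Build a qsub argument list from the resources chosen in a web form.

    Resource and parallel environment names are site-specific; the ones
    used here (h_vmem, smp, gpu) are assumptions for illustration only.
    """
    if memory_gb < 1 or cores < 1 or gpus < 0:
        raise ValueError("resource requests must be positive")

    # -l requests a named resource, here memory for the job.
    args = ["qsub", "-l", f"h_vmem={memory_gb}G"]
    if cores > 1:
        # -pe requests slots in a parallel environment for multi-core jobs.
        args += ["-pe", "smp", str(cores)]
    if gpus > 0:
        args += ["-l", f"gpu={gpus}"]
    # The job script (hypothetical) would start the single-user server.
    args.append(script)
    return args


print(" ".join(build_qsub_command(8, cores=4, gpus=1)))
# qsub -l h_vmem=8G -pe smp 4 -l gpu=1 jupyter_session.sh
```

Returning a list of arguments rather than a single string means the command can be passed to `subprocess.run` without shell quoting problems.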
Further details of how JupyterHub and sgespawner are configured on ShARC can be found in this repository.
The JupyterHub service on ShARC is currently developed and maintained by the University’s Research Software Engineering team. This work has been funded by OpenDreamKit, a Horizon2020 European Research Infrastructure project (676541) that aims to advance the open source computational mathematics ecosystem.