JupyterHub is an excellent platform for shared computational environments, and Dask allows researchers to scale computations past the limits of their laptops. Unfortunately, deploying and maintaining a Dask+JupyterHub cluster for a team is a difficult task. We demonstrate QHub, a new open source project that automates the deployment and day-to-day maintenance of JupyterHub on multiple clouds.
In this talk, we will demonstrate QHub, an open source project that Quansight is developing. QHub uses infrastructure as code to automate the deployment and day-to-day maintenance of JupyterHub on cloud providers such as AWS, GCP, and Digital Ocean.
Zero to JupyterHub is a tremendous resource that served as our initial inspiration for deploying data science compute environments. We have since deployed JupyterHub compute environments both internally and for several customers, learning from each deployment. The goal of this work is to give small institutions and groups the ability to provision and manage a JupyterHub cluster in a cost-effective manner. This cluster provides teams a scalable Dask compute environment without requiring significant sysadmin or DevOps experience.
Prior to this work, Zero to JupyterHub was, in our view, the best resource for deploying JupyterHub on several cloud providers. In our experience, however, its manual steps make it especially hard to maintain multiple environments (such as development and production), and new features become slower to implement and document. The components must be provisioned in a specific order, and this tedium encourages granting overly permissive credentials to each user. All of this consumes developer time; we wish to reduce that burden and empower users less familiar with cloud providers and their tooling to make infrastructure and development environment changes.
The core feature of QHub is the use of Terraform, Helm, and GitHub Actions to provide automated infrastructure as code with continuous deployment to AWS, GCP, and Digital Ocean. Because the infrastructure state is reflected in a repository, the infrastructure is self-documenting, and team members can submit pull requests that are reviewed before the infrastructure is modified. In addition to users controlling their own local conda environments, we have found this model especially helpful when a team member would like a new package made available system-wide for all users. Cloud providers currently integrate well with Kubernetes and provide auto-scaling. Our project takes advantage of this by allowing users to request on-demand Dask clusters. These clusters can be scheduled on arbitrary compute resources, such as high-memory, high-CPU, and GPU instances, in a cost-effective manner using Kubernetes node groups. Additional features for teams include shared file systems between users, configurable conda environments, and credential management for each user. Ultimately, this tool will enable the deployment of a scalable, cloud-agnostic compute environment suitable for teams.
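From a user's perspective, requesting an on-demand Dask cluster could look like the following sketch. It assumes Dask Gateway as the provisioning layer; the abstract does not name the exact mechanism QHub uses, so the connection defaults and option names here are illustrative, not the exact QHub configuration.

```python
# Hypothetical sketch: a user requesting an on-demand Dask cluster
# from inside a JupyterHub session. Assumes Dask Gateway as the
# provisioning layer (an assumption; the exact mechanism and option
# names may differ in QHub).
from dask_gateway import Gateway

gateway = Gateway()                      # connect using the environment's defaults
options = gateway.cluster_options()      # e.g. choose a high-memory or GPU node group
cluster = gateway.new_cluster(options)   # Kubernetes schedules workers onto a node group
cluster.adapt(minimum=1, maximum=10)     # autoscale worker count with the workload
client = cluster.get_client()            # run Dask computations through this client
```

Because the workers land on dedicated Kubernetes node groups, the cloud provider's auto-scaler can add instances only while the cluster is in use and release them afterward, which is what keeps the on-demand model cost-effective.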