JupyterCon 2023

Per-cell Remote Execution on Jupyter
05-11, 16:00–16:10 (Europe/Paris), Poster Placeholder

Audience level: Novice
- Everyone who knows Project Jupyter

Introduction

JupyterLab is an IDE that is loved by many in the fields of data science and machine learning. Jupyter provides an outstanding, interactive feature that allows the REPL based execution and review of cell-level code, and facilitates data exploration and machine learning experiments. It is used by many including students and experts who apply Jupyter for their work.
Data science and machine learning code generally require large amounts of computing. Operating these code on personal laptops or local environments may require excessive amounts of time or fail to run successfully due to a memory shortage. These issues can be resolved by installing JupyterLab on a high computing power workstation and and access it via port forwarding, or deploying it on a Kubernetes cluster. Using a remote workstation’s JupyterHub or JupyterLab can lead to issues on the shared resources. If the IPython kernel connected to the Jupyter notebook is not terminated, resources, such as the memory and the GPU, will not be returned. This means that other users of the workstation will not be able to use those resources when they need to.
We thought of new ways to execute code remotely on JupyterLab while avoiding these issues. We were able to implement a remote execution feature that allows codes to run on remote environments per the user request. Link allows each pipeline component (i.e. each Jupyter cell) to run either locally or on a designated remote environment. Moreover, the resources used for the execution is returned automatically, leading to a more efficient shared resource management. In next section of this note, we will explain the design of Link’s remote execution feature.

Remote exectution on Link

Link pipeline consists of one or more components, and each component corresponds to one Jupyter cell. Each component has properties, and properties contain information from the local or remote environment. Depending on the execution information, components can be executed in an independent environment. Link executes code in a specific environment according to the user request and returns the resources used. As a result, users can efficiently use and manage the shared resources of workstation.

Figure 1: Design of per-cell remote execution

Per-cell remote execution is designed and composed of a message queue, data store, and remote worker as shown in Figure-1. Local Link and remote Link workers communicate with each other through message queue and data storage. The message queue manages running tasks, and the storage stores data such as code and objects. Remote execution of each component operates in the following process.

  1. Serialize and transfer the selected cell’s code and parent cells’ data to the remote worker via the message queue and data storage.
  2. Remote worker receives the task from the message queue, deserializes the code and data from the data storage and executes the code.
  3. Execution results and the output data is serialized and transferred to the local environment via the message queue and the data storage.
  4. The local environment receives the results from the message queue and imports the output data from the data storage

Figure 2: Add a remote worker

Figure 3: Select components to execute remotely

Link can connect to a remote worker using the message queue and data storage access information. Users can register with an easy-to-understand alias. After successfully connecting to the remote worker, users can select certain components to run on this worker, and the selected components will be executed remotely. This information is available even when the computer is turned off and on again, even after several days.

Summary

JupyterLab is an IDE loved by many developers ranging from junior students to experts in the fields of data science and machine learning. Data science and machine learning code require large amounts of computing, and executing these codes in individual local environments may require a lot of time or may fail due to a lack of memory. These issues can be overcome by installing and JupyterLab on a high compute workstation and utilizing that environment. However, using a remote JupyterLab may lead to shared resources not being returned correctly, leading to problems in using these shared resources among different users. In order to avoid these problems, we have implemented the remote feature to run only parts of the code on the remote environment, as requested by the user. Link allows user to designate and run respective components (i.e. cells) on either local or remote environments. Link enhances efficiency even further by automatically returning the shared resources upon the execution of the code.

A software engineer who does something meaningful
Work at MakinaRocks where we develop MLOps products called "Link" and "Runway"

I'm eager to enable machine learning to have a real-world impact.

GitHub: https://github.com/hunhoon21

This speaker also appears in: