JupyterCon 2023

Distributed Data Science for Humans with Dask
05-11, 14:00–14:30 (Europe/Paris), Gaston Berger

Distributed computing is great! Unfortunately, distributed computing is also hard and often heavyweight. This friction gets in the way of the human+computer joint data exploration process that we value so dearly in the Jupyter ecosystem.

Dask is a popular library for parallel and distributed computing in Python that was developed alongside Jupyter with human interaction in mind. In this talk we'll discuss Dask in the context of interactive data science, highlighting the ways in which Dask and Jupyter leverage each other to achieve a powerful and scalable user experience that fits easily into your hand. In particular, we'll highlight rich notebook outputs, JupyterLab dashboard extensions, and JupyterHub deployment integrations, and show how leveraging the extensibility of Jupyter can result in a first-class open source experience.

Matthew is an open source software developer in the numeric Python ecosystem. He maintains several PyData libraries, but today focuses mostly on Dask, a library for scalable computing. Matthew worked for Anaconda Inc. for several years, then built out the Dask team at NVIDIA for RAPIDS, and most recently founded Coiled to improve Python's scalability with Dask for large organizations.
