JupyterCon 2023

Visualizing live data pipelines in JupyterLab with ipydagred3
05-10, 14:00–14:55 (Europe/Paris), Room 1

Dataflow graphs have become indispensable tools for data science, from ETL batch processes to live streaming data pipelines. A variety of tools exist for constructing and scheduling graphs, but few generic tools exist for visualizing them, and even fewer let you analyze and interact with them from inside a notebook.

In this talk, we will discuss ipydagred3, a Jupyter Widgets wrapper around the dagre graph layout engine and the popular charting framework D3.js. We will use this framework to visualize a variety of static graphs, then interact with these graphs by pushing mutations from both Python and JavaScript. We will then build a real-world example using a popular streaming dataflow framework, and show how ipydagred3 can integrate with it to provide a performant, intuitive interface to the underlying graph engine.

Audience: Jupyter - Novice. Familiarity with any graph engine (e.g. Apache Airflow, Dask, networkx) is recommended.

I am a quantitative developer at Cubist Systematic Strategies and an adjunct professor in the Computer Science Department at Columbia University. My background is primarily in application development with a focus on streaming analytics. I have been involved in the formation of several corporate open source efforts, and am a proud member and maintainer of open source projects in the FINOS, JupyterLab, and Conda Forge organizations.