Thursday, Oct. 15, 2020, 4:30 p.m.–5 p.m. in Enterprise Jupyter Infrastructure

NotebookOps: A pattern for building notebook-centric data platforms

Vinayak Mehta

Audience level:
Intermediate

Brief Summary

In 2018, Netflix and PayPal wrote about how they set up powerful data platforms centered around Jupyter notebooks. This talk will look at the open-source components required for building such data platforms, illustrate how they tie together, and reflect on lessons learned from setting up a notebook-centric data platform at one of India's largest online grocery delivery companies.

Outline

Over the past few years, we've seen large organizations adopt Jupyter at scale to set up their own internal DIY ("do it yourself") notebook infrastructure for analytics. In 2018, Netflix and PayPal wrote about how they set up powerful data platforms centered around Jupyter notebooks to fuel experimentation and innovation at scale. In this talk, we'll look at the components required for building a notebook-centric data platform and the open-source tools involved, understand how the components tie together, and reflect on lessons learned from setting up such a platform at one of India's largest online grocery delivery companies.

This talk is aimed at data engineers, but it's also relevant to data analysts and data scientists. Basic knowledge of Python, Jupyter notebooks, and the Jupyter ecosystem will be useful but is not required. After this talk, the audience will understand the open-source components that can help them build notebook-centric data platforms.

Outline:

The 2018 blog posts by Netflix and PayPal: