At PayPal, notebooks are used as a unified platform for both data and ML workflows. This talk covers how MLflow is integrated with Jupyter notebooks to provide model tracking, versioning, deployment, and serving, giving users a data science workbench experience. We will also talk about distributed training as a way to get better GPU resource utilization and faster model training.
PayPal Notebooks, powered by Jupyter, is a major ecosystem for data analytics, data science, machine learning, and exploration at PayPal, with kernels, magics, and utilities for analytics and engineering. PayPal uses Jupyter notebooks integrated with MLflow to provide a data science workbench experience, enabling model and experiment tracking, registry, sharing, version management, deployment, and serving. We also enabled distributed model training on GPUs to make better use of resources and speed up model training. Jun Hua and Hariraj Sundaravadivelu will explain how they built the integration between Notebooks and MLflow, along with other key capabilities like distributed training, by abstracting away the underlying implementation to give data scientists a seamless experience.
Prerequisite knowledge: A basic understanding of the Jupyter ecosystem, Docker, Kubernetes, and GPUs.