The proposal aims to teach users how to convert a simple Jupyter notebook into a Kedro pipeline. Most data science and data engineering models are built in Jupyter notebooks. While notebooks are great for exploratory work, they are challenging to debug and difficult to maintain. When we want to scale analytics, we use Kedro. Kedro is an open-source Python framework that applies software engineering best practices to data and machine-learning pipelines. It helps data scientists structure an exploratory data science workflow while producing production-ready code. Some users describe it as the React or Django of data science.
In this talk, Tam will show you how to transform a Jupyter notebook into a Kedro pipeline. He will convert functions defined in a Jupyter notebook, like the one from the [Berkeley Natural History Museum](https://github.com/BNHM/jupyter/blob/master/IB105-2017-BNHM.ipynb), into modular pipelines showcasing preprocessing workflows. Tam will guide you through Kedro's powerful features, such as modular programming, extracting notebook cell functions as pipeline tasks, and data abstraction. This talk is designed for data scientists and data engineers at the novice level; no prior knowledge of Kedro is required to enjoy it.
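To give a flavour of the refactoring the talk covers, here is a minimal, hypothetical sketch in plain Python. The function names and the toy records are illustrative, not from the museum notebook: each ad-hoc notebook cell becomes a named function with explicit inputs and outputs, which is exactly the shape Kedro can then wrap as pipeline nodes.

```python
# Hypothetical sketch: turning notebook-style cells into named functions
# with explicit inputs and outputs -- the form Kedro expects before each
# function is registered as a pipeline node.

def clean_records(raw_records):
    """Drop records missing a species name (a typical preprocessing cell)."""
    return [r for r in raw_records if r.get("species")]

def count_by_species(records):
    """Aggregate observation counts per species (a typical analysis cell)."""
    counts = {}
    for r in records:
        counts[r["species"]] = counts.get(r["species"], 0) + 1
    return counts

# In a notebook these steps run as loose cells sharing global state; as
# functions they chain explicitly, can be unit-tested, and could later be
# wrapped as Kedro nodes, e.g.
#   node(clean_records, inputs="raw_records", outputs="clean_records")
raw = [{"species": "Ursus arctos"}, {"species": ""}, {"species": "Ursus arctos"}]
print(count_by_species(clean_records(raw)))  # {'Ursus arctos': 2}
```

The key design point the talk elaborates on is that once cells are functions with declared inputs and outputs, Kedro can resolve their execution order and handle data loading and saving through its catalog, instead of relying on notebook cell order.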