Thursday Oct. 15, 2020, 5:15 p.m.–Oct. 15, 2020, 5:45 p.m. in Data Science Applications

High performance Jupyter: faster workloads with Dask and RAPIDS

Aaron Richter

Audience level:
Intermediate

Brief Summary

What happens when you run out of processing power on your laptop? You can scale up (get more powerful hardware) or scale out (add more machines). Whichever you choose, there are great tools for scaling within the Jupyter ecosystem. This talk presents Dask for parallel computing and RAPIDS for GPU computing, and shows how to launch and manage clusters entirely within JupyterLab.

Outline

What happens when you run out of processing power on your laptop? You can scale up (get more powerful hardware) or scale out (add more machines). Whichever you choose, there are great tools for scaling within the Python and Jupyter ecosystem. Dask is a parallel computing framework that scales from your laptop to a cluster of thousands of machines. RAPIDS is a GPU computing framework that moves traditionally CPU-bound workloads to the GPU. Dask and RAPIDS together allow you to scale both up and out! This talk will help you navigate this exciting new world and show how easy it is to get your workloads running faster in Jupyter.

Prerequisites: a working knowledge of data science with Python (pandas, NumPy, scikit-learn, etc.). No cluster computing experience is necessary; that is what you will learn from the talk!