
Thursday Oct. 15, 2020, 5:45 p.m.–Oct. 15, 2020, 6:15 p.m. in Jupyter Community: Tools

pydeck: High-scale geospatial visualization for Python

Andrew Duberstein

Audience level:

Brief Summary

Whether they work with social graphs, genomics, or sensor data, data scientists often need to render massive spatial data sets. In 2018, Uber released pydeck, an open-source Python library for rendering beautiful, high-scale data visualizations, built on top of Uber's deck.gl library. We'll go over how pydeck was written and how to use it to visualize large-scale data sets. See more at pydeck.gl.


After this talk, you'll know what your options are for mapping geospatial data, what pydeck is, and how its Jupyter integration works. Attendees are expected to know Python scripting, Pandas, and Jupyter. Ideal attendees have experience in data science, data analytics, or machine learning engineering, and have a passion for geospatial data.


What can we do currently with geospatial visualization libraries in Python?

Generally, we can view data on a map from a variety of Python-based tools, like folium or even matplotlib.

However, most are limited in scale and flexibility. Solutions relying on native graphics usually lack interactivity, which is crucial for exploratory data analysis and data-driven storytelling. Solutions relying on the browser are often limited by the number of DOM elements a browser can handle. Nearly all solutions lack full-featured Jupyter integration.

What do deck.gl and pydeck add to the Python data visualization ecosystem?

The JavaScript library deck.gl is a high-performance visualization library with a focus on spatial data. In short, pydeck is a thin wrapper around deck.gl that brings WebGL rendering and the deck.gl ecosystem to Python, with the added benefit of Jupyter integration. The pydeck library can render hundreds of thousands of rows of data, so plotting LiDAR returns or GPS points for a fleet takes only a few lines of code. It can also be used for charts and network graphs, and has no issue with hundreds of thousands of nodes and edges.
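To make "a few lines of code" concrete, here is a minimal sketch rather than material from the talk itself. It assumes pydeck is installed. The helper names make_points and build_deck, the synthetic coordinates, the map center, and the output file name are all made up for illustration.

```python
import random


def make_points(n, seed=0):
    """Generate n synthetic GPS-like points (hypothetical data, not a real fleet)."""
    rng = random.Random(seed)
    center_lng, center_lat = -122.4, 37.76  # arbitrary center, illustration only
    return [
        {
            "lng": center_lng + rng.uniform(-0.1, 0.1),
            "lat": center_lat + rng.uniform(-0.1, 0.1),
        }
        for _ in range(n)
    ]


def build_deck(points):
    """Wrap the points in a deck.gl ScatterplotLayer through pydeck."""
    import pydeck as pdk  # imported lazily so make_points works without pydeck

    layer = pdk.Layer(
        "ScatterplotLayer",
        data=points,                # a list of dicts; a pandas DataFrame also works
        get_position="[lng, lat]",  # accessor expression evaluated per row
        get_radius=30,              # radius in meters
        get_fill_color=[255, 140, 0],
        pickable=True,              # enables hover and click interaction
    )
    view_state = pdk.ViewState(longitude=-122.4, latitude=37.76, zoom=10)
    return pdk.Deck(layers=[layer], initial_view_state=view_state)


if __name__ == "__main__":
    deck = build_deck(make_points(100_000))
    # Write a standalone HTML file; in a notebook, deck.show() renders inline.
    deck.to_html("points.html", notebook_display=False)
```

The layer type is passed as a string because pydeck forwards the layer configuration to deck.gl as JSON, so swapping "ScatterplotLayer" for another deck.gl layer name changes the rendering without changing the surrounding Python.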

We'll cover a range of demos and a bit of Jupyter-specific discussion. I'll discuss two-way data interaction within Jupyter environments: selecting data in a visualization and then manipulating that selection in code. "How do we build a Jupyter widget?" is itself a lengthy topic, but I'll make a brief digression on it. The demos include high-volume updates from Python, complex map views, and more. You can find all of these at the pydeck homepage.

How will pydeck grow?

Lastly, I'll cover the future of pydeck. It is still a fairly young library and benefits from community GitHub issues and PRs. Expect further integration with other vis.gl ecosystem tools, among other things.