JupyterCon 2023

Addressing global sustainability challenges with Jupyter and cloud-based geospatial data platforms
05-11, 16:50–17:05 (Europe/Paris), Poster Placeholder

There is widespread recognition that the Earth's climate is being changed due to human activity. The world's inhabitants (including humans, animals, and other organisms) are already experiencing many negative impacts, such as displacement from sea-level rise, ecosystem degradation, and increases in extreme weather events. People need access to better information so they can understand the impacts that are already occurring, predict what changes will occur in the future, mitigate the changes we can prevent from occurring, and prepare to adapt to those changes that we cannot mitigate. We have a global grand challenge of making living on Earth sustainable.

Fortunately, in recent decades there has been a rapid increase in data availability about the state of the Earth. Earth-observing satellites have been continuously providing data for over 50 years, constantly improving in frequency and resolution. Earth system models generate weather predictions for the coming weeks and climate projections of what may happen in the next few decades. And an increasing number of data providers are making these data openly accessible.

However, there are numerous barriers that prevent decision makers (whether individuals, businesses or governments) from utilizing these data.

  1. The data volume/velocity is a major challenge for most potential users.
  2. The data need to be processed into actionable information that is appropriate for those making decisions. This critical work of designing processing workflows is being done by scientific researchers and data analysts, who need to have both scientific understanding and access to technologies that help them develop and implement those workflows.

We need to lower the barriers so that more people can participate in addressing global sustainability challenges.

These barriers are being addressed through the use of data-proximate computing systems that are optimized for geospatial data analysis. These systems move compute close to data storage, and are an improvement over the historical practice (and bottleneck) of downloading large datasets for local analysis. Some examples of these systems include:

  • Pangeo - an open source software stack consisting of Jupyter, Dask, and Kubernetes.
  • Earth Engine - Google's cloud-based geospatial analysis system, including a >50 PB data catalog.
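
To make the download bottleneck concrete, the following is a rough back-of-the-envelope sketch, not a measurement. It takes the 50 PB catalog size quoted above and estimates how long a full download would take. The link speeds, the helper name `transfer_days`, and the assumption of a fully and continuously utilized link are illustrative choices that do not come from this abstract.

```python
# Hypothetical illustration: how long would it take to download a large
# catalog before any local analysis could start?

SECONDS_PER_DAY = 86_400
PETABYTE = 10**15  # decimal petabyte, in bytes


def transfer_days(size_bytes: float, link_bits_per_second: float) -> float:
    """Return the days needed to move size_bytes over a link of the given speed.

    Assumes the link is fully and continuously utilized, which is optimistic.
    """
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    return (size_bytes * 8) / link_bits_per_second / SECONDS_PER_DAY


# 50 PB is the lower bound quoted above for the Earth Engine data catalog.
catalog_bytes = 50 * PETABYTE

# Assumed link speeds (not from the abstract), in bits per second.
assumed_links = [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]

for label, speed in assumed_links:
    days = transfer_days(catalog_bytes, speed)
    print(f"{label:>10}: about {days / 365.25:,.1f} years")
```

Under these assumptions, even the fastest of the three links would need more than a year of uninterrupted transfer, which is why moving the computation to the data is attractive.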

A remaining challenge is how to improve the navigation, exploration, and analysis of data residing in these systems so that hypotheses can be rapidly tested, workflows can be prototyped, and applications can be rapidly deployed to less technical decision makers.

This presentation will cover:

  • An overview of global sustainability challenges and how cloud-based geospatial tools can be used to address them.
  • How Jupyter technologies simplify working with petabyte-scale datasets, including:
      • earthengine-jupyter - a Python package for working with Google Earth Engine from within a Jupyter notebook.
      • Examples of analyzing thousands of satellite images and decades of climate data.

Dr. Tyler A. Erickson is a developer advocate at Google. In this role, he fosters collaborations with researchers from academia, NGOs, and governmental organizations seeking to capitalize on Earth Engine’s capabilities for geospatial analyses that involve immense satellite and model-based datasets. Dr. Erickson leads the development of Earth Engine’s core efforts in water and climate, guides the evolution of Earth Engine to support these scientific domains, and leads support efforts for the Earth Engine Python API. A snow hydrologist by training, he has degrees in civil & environmental engineering and geography from Colorado State University, California Institute of Technology, Stanford, and the University of Colorado at Boulder. Tyler is a longtime Python programmer and open source contributor, particularly on OSGeo, NumFOCUS, and Jupyter projects.