Analyzing the use and reproducibility of Jupyter Notebooks using ReproduceMeGit

Sheeba Samuel

Brief Summary

We introduce ReproduceMeGit, an online visualization tool that helps users analyze the use and reproducibility of Jupyter Notebooks in a GitHub repository. It reports how users execute notebooks, which notebooks were reproduced successfully, and which produced results that differ from the originals, among other details. It also offers direct access to Binder and ProvBook.


Jupyter Notebooks are widely used in science, industry, and education. They combine code, text, visualizations, and results, making it easy to share and publish them for reproducible research. Millions of notebooks are shared on GitHub, which currently has more than 100 million repositories. There are several best practices for writing and sharing Jupyter Notebooks that improve their reuse. In this talk, we present ReproduceMeGit, which analyzes the use and reproducibility of Jupyter Notebooks available in GitHub repositories. It offers a range of analyses: how users execute notebooks, which notebooks were reproduced successfully, and which raised exceptions, with a detailed breakdown of why those notebooks failed to execute. It also compares the results stored in the original notebook with the results ReproduceMeGit obtains when re-executing it. ReproduceMeGit provides direct access to Binder, and the provenance information, including the execution history, can be downloaded through it. All of these features are available through a user-friendly interface.
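To make the idea of comparing original and re-executed results concrete, here is a minimal sketch in Python. It is not ReproduceMeGit's implementation, which the abstract does not describe. It only assumes the standard nbformat v4 JSON layout (a `cells` list whose code cells carry an `outputs` list), and the names `cell_outputs`, `compare_notebooks`, `code_cell`, and `stream` are hypothetical helpers introduced for illustration.

```python
def cell_outputs(cell):
    """Summarize a code cell's outputs (nbformat v4 layout) for comparison."""
    summary = []
    for out in cell.get("outputs", []):
        kind = out.get("output_type")
        if kind == "stream":
            text = out.get("text", "")
        elif kind in ("execute_result", "display_data"):
            # Assumption: only the plain-text representation is compared.
            text = out.get("data", {}).get("text/plain", "")
        elif kind == "error":
            text = out.get("ename", "")
        else:
            continue
        # nbformat allows multi-line text as either a string or a list of lines.
        summary.append((kind, "".join(text) if isinstance(text, list) else text))
    return summary


def compare_notebooks(original, rerun):
    """Label each code cell of the rerun as 'same', 'different', or 'exception'."""
    old_cells = [c for c in original["cells"] if c["cell_type"] == "code"]
    new_cells = [c for c in rerun["cells"] if c["cell_type"] == "code"]
    report = []
    for index, (old, new) in enumerate(zip(old_cells, new_cells)):
        new_summary = cell_outputs(new)
        if any(kind == "error" for kind, _ in new_summary):
            status = "exception"
        elif cell_outputs(old) == new_summary:
            status = "same"
        else:
            status = "different"
        report.append((index, status))
    return report


# Hypothetical in-memory notebooks standing in for files loaded with json.load().
def code_cell(*outputs):
    return {"cell_type": "code", "source": "", "outputs": list(outputs)}


def stream(text):
    return {"output_type": "stream", "name": "stdout", "text": text}


original = {"cells": [code_cell(stream("1\n")),
                      code_cell(stream("0.42\n")),
                      code_cell(stream("ok\n"))]}
rerun = {"cells": [code_cell(stream("1\n")),
                   code_cell(stream("0.57\n")),
                   code_cell({"output_type": "error",
                              "ename": "ModuleNotFoundError",
                              "evalue": "No module named 'foo'",
                              "traceback": []})]}

report = compare_notebooks(original, rerun)
print(report)
```

In this sketch the first cell reproduces, the second yields a different value, and the third raises an exception, mirroring the three outcomes the talk distinguishes. A real analysis would also need to handle rich outputs such as images and nondeterministic results, which this comparison ignores.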