Monday Oct. 12, 2020, 4 p.m.–Oct. 12, 2020, 4:30 p.m. in Jupyter in Education

Using the Jupyterverse to power MADS

Damian Avila

Brief Summary

We designed a scalable system to help instructors create the content students use to learn. This robust system automatically generates customized Docker-based environments and nbgrader-based autograders. It is coupled with a heavily customized Kubernetes-based JupyterHub deployment where instructors can develop content in a rich, fully standardized environment.


Multiple tools from the Jupyter Ecosystem have been adopted widely in multiple educational settings to support the students' learning processes.

Recently, the University of Michigan launched an online Master of Applied Data Science (MADS) degree in partnership with Coursera (https://www.coursera.org/degrees/master-of-applied-data-science-umich), with more than 30 applied for-credit courses focusing on different aspects of the Data Science experience.

We, as the Tech team, faced the problem of designing and implementing a scalable and standardized system so that instructors (faculty and teaching assistants) could develop content, including lectures, programming assignments, and the corresponding autograders.

Since we are working with an LMS platform partner that provides the computational resources the students use, we saw the opportunity to create a decoupled workflow/system in which instructors could create the content in a rich, customizable but standardized environment. The system creates artifacts containing the specification of the environment and the content the instructors have developed. Once those artifacts have been deployed, it provides the course-customized environment the students will face and runs the autogenerated autograders when the students submit their programming assignments. We designed this hybrid/decoupled system as a modular infrastructure that not only allows us to interact with the current LMS but could also be easily adapted to other LMSs or even other degrees.

To design and implement a system fulfilling those requirements, we have leveraged several tools from the Jupyterverse (aka the Jupyter Ecosystem):

1) Course Development Environment: To provide a rich, customizable but standardized environment in which faculty and TAs create the content, we use our own heavily customized Kubernetes-based JupyterHub deployment on AWS. We integrate our JupyterHub with GitHub repositories, which gives us persistence, traceability, and collaboration. Faculty and TAs can choose the course they are working on, launch the most recent image associated with that course, and develop content using standard software engineering methodologies and tools.

2) Customized Course Deployments: Each course can be based on a custom set of system specifications described through Dockerfiles. We provide a default image inherited from a Jupyter Docker Stacks image with MADS-specific enhancements, and all the automatically generated images inherit from that default. This ensures a familiar, similar experience across all our images, while leaving the ultimate flexibility to content creators to define what they need.

3) Common Grading Framework: To provide a unified and standardized experience when creating programming assignments, we rely heavily on the nbgrader project. Choosing one general, standard tool to create the content helps with scalability and knowledge transfer among the (often transient, part-time student) TA population. To provide automatically generated autograders, we also use nbgrader in a sort of standalone mode: a decoupled way of using nbgrader in which the content is created in one place and the autograding process runs in another, while keeping the two in sync. To achieve that, we have developed a CI system that, among other things, creates Docker-based and nbgrader-based autogenerated autograders that, once deployed, can successfully grade the students' submissions.
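
As a rough illustration of the image inheritance described in item 2, the following minimal Python sketch renders a per-course Dockerfile that layers course-specific packages on top of a shared default image. It is not the MADS implementation: the image tag, the `CourseSpec` class, the `render_dockerfile` function, and the example course id are all hypothetical names introduced here. The only outside convention it assumes is that the base image follows the Jupyter Docker Stacks practice of switching back to the `NB_UID` user after root-level installs.

```python
from dataclasses import dataclass, field

# Hypothetical tag for the shared default image; the real name is not given in the abstract.
DEFAULT_IMAGE = "example.org/mads/default-notebook:latest"


@dataclass
class CourseSpec:
    """Hypothetical per-course environment specification."""
    course_id: str
    pip_packages: list[str] = field(default_factory=list)
    apt_packages: list[str] = field(default_factory=list)


def render_dockerfile(spec: CourseSpec, base_image: str = DEFAULT_IMAGE) -> str:
    """Render a Dockerfile that inherits from the default image and adds course extras."""
    lines = [f"FROM {base_image}", f'LABEL course="{spec.course_id}"']
    if spec.apt_packages:
        pkgs = " ".join(sorted(spec.apt_packages))
        lines += [
            # System packages need root; drop back to the notebook user afterwards
            # (NB_UID is the Jupyter Docker Stacks convention, assumed for the base image).
            "USER root",
            f"RUN apt-get update && apt-get install -y --no-install-recommends {pkgs} "
            "&& rm -rf /var/lib/apt/lists/*",
            "USER ${NB_UID}",
        ]
    if spec.pip_packages:
        pkgs = " ".join(sorted(spec.pip_packages))
        lines.append(f"RUN pip install --no-cache-dir {pkgs}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: a hypothetical course that only needs two extra Python packages.
    print(render_dockerfile(CourseSpec("course-101", pip_packages=["nbgrader", "altair"])))
```

Because every generated file starts from the same `FROM` line, a CI job could in principle rebuild all course images whenever the default image changes, which is one way to keep the experience consistent across courses.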

With these design decisions we have been able to deploy a robust system that can scale to hundreds of different courses (supporting thousands of students) with minimal effort and maintenance burden, providing a Jupyter-based experience that is consistent across a whole degree. Further, the use of open and well-defined software components (git, Docker, Jupyter) allows us to create experiences that can be used both within our partner LMS platform (Coursera) and on other platforms as needed.