JupyterCon 2023

How can we convince French human and social sciences researchers of the relevance of notebooks for scientific programming?
05-10, 15:30–16:00 (Europe/Paris), Louis Armand 1

Scientific programming in Python is still gaining ground in the French human and social sciences. Not only is the R language more widely used, but many analyses, especially statistical ones, rely on proprietary software. However, computational approaches are gaining visibility, whether in digital humanities, the processing of geolocated data, or the analysis of social network data.

It is important to facilitate the adoption of scientific programming tools in order to promote the reproducibility of results and make computational analyses explicit. Supporting good practices across disciplines and training students and young researchers will require not only infrastructure but also the dissemination of example analyses specific to the fields concerned. While the tools seem largely mature, practical examples of their use are still limited.

As part of a project supported by Huma-Num, the large digital research infrastructure for the human and social sciences (SHS), around the deployment of a JupyterHub, we developed five notebooks intended not only to show possible uses but also to serve as reusable examples. The design of these notebooks was based on a needs consultation with a panel of potential users and on the identification of a few priority needs during 2022.

In this talk, we present how these five notebooks were defined and produced (analysis of a questionnaire; of state statistics; of Twitter data; training of a classification algorithm; detection of iconography in old journals), with particular emphasis on the lessons learned about the strengths and limits of notebooks as a medium for promoting scientific programming. One of the limitations is the great fragmentation of research practices in the humanities and social sciences. In particular, during the design process, we faced numerous choices concerning how explicit the approach should be, the appropriate length of the code, the metadata standards to follow, and the possible uses of such notebooks in research and teaching contexts.

Beyond this experiment, one aim of this work is to provide food for thought on the conditions for using notebooks at the interface between training and research. This involves questions about the degree of modularity possible within a notebook's narrative framework, and about the conditions for making specific analyses more general. More broadly, it raises the question of the conditions for building notebook collections for the training of students and researchers. We will discuss the transformation of a research project into a reproducible notebook, using a specific case to evaluate the balance of costs and benefits.

Antoine is a senior consultant at Datactivist, a French cooperative and participatory company whose mission is to open data and make them used and useful. Operating at all steps of data opening and reuse, Datactivist works with both data producers and data re-users. Antoine advises research organizations and funders that engage with open science and open their research data.

Antoine was trained in agricultural sciences as well as science and technology studies (STS). Since 2005, he has worked at the crossroads of public engagement with science and digital tools for Café des sciences, Deuxième labo, University of Bordeaux, and Datactivist. His main achievements include the French network of science bloggers; the Manifesto for an emancipating, self-critical and responsible scientific mediation; the first Science Hack Day organized in France; activism for open data on research funding; the award-winning Hacketafac student digital innovation contest; and a report for the INOS Erasmus+ project on open innovation activities in higher education.

Associate professor at Université Claude Bernard Lyon 1
Co-head of a scientific information and communication training unit

Data scientist, researcher (NLP) and senior consultant (open data).