associate professor at Université Claude Bernard Lyon 1
co-head of a scientific information and communication training unit
Scientific programming in Python is still gaining ground in the French humanities and social sciences. Not only is the R language more widely used, but many analyses, especially statistical ones, rely on proprietary software. However, computational approaches are gaining visibility, whether in digital humanities, the processing of geolocated data, or the analysis of social network data.
It is important to facilitate the adoption of scientific programming tools in order to promote reproducible results and explicit computational analyses. Supporting good practices across disciplines and training students and early-career researchers will require infrastructure, but also the dissemination of worked examples specific to the fields concerned. While the tools seem largely mature, practical examples of their use remain limited.
As part of a project supported by Huma-Num, the large digital research infrastructure for the humanities and social sciences (SHS), around the deployment of a JupyterHub, we developed five notebooks intended not only to show possible uses but also to serve as reusable examples. The design of these notebooks was based on a needs consultation with a panel of potential users and on the identification of a few priority needs during 2022.
In this talk, we present the process of defining and producing these five notebooks (analysis of a questionnaire; analysis of state statistics; analysis of Twitter data; training of a classification algorithm; detection of iconography in old journals), with particular emphasis on the lessons learned about the strengths and limits of notebooks as a medium for promoting scientific programming. One of the limitations is the great fragmentation of research practices in the humanities and social sciences. In particular, during the design process we faced numerous choices concerning how explicit the approach should be, the appropriate length of the code, the metadata standards to follow, and the possible uses of such productions in research and teaching contexts.
Beyond this experiment, one aim of this work is to provide food for thought on the conditions of use of notebooks at the interface between training and research. This involves questions about the degree of modularity possible within a notebook's narrative framework, and about the conditions for making specific analyses more general. More broadly, it raises the question of the conditions for setting up notebook collections for the training of students and researchers. We will discuss the transformation of a research project into a reproducible notebook, using a specific case to evaluate its cost-benefit balance.