JupyterCon 2023

Synchronizing the data science workflow with data management at scale
05-11, 14:40–14:50 (Europe/Paris), Poster Placeholder

Reminder - The date and time of this session are placeholders due to a limitation of the conference software.
All posters will be presented during the Poster Session from 6:30 to 8:00 on Thursday 11th.

https://cfp.jupytercon.com/2023/talk/AKPRE8/

--

As high-throughput imaging technology has advanced, researchers have been able to acquire increasingly large and multi-dimensional image datasets, which yield even more complex and high-volume derived analytical results. This poses a challenge for data management, image analysis, and data science workflows. The open source platform OMERO, and its commercial counterpart OMERO Plus, offer enterprise-level data management for bioimaging data and associated metadata, including rich image analysis results.

Although OMERO Plus offers various data mining interfaces within its browser-based clients, data science workflows are inherently flexible and bespoke, yet still rely on domain-standard tools. Therefore, we have built an integrated data science environment in OMERO Plus via a Jupyter extension, so that approved data science libraries are already installed and usable alongside the open OMERO API to retrieve and analyse pixels, metadata, and tabular data from the OMERO data management platform and other custom sources of data. Further, standard Jupyter notebooks and data dashboards provide templates for those just getting started with their imaging and data science. Finally, these integrations use approved compute resources and rely on institutionally-defined security profiles for optimised operational control.

We will demonstrate the scalable integration of our data management platform with interactive data analysis and visualisation tools for an enterprise environment.

David Stirling is a data scientist at Glencoe Software. He specialises in producing image analysis workflows and tooling to assist researchers in quantifying and exploring image-based data. He is also interested in integrating popular open source tools with the OMERO ecosystem to provide user-friendly connectivity between packages. David previously worked in the Cimini Lab within the Broad Institute of MIT and Harvard, where he contributed to the CellProfiler image analysis software package. He also produced popular plugins which allow users to make use of powerful AI packages such as Cellpose and StarDist from within this software.

Emil Rozbicki is the Head of Applications for Glencoe Software. He is an expert in bioimage data management, visualisation, and analysis at scale. Emil is a physicist by training. Prior to joining Glencoe, he worked at the University of Dundee investigating mechanisms underlying early-stage avian embryo development. During this time, he designed and built the first light sheet microscope in the UK and built analysis routines for investigation of cell behavior at the full organism scale during early-stage development. Currently, he is focused on building solutions for the management and analysis of large-scale bioimage datasets, especially in the high content and multiplexed imaging domains, for some of the largest academic and biopharma organizations in the world.