News
The European Organization for Nuclear Research, known as CERN, has long been a driving force in scientific discovery and technological advancement. Beyond its groundbreaking research, CERN has also quietly championed open-source software for decades. But how can CERN’s impact on the global open-source community be measured?
"Member states will appreciate that CERN is not only carrying out physics research but also contributes back through open source," says Axel Naumann, Chair of CERN's Open Source Program Office (OSPO). "That said, we've been scratching our heads on how to measure the impact. It's a non-trivial task, and while we lack statistically sound information about our user base, we can look at the code that’s been produced for insights."
A treasure hunt
The challenge is to gather as much of that code as possible. With over 2,200 stable employees and a shifting influx of 15,000-17,000 researchers, CERN is a nexus of scientific activity.
CERN’s community contributes to software projects at CERN, at their participating universities and institutes, and to projects maintained elsewhere and simply used at CERN. The contributors range from CERN employees to researchers visiting or working with CERN, from all around the globe. This complexity means focusing on CERN-hosted source code would miss many of the most interesting contributions.
The OSPO has engaged a student researcher to explore the world's largest open-source software archive with Software Heritage (SWH). The archive, which houses over 50 billion software artifacts secured by SWHID, offers unparalleled traceability across the entire software ecosystem. During the 12-month project, the student will be supported by experts from CERN's Scientific Information Service and the OSPO team.
"This partnership is not only a chance to document CERN's legacy," says Roberto Di Cosmo, Director of Software Heritage, "but also an opportunity to explore how open-source software accelerates scientific discovery and technological development worldwide."
By gathering and consolidating code scattered across various repositories over the years, what else does the OSPO team hope to learn? Naumann estimates that about 10% of the "official" open source code is produced by employees. The other 90% comes from visiting researchers who work together, devise solutions, finish projects, publish papers and code, then move on. Some of these solutions, along with the code, go on to be widely adopted outside CERN, but so far remain unaccounted for and, essentially, unmapped.
"It's a bit of a treasure hunt," says Naumann, now a Senior Applied Physicist who has been working at CERN for 19 years. "I know of at least one or two of these unmapped projects, but how many others are there? How big is our impact, actually? That's what we're looking for and hoping to find."
This is where Software Heritage comes in. By using its archive and tools like the SWHID, the project aims to:
- Identify CERN-related projects: Unearth software projects that mention CERN or were developed by CERN-affiliated researchers.
- Track software lineage: Analyze how these projects have evolved over time, including forks, derivatives, and related contributions.
- Measure impact: Quantify the influence of CERN’s open-source software on the global community, its adoption in scientific research, and its broader contributions to technology.
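To make the role of the SWHID concrete, here is a minimal sketch of how the identifier for a single file's content can be computed. It relies on the fact that content SWHIDs reuse Git's blob hashing, and it is only an illustration, not the tooling used in the CERN project. The function name `swhid_for_content` is a hypothetical helper.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Return the SWHID of a file's content (object type 'cnt').

    Hypothetical helper for illustration. Content SWHIDs use the same
    hash as a Git blob: SHA-1 over a "blob <size>\\0" header followed
    by the raw bytes.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


if __name__ == "__main__":
    # The same bytes always yield the same identifier, wherever the
    # file is hosted, which is what makes tracing copies possible.
    print(swhid_for_content(b"print('hello from CERN')\n"))
```

Because the identifier depends only on the bytes, two copies of the same file in unrelated repositories resolve to the same SWHID, which is the property a lineage search can build on.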
Measuring impact for the future
Beyond CERN, this project also holds broader implications for other institutions and organizations grappling with the challenge of measuring their own open-source contributions.
This collaboration between CERN and Software Heritage is the beginning of a larger quest to map the intersection of science and software. As open-source software becomes an increasingly vital part of technological and scientific progress, understanding its impact is crucial for society as a whole.
The project goes beyond CERN, Naumann says. "It's a creative solution for a problem that many people, many businesses, and many institutions have: 'How do we measure our open-source impact?' I think we've found a promising lead to answering this question."
CERN will make the project open source, so others can benefit from not only the findings but also the underlying analytical approach. This way, people can learn from the team's methods and potentially apply them to their own research.
Stay tuned for updates on the investigation into CERN's open-source contributions.
By Nicole Martinelli, editorial consultant
The collaboration has also made publicly available the software that it developed to search for the unique particle
As part of its continued commitment to making its science fully open, the CMS collaboration has just publicly released, in electronic format, the combination of CMS measurements that contributed to establishing the discovery of the Higgs boson in 2012. This release coincides with the publication of the Combine software – the statistical analysis tool that CMS developed during the first run of the Large Hadron Collider (LHC) to search for the unique particle, which has since been adopted throughout the collaboration.
Physics measurements based on data from the LHC are usually reported as a central value and its corresponding uncertainty. For instance, soon after observing the Higgs boson in LHC proton–proton collision data, CMS measured its mass as 125.3 ± 0.6 GeV (the proton mass being about 1 GeV). But this figure is just a brief summary of the measurement outcome, a bit like the title of a book.
In a measurement, the full information extracted from the data is encoded in a mathematical function, known as the likelihood function, that includes the measured value of a quantity as well as its dependence on external factors. In the case of a CMS measurement, these factors encompass the calibration of the CMS detector, the accuracy of the CMS detector simulation used to facilitate the measurement and other systematic effects.
A likelihood function of a measurement based on LHC data can be complex, as many aspects need to be pinned down to fully understand the messy collisions that take place at the LHC. For example, the likelihood function of the combination of CMS Higgs boson discovery measurements, which CMS just released in electronic format, has nearly 700 parameters for a fixed value of the Higgs boson mass. Among these, only one – the number of Higgs bosons found in the data – is the physics parameter of interest, while the rest model systematic uncertainties.
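The split between a parameter of interest and parameters that model systematic uncertainties can be illustrated with a toy example. The sketch below is not the CMS likelihood and does not use the Combine software. It is a single counting experiment with one signal-strength parameter `mu` and one nuisance parameter `theta`, and all numbers (observed events, expected signal and background, the 20% background uncertainty) are invented for illustration.

```python
import math

# Hypothetical toy inputs, not CMS numbers.
N_OBS = 25          # observed event count
SIGNAL = 10.0       # expected signal events for mu = 1
BACKGROUND = 15.0   # nominal expected background events


def nll(mu: float, theta: float, rel_unc: float) -> float:
    """Negative log-likelihood: Poisson term plus a unit-Gaussian
    constraint on the nuisance parameter theta, which shifts the
    background by rel_unc per unit of theta."""
    expected = mu * SIGNAL + BACKGROUND * (1.0 + rel_unc * theta)
    if expected <= 0.0:
        return math.inf
    return expected - N_OBS * math.log(expected) + 0.5 * theta ** 2


def profiled_nll(mu: float, rel_unc: float) -> float:
    """Minimise over theta on a coarse grid for a fixed mu."""
    thetas = [i / 100 for i in range(-500, 501)]
    return min(nll(mu, t, rel_unc) for t in thetas)


def fit(rel_unc: float):
    """Scan mu and return (best fit, lower edge, upper edge) of the
    interval where twice the NLL rises by at most 1 above its minimum."""
    scan = [(i / 100, profiled_nll(i / 100, rel_unc)) for i in range(0, 301)]
    mu_hat, nll_min = min(scan, key=lambda point: point[1])
    inside = [mu for mu, value in scan if 2.0 * (value - nll_min) <= 1.0]
    return mu_hat, inside[0], inside[-1]


if __name__ == "__main__":
    print("with systematic:   ", fit(rel_unc=0.2))
    print("statistical only:  ", fit(rel_unc=0.0))
```

Even in this one-nuisance toy, allowing the background to float widens the interval on `mu`. A real likelihood with hundreds of such parameters cannot be summarised by a single central value and uncertainty without losing information.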
Each of these parameters corresponds to a dimension of a multi-dimensional abstract space, in which the likelihood function can be drawn. It is hard for humans to visualise a space with more than a few dimensions, let alone one with many. The new release of the likelihood function of the CMS Higgs boson discovery measurements – the first likelihood function to be made publicly available by the collaboration – allows researchers to get around this problem. With a publicly accessible likelihood function, physicists outside the CMS collaboration can now precisely factor in the CMS Higgs boson discovery measurements in their studies.
The release of this likelihood function, as well as that of the Combine software, which is used to model the likelihood and fit the data, marks a new milestone in CMS’s decade-long commitment to fully open science. It joins hundreds of open-access publications, the release of almost five petabytes of CMS data on the CERN open-data portal and the publication of its entire software framework on GitHub.
Find out more on the CMS website.
CERN launches its Open Source Program Office to help you with the release of your software and hardware designs
Have you ever considered making your software or hardware designs publicly available? Sharing your work with collaborators in research and industry has many advantages, but it may also present some questions and challenges. To help you with all issues relating to the release of your software and hardware designs, we are launching CERN’s Open Source Program Office (OSPO).
In our community, it is common practice to publish open source software and hardware designs. By releasing your work under licences that allow others to use it, study its source code, redistribute it and share improvements, you can promote transparent and inclusive research practices. Given that all our research is a collaborative effort, open source is a common way of making our software and hardware accessible to everyone, allowing us to grow through contributions and new partners.
But how easy is it to publish open source designs? While there are many advantages to releasing open source software and hardware, it also presents challenges, such as addressing intellectual property rights by choosing the right licence. The effects of licence choices on future collaborations are not always obvious and must be carefully considered. Additionally, you may be confronted with technical challenges in ensuring that released material can be effectively used and modified by others.
Why an Open Source Program Office?
The OSPO will support you, whether you are a member of the personnel or a user, to find the best solution by giving you access to a set of best practices, tools and recommendations. With representatives from all sectors at CERN, it brings together a broad range of expertise on open source practices. If you would like to get in touch with the OSPO, you can contact us via Open.Source@cern.ch. As well as supporting the CERN internal community, the OSPO will engage with external partners to strengthen CERN’s role as a promoter of open source.
Open source is a key pillar of open science. By promoting open source practices, the OSPO thus seeks to address one of CERN’s core ambitions: sharing our knowledge with the world. Ultimately, the aim is to increase the reach of open source projects from CERN to maximise their benefits for the scientific community, industry and society at large.
We launch on 28 November – join us!
We are organising two events, on 28 and 29 November, to officially launch the OSPO. On the first day, we will host distinguished open source experts and advocates from Nvidia, the World Health Organization and the Open Source Hardware Association to discuss the impact and future of open source, followed by an aperitif. Seats are limited – please register in advance: https://cern.ch/ospo-1.
The second day will be dedicated to the role of the OSPO within CERN; the new office will be driven by engagement from the CERN community and will strive to meet its needs. We will briefly present the plans for the OSPO and listen to your ideas, questions, projects and concerns. Please join us on this occasion: https://cern.ch/ospo-2. You are welcome to submit your questions before the event on our forum: https://ospo.web.cern.ch/tag/opening-event.