If you take a peek at our blog, you’ll notice that metadata and community are the most frequently used categories. This is not a coincidence – ommunity is central to everything we do at Crossref. Our first-ever Metadata Sprint was a natural step in strengthening both. Cue fanfare!. And what better way of celebrating 25 years of Crossref?
We designed the Crossref Metadata Sprint as a relatively short event where people can form teams and tackle short problems. What kind of problems? While we expected many to involve coding, teams also explored documenting, translating, researching—anything that taps into our open, member-curated metadata. Our motivation behind this format was to create a space for networking, collaboration, and feedback, centered on co-creation using the scholarly metadata from our REST API, the Public Data File, and other sources.
Dado que Crossref celebra su 25º aniversario este año, nos gustaría destacar algunas de las regiones activas y comprometidas en nuestra comunidad global.
Durante los primeros 25 años, la composición de los miembros de Crossref ha evolucionado significativamente. De un puñado de grandes editoriales fundadoras, ahora tenemos más de 22.000 miembros de 160 países. Casi dos tercios de ellos se identifican como universidades, bibliotecas, entidades gubernamentales, fundaciones, editoriales académicas, e institutos de investigación.
The Crossref Nominating Committee invites expressions of interest to join the Board of Directors of Crossref for the term starting in January 2026. The committee will gather responses from those interested and create the slate of candidates that our membership will vote on in an election in September.
Expressions of interest will be due Monday, June 9th, 2025
If you take a peek at our blog, you’ll notice that metadata and community are the most frequently used categories. This is not a coincidence – ommunity is central to everything we do at Crossref. Our first-ever Metadata Sprint was a natural step in strengthening both. Cue fanfare!. And what better way of celebrating 25 years of Crossref?
We designed the Crossref Metadata Sprint as a relatively short event where people can form teams and tackle short problems. What kind of problems? While we expected many to involve coding, teams also explored documenting, translating, researching—anything that taps into our open, member-curated metadata. Our motivation behind this format was to create a space for networking, collaboration, and feedback, centered on co-creation using the scholarly metadata from our REST API, the Public Data File, and other sources.
What have we learned in planning
The journey towards the event was filled with valuable lessons and learnings from our community. Our initial call received submissions from 71 people, which was exciting but presented the first challenge: we felt our event would work better with a relatively smaller group. An additional challenge we faced was the enthusiasm from people from different regions of the world who were eager to join, but needed support to attend in person. It reminded us how global our community is, and how important it is to think about different ways of making participation possible, especially in future events.
We also wanted to make sure that participation wasn’t limited by technical background. The selection process included a preliminary review by several members of our team to bring in a mix of perspectives and reduce bias. The event welcomed participants from all kinds of expertise levels, including colleagues who had never worked with APIs before. We sought to provide common ground for all with several group calls, where we presented introductions to our tools and used the opportunity to collect requests about tools, specific data, and questions from the participants that could enhance their preparation during the sprint.
At the Crossref Metadata Sprint
I’ve recently stumbled upon the following quote from a recognized data scientist:
Numbers have an important story to tell. They rely on you to give them a clear and convincing voice. (Stephen Few) 1
It made me think that we can replace numbers for metadata and the idea still holds. Surrounded by the paleontological collections of the National Museum of Natural History, on 8th of April in Madrid, 21 participants and 5 Crossref staff came together to work on twelve different projects. These ranged from improvements to our Public Data file formats and exploring metadata completeness, to tackling multilingual metadata challenges, understanding citation impact for retracted works, and connecting Retraction Watch metadata with other knowledge graphs metadata.
The different teams that participated in the first Crossref Metadata Sprint.
The initial hours were the most energetic (but not chaotic!) as most of the participants had the chance to interact in person for the first time, ideas were exchanged, and pre-formed groups became more stable (however, one of the advantages of the format is that teams don't have to be rigid). Twelve coffee- and tea-powered projects started taking shape, a few of which are part of larger ideas under development. By the end of the second day, we saw:
Author changes between preprints and published articles.
Coverage of funding information by publisher.
Enriching citations with Crossref metadata.
Funding metadata completeness.
Improvement to the Public Data File.
Interoperability between Crossref DOIs and hash-based identifiers.
University of Tetova’s metadata coverage.
Retraction Watch data mash-up.
Perspective about AI-driven multilingual metadata.
Public Data File in Google Big Query.
Visibility of retractions across citations.
Visualising Crossref geographic member data.
Our team worked as part of some of these projects, providing valuable insights and feedback to the participants. We ended the first session with a group dinner and re-energised for the second day, which started with everybody fully immersed in their tasks. As we approached the conclusion, the groups started preparing some quick slides for a short presentation (that you can find here).
Our team and the participants left excited and looking forward to the next opportunity to collaborate. We certainly see the potential of recreating these spaces, and we’ll work on future editions in a different location. All of the project summaries and notes will remain stored in our metadata sprint Gitlab repo. Would you like to know more about any of these ideas? Let us know in the comments.
The first Crossref Metadata Sprint in a nutshell
Participants
None of this would’ve been possible without our enthusiastic participants. Huge thanks to everyone! Here is the full list of those who attended our inaugural Sprint: