If you take a peek at our blog, you’ll notice that metadata and community are our most frequently used categories. That is no coincidence: community is central to everything we do at Crossref. Our first-ever Metadata Sprint was a natural step in strengthening both. Cue fanfare! And what better way to celebrate 25 years of Crossref?
We designed the Crossref Metadata Sprint as a relatively short event where people could form teams and tackle focused problems. What kind of problems? While we expected many to involve coding, teams also explored documenting, translating, and researching: anything that taps into our open, member-curated metadata. Our motivation for this format was to create a space for networking, collaboration, and feedback, centred on co-creation using the scholarly metadata from our REST API, the Public Data File, and other sources.
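For participants new to the REST API, most explorations started with a simple works query. Here is a minimal sketch in Python of how such a query URL can be built; the `build_works_query` helper is hypothetical, but the endpoint and parameter names (`query`, `rows`) are part of Crossref's public interface:

```python
from urllib.parse import urlencode

# Hypothetical helper: build a Crossref REST API query URL for works
# matching a free-text term. api.crossref.org is the public endpoint.
def build_works_query(term: str, rows: int = 5) -> str:
    base = "https://api.crossref.org/works"
    params = {"query": term, "rows": rows}
    return f"{base}?{urlencode(params)}"

# Fetching this URL (e.g. with urllib.request) returns a JSON envelope
# whose message["items"] lists matching works.
url = build_works_query("retraction", rows=3)
print(url)  # https://api.crossref.org/works?query=retraction&rows=3
```

From there, teams could page through results, add filters, or switch to the Public Data File for bulk work.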
What we learned in planning
The journey towards the event was filled with valuable lessons from our community. Our initial call received submissions from 71 people, which was exciting but presented the first challenge: we felt the event would work better with a smaller group. A second challenge was the enthusiasm of people from different regions of the world who were eager to join but needed support to attend in person. It reminded us how global our community is, and how important it is to think about different ways of making participation possible, especially in future events.
We also wanted to make sure that participation wasn’t limited by technical background. The selection process included a preliminary review by several members of our team to bring in a mix of perspectives and reduce bias. The event welcomed participants with all levels of expertise, including colleagues who had never worked with APIs before. To provide common ground for everyone, we held several group calls in which we introduced our tools and collected participants’ requests for tools and specific data, along with questions that could help them prepare for the sprint.
At the Crossref Metadata Sprint
I recently stumbled upon the following quote from a recognised data scientist:
Numbers have an important story to tell. They rely on you to give them a clear and convincing voice. (Stephen Few) 1
It made me think that we could replace “numbers” with “metadata” and the idea would still hold. Surrounded by the paleontological collections of the National Museum of Natural History in Madrid, on 8 April, 21 participants and 5 Crossref staff came together to work on twelve different projects. These ranged from improvements to our Public Data File formats and exploring metadata completeness, to tackling multilingual metadata challenges, understanding citation impact for retracted works, and connecting Retraction Watch metadata with other knowledge graphs.
The different teams that participated in the first Crossref Metadata Sprint.
The initial hours were the most energetic (but not chaotic!): most of the participants were interacting in person for the first time, ideas were exchanged, and pre-formed groups settled (though one of the advantages of the format is that teams don’t have to be rigid). Twelve coffee- and tea-powered projects started taking shape, a few of them part of larger ideas under development. By the end of the second day, we saw:
Author changes between preprints and published articles.
Coverage of funding information by publisher.
Enriching citations with Crossref metadata.
Funding metadata completeness.
Improvement to the Public Data File.
Interoperability between Crossref DOIs and hash-based identifiers.
University of Tetova’s metadata coverage.
Retraction Watch data mash-up.
A perspective on AI-driven multilingual metadata.
Public Data File in Google BigQuery.
Visibility of retractions across citations.
Visualising Crossref geographic member data.
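To give a flavour of the funding-completeness work above: the REST API wraps every response in a `message` envelope with a `total-results` count, so coverage can be estimated by comparing a filtered count (for example, works with `filter=has-funder:true`) against a total. A small sketch, with purely illustrative numbers standing in for real API responses:

```python
# Sample response shapes mimicking the REST API's envelope
# (message["total-results"]); the counts are made up for illustration.
total_resp = {"message": {"total-results": 12000}}
funded_resp = {"message": {"total-results": 9000}}

def funding_coverage(total: dict, funded: dict) -> float:
    """Share of works that carry at least one funder assertion."""
    t = total["message"]["total-results"]
    f = funded["message"]["total-results"]
    return f / t if t else 0.0

print(f"{funding_coverage(total_resp, funded_resp):.0%}")  # 75%
```

The same comparison works per member or per publisher by adding the relevant filter to both queries.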
Our team joined some of these projects, offering insights and feedback to the participants. We ended the first day with a group dinner and returned re-energised for the second, which started with everybody fully immersed in their tasks. As we approached the conclusion, the groups prepared quick slides for a short presentation (which you can find here).
Our team and the participants left excited and looking forward to the next opportunity to collaborate. We certainly see the potential of recreating these spaces, and we’ll work on future editions in a different location. All of the project summaries and notes will remain in our metadata sprint GitLab repo. Would you like to know more about any of these ideas? Let us know in the comments.
The first Crossref Metadata Sprint in a nutshell
Participants
None of this would’ve been possible without our enthusiastic participants. Huge thanks to everyone! Here is the full list of those who attended our inaugural Sprint: