We’ve just released an update to our participation report, which provides a view for our members into how they are each working towards best practices in open metadata. Prompted by some of the signatories and organizers of the Barcelona Declaration, which Crossref supports, and with the help of our friends at CWTS Leiden, we have fast-tracked the work to include an updated set of metadata best practices in participation reports for our members. The reports now give a more complete picture of each member’s activity.
What do we mean by ‘participation’?
Crossref runs open infrastructure to link research objects, entities, and actions, creating a lasting and reusable scholarly record. As a not-for-profit with over 20,000 members in 160 countries, we drive metadata exchange and support nearly 2 billion monthly API queries, facilitating global research communication.
To make this system work, members strive to provide as much metadata as possible through Crossref, so that it is distributed openly and at scale throughout the scholarly ecosystem rather than bilaterally, thereby realizing the collective benefit of membership. Together, our members provide and use a rich network of information, known as the research nexus, on which the community can build tools to help progress knowledge.
Each member commits to certain terms, such as keeping metadata current, updating links for their DOIs to redirect to, linking references and other objects, and preserving their content in perpetuity. Beyond this, we also encourage members to register as much rich metadata as is relevant and possible.
Creating and providing richer metadata is a key part of participation in Crossref; we’ve long encouraged a more complete scholarly record, such as through Metadata 20/20, and through supporting or leading initiatives for specific metadata, like open citations (I4OC), open abstracts (I4OA), open contributors (ORCID), and open affiliations (ROR).
Which metadata elements are considered best practices?
Alongside basic bibliographic metadata such as title, authors, and publication date(s), we encourage members to register metadata in the following fields:
References
A list of all the references used by a work. This is particularly relevant for journal articles but the references can include any type of object, including datasets, versions, preprints, and more. Additionally, we encourage these to be added into relationships, where relevant.
Abstracts
A description of the work. These are particularly useful for discovery systems that will promote the work, and are often used in downstream analyses such as for detecting integrity issues.
Contributor IDs (ORCID)
All authors should be included in a work’s metadata, ideally alongside their verified ORCID identifier.
Affiliations / Affiliation IDs (ROR)
Members can register contributor affiliations as free text, but we encourage everyone to add ROR IDs for affiliations as the recommended best practice, as this disambiguates organizations and avoids mistyping. These two fields, affiliation text and ROR IDs, are new to the participation reports interface in the most recent update.
Funder IDs (OFR)
Acknowledging the organization(s) that funded the work. We encourage the inclusion of Open Funder Registry identifiers to make the funding metadata more usable. This will evolve into an additional use case for ROR over time.
Funding award numbers / Grant IDs (Crossref)
A number or identifier assigned by the funding organization to identify the specific award of funding or other support such as use of equipment or facilities, prizes, tuition, etc. The Crossref Grant Linking System includes a unique persistent link that can be connected with outputs, activities, people, and organizations.
Crossmark
The Crossmark service gives readers quick and easy access to the current status of a record, including any corrections, retractions, or updates, via a button embedded in a PDF or web article. Openly adding corrections, retractions, and errata is a critical part of publishing, and the button provides readers with an easy in-context alert.
Similarity Check URLs
The Similarity Check service helps editors to identify text-based plagiarism through our collective agreement for the membership to access Turnitin’s powerful text comparison tool, iThenticate. Specific full-text links are required to participate in this service.
License URLs
URLs pointing to a license that explains the terms and conditions under which readers can access content. These links are crucial to denote intended downstream use.
Text mining URLs
Full-text URLs that help researchers in meta-science easily locate your content for text and data mining.
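The checklist above can be turned into a quick self-check on a single record. The following is a minimal sketch, not Crossref's own implementation: the function name is hypothetical, the sample record is invented for illustration, and the field names (`reference`, `abstract`, `author`, `ORCID`, `affiliation`, `id-type`, `funder`, `award`, `update-policy`, `license`, `link`, `intended-application`) are assumptions about how a work appears in REST API JSON, so verify them against a real record before relying on this.

```python
from typing import Any


def best_practice_presence(work: dict[str, Any]) -> dict[str, bool]:
    """Report which best-practice fields appear in one work record.

    Field names are assumptions about the REST API's JSON for a work.
    """
    authors = work.get("author", [])
    affiliations = [aff for a in authors for aff in a.get("affiliation", [])]
    funders = work.get("funder", [])
    links = work.get("link", [])
    return {
        "references": bool(work.get("reference")),
        "abstract": bool(work.get("abstract")),
        "orcid": any(a.get("ORCID") for a in authors),
        "affiliation": any(aff.get("name") for aff in affiliations),
        "ror_id": any(
            ident.get("id-type") == "ROR"
            for aff in affiliations
            for ident in aff.get("id", [])
        ),
        "funder_id": any(f.get("DOI") for f in funders),
        "award_number": any(f.get("award") for f in funders),
        "crossmark": bool(work.get("update-policy")),
        "similarity_check_url": any(
            l.get("intended-application") == "similarity-checking" for l in links
        ),
        "license_url": any(l.get("URL") for l in work.get("license", [])),
        "text_mining_url": any(
            l.get("intended-application") == "text-mining" for l in links
        ),
    }


# Invented sample record, for illustration only.
sample_work = {
    "reference": [{"key": "ref1", "DOI": "10.5555/example-ref"}],
    "author": [
        {
            "family": "Example",
            "ORCID": "https://orcid.org/0000-0000-0000-0000",
            "affiliation": [{"name": "Example University"}],
        }
    ],
    "license": [{"URL": "https://example.org/license"}],
    "link": [
        {
            "URL": "https://example.org/fulltext.pdf",
            "intended-application": "text-mining",
        }
    ],
}

presence = best_practice_presence(sample_work)
missing = [field for field, present in presence.items() if not present]
print(missing)
```

For the sample record, the fields reported as missing are the abstract, ROR ID, funder ID, award number, Crossmark, and Similarity Check URL.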
What is a participation report?
Participation reports are a visualization of the data representing members’ participation in the scholarly record, which is available via our open REST API. There’s a separate participation report for each member, and each report shows what percentage of that member’s metadata records include 11 key metadata elements. These key elements add context and richness, and help to open up members’ work to easier discovery and wider and more varied use.
As a member, you can use participation reports to see for yourself where the gaps in your organization’s metadata are, and perhaps compare your performance to others. Participation reports are free and open to everyone, so you can also check the report for any other members you are interested in.
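Because the underlying data comes from the open REST API, the same percentages can be derived in a script. The sketch below assumes that a member record carries a coverage mapping of fractions between 0 and 1, keyed by names such as `references-current` and `references-backfile`. Those key names, the sample values, and the function name are assumptions for illustration, so check them against a real API response.

```python
def coverage_percentages(
    coverage: dict[str, float], suffix: str = "-current"
) -> dict[str, float]:
    """Convert coverage fractions (0 to 1) into percentages for one period.

    Only keys ending in `suffix` are kept, and the suffix is stripped
    from the returned field names.
    """
    return {
        key[: -len(suffix)]: round(value * 100, 1)
        for key, value in coverage.items()
        if key.endswith(suffix)
    }


# Invented coverage values, shaped like the assumed API mapping.
sample_coverage = {
    "references-current": 0.5,
    "abstracts-current": 0.25,
    "ror-ids-current": 0.0,
    "references-backfile": 0.75,
}

print(coverage_percentages(sample_coverage))
# {'references': 50.0, 'abstracts': 25.0, 'ror-ids': 0.0}
```

Passing `suffix="-backfile"` would report on older records instead, assuming the mapping distinguishes the two periods in this way.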
We first introduced participation reports in 2018. At the time, Anna Tolwinska and Kirsty Meddings wrote:
Metadata is at the heart of all our services. With a growing range of members participating in our community—often compiling or depositing metadata on behalf of each other—the need to educate and express obligations and best practice has increased. In addition, we’ve seen more and more researchers and tools making use of our APIs to harvest, analyze and re-purpose the metadata our members register, so we’ve been very aware of the need to be more explicit about what this metadata enables, why, how, and for whom.
All of that still rings true today. But as the research nexus continues to evolve, so should the tools that intend to reflect it. For example, in 2022, we removed the Open references field from participation reports after a board vote to change our policy and update the membership terms meant that all references deposited with Crossref would be open by default. And now we’ve expanded the list of fields again, adding coverage data for contributor affiliation text and ROR identifiers.
Putting it in practice
To find out how you measure up when it comes to participation, type the name of your member organization into the search box. You may be surprised by what you find—we often speak to members who thought they were registering a certain type of metadata for all their records, only to learn from their participation report that something is getting lost along the way.
You can only address gaps in your metadata if you know that they exist.
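One way to make those gaps explicit is to rank the fields that fall short of a target. This is a minimal sketch under stated assumptions: the function name, the 100% default target, and the sample percentages are all hypothetical, and the input is simply a mapping of field names to the percentages shown in a report.

```python
def metadata_gaps(
    percentages: dict[str, float], threshold: float = 100.0
) -> list[tuple[str, float]]:
    """Return fields whose coverage is below `threshold`, lowest first."""
    gaps = [(field, pct) for field, pct in percentages.items() if pct < threshold]
    return sorted(gaps, key=lambda item: item[1])


# Invented percentages, for illustration only.
sample_percentages = {
    "references": 98.0,
    "abstracts": 35.5,
    "ror-ids": 0.0,
    "licenses": 100.0,
}

for field, pct in metadata_gaps(sample_percentages):
    print(f"{field}: {pct}%")
# ror-ids: 0.0%
# abstracts: 35.5%
# references: 98.0%
```

Lowering the threshold, for example to 50, narrows the list to the fields that need the most attention.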
More information, as well as a breakdown of the now 11 key metadata elements listed in every participation report and tips on improving your scores, is available in our documentation.
And if you have any questions or feedback, come talk to us on the community forum or request a metadata Health Check by emailing the community team.