Free public data file of 112+ million Crossref records

A lot of people have been using our public, open APIs to collect data that might be related to COVID-19. This is great and we encourage it. We also want to make it easier. To that end we have made a free data file of the public elements from Crossref’s 112.5 million metadata records.

The file (65GB, in JSON format) is available via Academic Torrents here: https://doi.org/10.13003/83B2GP

It is important to note that Crossref metadata is always openly available. The difference here is that we’ve done the time-saving work of putting all of the records registered through March 2020 into one file for download.

The sheer number of records means that, though anyone can use these records anytime, downloading them all via our APIs can be quite time-consuming. We hope this saves the research community valuable time during this crisis.

A few important notes

  • All records are included. In other words, the data file has every DOI ever registered with Crossref through March 31st, 2020. This means it’s a large file, 65GB.

    • Metadata is supplied by our members and, as such, not all records have the same completeness (or quality) of metadata. Bibliographic metadata is generally required. All other metadata, such as license and funding information or ORCIDs, is optional (though very much encouraged).
    • References (i.e. authors’ cited sources) are also optional metadata. Nearly 50 million records include references and, of those, nearly 30 million have open references that are included in the data file. “Limited” and “Closed” references are not included in the data file. [EDIT 6th June 2022: all references are now open by default, following the March 2022 board vote to remove any restrictions on reference distribution.]
    • If you find an error in the metadata, please report it directly to the publisher so that they can correct it.
  • The records are in JSON.

  • New and updated records can be added incrementally using our REST API, which includes a number of date filter options, e.g. index-date.

  • No registration is required to use our REST API but we do strongly encourage being a ‘polite’ (i.e. identified) user. It makes troubleshooting much easier and reduces the chance of negatively impacting other users.
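
As a rough illustration of the last two notes, here is a minimal Python sketch of fetching records indexed since a given date while identifying yourself as a polite user. It assumes the REST API's works endpoint at https://api.crossref.org/works, the from-index-date filter, cursor-based paging and the mailto parameter. The function names, the example email address and the page size are our own placeholders, so treat this as a starting point under those assumptions and not as an official client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for work records in the Crossref REST API.
API_BASE = "https://api.crossref.org/works"


def build_works_url(from_index_date, mailto, cursor="*", rows=1000):
    """Build a request URL for records indexed on or after from_index_date (YYYY-MM-DD)."""
    params = {
        "filter": f"from-index-date:{from_index_date}",  # date filter on index date
        "rows": rows,          # page size (placeholder value)
        "cursor": cursor,      # "*" asks for the first page of a cursor-based walk
        "mailto": mailto,      # identifies you as a 'polite' user
    }
    return f"{API_BASE}?{urlencode(params)}"


def fetch_json(url):
    """Download one page of results and parse it as JSON."""
    with urlopen(url, timeout=60) as response:
        return json.load(response)


def iter_updated_records(from_index_date, mailto, fetch=fetch_json):
    """Yield records page by page until the API returns no more items."""
    cursor = "*"
    while True:
        message = fetch(build_works_url(from_index_date, mailto, cursor))["message"]
        items = message.get("items", [])
        if not items:
            return
        yield from items
        cursor = message.get("next-cursor")
        if not cursor:
            return
```

For example, looping over iter_updated_records("2020-04-01", "you@example.org") would walk through everything indexed since the data file was produced, one record at a time. The fetch argument is there so the paging logic can be tried out with a stand-in function before making real requests.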

Questions, comments and feedback are welcome at support@crossref.org.

We thank AcademicTorrents.com for helping us make this data available. And we are grateful for the incredible efforts of everyone working to support research everywhere. Stay safe and well.

Page owner: Jennifer Kemp   |   Last updated 2020-April-09