
DOI crawler report

We test a broad sample of DOIs to ensure that they resolve. For each journal crawled, we select a sample equal to 5% of the journal's total DOIs, up to a maximum of 50 DOIs. The selected DOIs span prefixes and issues.
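The sampling rule can be illustrated with a short sketch. The function name `sample_size` and the decision to round fractional results up are assumptions made for illustration; the crawler's actual rounding behavior is not documented here.

```python
SAMPLE_PERCENT = 5   # 5% of a journal's DOIs, as described above
MAX_SAMPLE = 50      # upper limit on the sample, as described above


def sample_size(total_dois: int) -> int:
    """Return how many DOIs would be sampled for one journal.

    Assumption: fractional results are rounded up, so any journal
    with at least one DOI gets at least one DOI checked.
    """
    if total_dois < 0:
        raise ValueError("total_dois must not be negative")
    # Integer ceiling of total_dois * 5 / 100, avoiding float rounding.
    five_percent = (total_dois * SAMPLE_PERCENT + 99) // 100
    return min(MAX_SAMPLE, five_percent)
```

Under these assumptions, a journal with 100 DOIs would have 5 checked, and any journal with 1,000 or more DOIs would reach the 50-DOI maximum.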

The results are recorded in crawler reports, which you can access from the depositor report expanded view. If a title has been crawled, the last crawl date is shown in the appropriate column. Crawled DOIs that generate errors will appear as a bold link:

DOI crawler report

Click Last Crawl Date to view a crawler status report for a title:

Crawler status report for a title

The crawler status report lists the following:

  • Total DOIs: total number of DOI names for the title in the system on the last crawl date
  • Checked: number of DOIs crawled
  • Confirmed: crawler found both the DOI and the article title on the landing page
  • Semi-confirmed: crawler found either the DOI or the article title on the landing page
  • Not Confirmed: crawler found neither the DOI nor the article title on the landing page
  • Bad: page contains known phrases indicating that the article is not available (for example, article not found, no longer available)
  • Login Page: crawler is prompted to log in, and no article title or DOI is found
  • Exception: indicates an error in the crawler code
  • httpCode: resolution attempt results in an HTTP error (such as 400, 403, 404, 500)
  • httpFailure: connection to the HTTP server failed
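To make the categories above more concrete, here is a minimal sketch of how a landing page might be sorted into one of them. It is not the crawler's actual code: the function `classify`, the login hints, the case-insensitive substring matching, and the order of the checks are all assumptions made for illustration. The Exception category is left out because it reflects an error inside the crawler itself.

```python
from typing import Optional

# Phrases taken from the examples in the list above.
BAD_PHRASES = ("article not found", "no longer available")
# Hypothetical hints for detecting a login prompt (assumption).
LOGIN_HINTS = ("log in", "sign in")


def classify(http_status: Optional[int], page_text: str, doi: str, title: str) -> str:
    """Return a report category for one crawled DOI.

    http_status is None when no connection could be made.
    The order of the checks below is an assumption.
    """
    if http_status is None:
        return "httpFailure"
    if http_status >= 400:
        return "httpCode"

    text = page_text.lower()
    if any(phrase in text for phrase in BAD_PHRASES):
        return "Bad"

    found_doi = doi.lower() in text
    found_title = title.lower() in text
    if found_doi and found_title:
        return "Confirmed"
    if found_doi or found_title:
        return "Semi-confirmed"
    if any(hint in text for hint in LOGIN_HINTS):
        return "Login Page"
    return "Not Confirmed"
```

For example, under these assumptions a page that returns status 200 and shows the article title but not the DOI would be counted as Semi-confirmed.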

Select each number to view details. Select re-crawl and enter an email address to crawl again.

Page owner: Isaac Farley   |   Last updated 2020-April-08