Blog

Citation Typing Ontology

I was happy to read David Shotton’s recent Learned Publishing article, Semantic Publishing: The Coming Revolution in scientific journal publishing, and see that he and his team have drafted a Citation Typing Ontology.*

Anybody who has seen me speak at conferences knows that I often like to proselytize about the concept of the “typed link”, a notion that hypertext pioneer Randy Trigg discussed extensively in his 1983 Ph.D. thesis. Basically, Trigg points out something that should be fairly obvious: a citation (i.e. “a link”) is not always a “vote” in favor of the thing being cited.
In fact, there are all sorts of reasons that an author might want to cite something. They might be elaborating on the item cited, they might be critiquing the item cited, they might even be trying to refute the item cited. (For an exhaustive and entertaining survey of the use and abuse of citations in the humanities, Anthony Grafton’s The Footnote: A Curious History is a rich source of examples.)
Unfortunately, the naive assumption that a citation is tantamount to a vote of confidence has become enshrined in everything from the way in which we measure scholarly reputation, to the way in which we fund universities and the way in which search engines rank their results. The distorting effect of this assumption is profound. If nothing else, it leads to a perverse situation in which people will often discuss books, articles, and blog postings that they disagree with without actually citing the relevant content, just so that they can avoid inadvertently conferring “whuffie” on the item being discussed. This can’t be right.
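To make the difference concrete, here is a minimal Python sketch that contrasts a raw citation count with a tally broken down by citation type. The relation labels, the `Citation` class and the sample data are all made up for illustration; they are not taken from the Citation Typing Ontology draft or from any real citation index.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical relation labels chosen for illustration only.
ENDORSING = {"elaborates", "supports"}


@dataclass(frozen=True)
class Citation:
    source: str    # the citing work
    target: str    # the cited work
    relation: str  # why the source cites the target


def raw_count(citations, target):
    """Count every citation of `target` as a vote, whatever its intent."""
    return sum(1 for c in citations if c.target == target)


def typed_tally(citations, target):
    """Break the citations of `target` down by relation type."""
    return Counter(c.relation for c in citations if c.target == target)


def endorsements(citations, target):
    """Count only the citations whose relation is treated as endorsing."""
    return sum(
        1 for c in citations
        if c.target == target and c.relation in ENDORSING
    )


# Made-up sample data: three works cite "paper-x" for different reasons.
citations = [
    Citation("paper-a", "paper-x", "elaborates"),
    Citation("paper-b", "paper-x", "critiques"),
    Citation("paper-c", "paper-x", "refutes"),
]
```

With this sample, the untyped count credits "paper-x" with three citations, while the typed view shows that only one of them is an endorsement.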
Having said that, there has been a half-hearted attempt to introduce a gross level of link typology with the introduction of the “nofollow” link attribute, an initiative started by Google to try to address the increasing problem of “spamdexing”. But this is a pretty ham-fisted form of link typing, particularly in the way it is implemented by the Wikipedia, where Crossref DOI links to formally published scholarly literature have a “nofollow” attribute attached to them but, inexplicably, items with a PMID are not so hobbled (view the HTML source of this page, for example). Essentially, this means that the Wikipedia is a black hole of reputation. That is, it absorbs reputation (through links to the Wikipedia), but it doesn’t let reputation back out again. Hell, I feel dirty for even linking to it here ;-).
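For anyone who wants to check how a page types its links, here is a minimal sketch using only Python’s standard-library `html.parser`. It lists each anchor’s `href` and whether its `rel` attribute contains `nofollow`. The sample markup and both URLs are invented for illustration; they are not copied from any Wikipedia page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, is_nofollow) pairs for every <a> tag with an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel is a space-separated list of tokens; it may be missing or empty.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def classify_links(html):
    """Return a list of (href, is_nofollow) for the anchors in `html`."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Invented sample markup: one nofollow link and one ordinary link.
sample = (
    '<a rel="nofollow" href="https://doi.org/10.1000/example">a DOI link</a>'
    '<a href="https://example.org/pmid/12345">a PMID link</a>'
)
```

Calling `classify_links(sample)` pairs each link with a flag, which makes it easy to see which outbound links a page has marked as not passing on reputation.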
Anyway, scholarly publishers should certainly read Shotton’s article because it is full of good, practical ideas about what can be done with today’s technology to help us move beyond the “digital incunabula” that the industry is currently churning out. The sample semantic article that Shotton’s team created is inspirational, and I particularly encourage people to look at the source file for the ontology-enhanced bibliography, which reveals just how much more useful metadata can be associated with the humble citation.
And now I wonder whether CiteULike, Connotea, 2Collab or Zotero will consider adding support for the Citation Typing Ontology into their respective services?
* Disclosure:
a) I am on the editorial board of Learned Publishing
b) Crossref has consulted with David Shotton on the subject of semantically enhancing journal articles

An interview about “Author IDs”

Geoffrey Bilder – 2009 February 19

In Identifiers

Over the past few months there seems to have been a sharp upturn in general interest around implementing an “author identifier” system for the scholarly community. This, in turn, has meant that more people have been getting in touch with us about our nascent “Contributor ID” project. The other day, after seeing my comments in the above thread, Martin Fenner asked if he could interview me about the issue of author identifiers for his blog on Nature Networks, Gobbledygook. I agreed and he posted the interview the other day.

CURIE Syntax 1.0

Tony Hammond – 2009 January 19

In Identifiers

The W3C has recently (Jan. 16) released CURIE Syntax 1.0 as a Candidate Recommendation and is inviting implementations.

(Note that I made a fuller post here on CURIEs in which I mistook the Editor’s Draft (Oct. 23, ’08) for a Candidate Recommendation. Well, at least it’s got there now.)

Standard InChI Defined

Tony Hammond – 2009 January 17

In Identifiers, InChI

IUPAC has just released the final version (1.02) of its InChI software, which generates Standard InChIs and Standard InChIKeys. (InChI is the IUPAC International Chemical Identifier.)

The Standard InChI “removes options for properties such as tautomerism and stereoconfiguration”, so that a molecule will always generate the same stable identifier - a unique InChI - which facilitates “interoperability/compatibility between large databases/web searching and information exchange”. Note also that any “shortcomings in Standard InChI may be addressed using non-Standard InChI (currently obtainable using InChI version 1.02beta)”.

CURIEs - A Cure for URIs

Tony Hammond – 2008 December 03

In Identifiers

A quick straw poll of a few folks at London Online yesterday revealed that they had not heard of CURIEs. And there was I thinking that most everybody must have heard of them by now. 🙂 So anyway here’s something brief by way of explanation.

CURIE stands for Compact URI and does the signal job of rendering long and difficult-to-read URI strings into something more manageable. (URIs do have the particular gift of being “human transcribable” but in practice their length and the actual characters used in the URI strings tend to muddy things for the reader.) So given that the Web is built upon a bedrock of URIs, anything that makes URIs easier to handle is going to be an important contributor to our overall ease of interaction with the Web.

(Continues)
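As a rough illustration of the idea, here is a minimal Python sketch that expands a CURIE of the form `prefix:reference` by concatenating the URI bound to the prefix with the reference. The function name `expand_curie` and the `PREFIXES` table are assumptions made for this example. The sketch also accepts the bracketed “safe CURIE” form, and it deliberately ignores CURIEs that omit the prefix.

```python
# Example prefix bindings. In practice these come from the host document
# (for instance, namespace declarations); this table is illustrative only.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' (or '[prefix:reference]') to a full URI."""
    text = curie.strip()
    # A safe CURIE is wrapped in square brackets; unwrap it first.
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    # Split on the first colon only, since the reference may contain colons.
    prefix, separator, reference = text.partition(":")
    if not separator:
        raise ValueError(f"not a prefixed CURIE: {curie!r}")
    if prefix not in prefixes:
        raise KeyError(f"unknown CURIE prefix: {prefix!r}")
    return prefixes[prefix] + reference
```

For example, `expand_curie("dc:title", PREFIXES)` gives `http://purl.org/dc/elements/1.1/title`, which is a good deal longer than the six characters a reader actually has to look at.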

Five Years

Tony Hammond – 2008 July 28

In Identifiers

Oh wow! A rather remarkable plea here from Dan Brickley on the public-lod mailing list, which calls for the registrant of the dbpedia.org DNS entry to top it up with another 5+ years’ worth of clocktime. Some quotes:

“The idea of such a cool RDF namespace having only 6 months left on the DNS registration gives me the worries.”

“If you could add another 5-10 years to the DNS registration I’d sleep easier at night.”

Tombstone

Tony Hammond – 2008 May 23

In Identifiers

So, the big guns have decided that XRI is out. In a message from the TAG yesterday, variously noted as being “categorical” (Andy Powell, eFoundations) and a “proclamation” (Edd Dumbill, XML.com), the co-chairs (Tim Berners-Lee and Stuart Williams) had this to say:

“We are not satisfied that XRIs provide functionality not readily available from http: URIs. Accordingly the TAG recommends against taking the XRI specifications forward, or supporting the use of XRIs as identifiers in other specifications.”

NIH Mandate and PMCIDs

Ed Pentz – 2008 April 15

In Identifiers

The NIH Public Access Policy says “When citing their NIH-funded articles in NIH applications, proposals or progress reports, authors must include the PubMed Central reference number for each article” and the FAQ provides some examples of this:

Examples:

Varmus H, Klausner R, Zerhouni E, Acharya T, Daar A, Singer P. 2003. PUBLIC HEALTH: Grand Challenges in Global Health. Science 302(5644): 398-399. PMCID: 243493

Zerhouni, EA. (2003) A New Vision for the National Institutes of Health. Journal of Biomedicine and Biotechnology (3), 159-160. PMCID: 400215
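Anyone processing citations in this style will need to pull the PMCID back out of the string. Here is a minimal Python sketch that does so with a regular expression. The pattern is an assumption based only on the two examples above, where the number follows the label “PMCID:”. It also tolerates an optional “PMC” prefix before the digits, which is a guess at a likely variant rather than something the FAQ examples show.

```python
import re

# Matches "PMCID:" followed by optional whitespace, an optional "PMC"
# prefix (an assumed variant), and then the numeric identifier.
PMCID_PATTERN = re.compile(r"PMCID:\s*(?:PMC)?(\d+)")


def extract_pmcid(citation):
    """Return the numeric PMCID found in `citation`, or None if absent."""
    match = PMCID_PATTERN.search(citation)
    return match.group(1) if match else None


example = (
    "Zerhouni, EA. (2003) A New Vision for the National Institutes of "
    "Health. Journal of Biomedicine and Biotechnology (3), 159-160. "
    "PMCID: 400215"
)
```

The function returns the identifier as a string so that it can be stored or compared without worrying about integer formatting.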

ISO/CD 26324 (DOI)

Tony Hammond – 2008 February 22

In Identifiers

In my previous post about prism:doi I didn’t mention, or reference, the ongoing ISO work on DOI. Indeed, I hadn’t realized that the DOI site now has a status update on the ISO work:

“The DOI® System is currently being standardised through ISO. It is expected that the process will be finalised during 2008. In December 2007 the Working Group for this project approved a final draft as a Committee Draft (standard for voting) which is now being processed by ISO. Copies of the Committee Draft (SC9N475) and an accompanying explanatory document detailing issues dealt with during the standards process (SC9N474) are provided here for information.

BISG Paper on Identifying Digital Book Content

Ed Pentz – 2008 January 14

In Identifiers

BISG and BIC have published a discussion paper called “The identification of digital book content” - https://web.archive.org/web/20090920075334/http://www.bisg.org/docs/DigitalIdentifiers_07Jan08.pdf. The paper discusses ISBN, ISTC and DOI amongst other things and makes a series of recommendations which basically say to consider applying DOI, ISBN and ISTC to digital book content. The paper highlights in a positive way that DOI and ISBN are different but can work together (the idea of the “actionable ISBN” and aiding discovery of content). However, it doesn’t go into much depth on any of the issues or really explain how all these identifiers would work together and the critical role that metadata plays.