RE: [metadata] FYI: BIBTEX Update at the LoC from Bill Kasdorf on 2013-12-08 (public-digipub-ig@w3.org from December 2013)

From: Bill Kasdorf <bkasdorf@apexcovantage.com>
Date: Sun, 8 Dec 2013 21:41:24 +0000
To: Robert Sanderson <azaroth42@gmail.com>
CC: Ivan Herman <ivan@w3.org>, Tim Clark <tim_clark@harvard.edu>, "W3C Digital Publishing IG" <public-digipub-ig@w3.org>
Message-ID: <6d7399e7725242bda76d7247d670dd54@CO2PR06MB572.namprd06.prod.outlook.com>
All true, but I am still a big proponent of the DOI. ;-)

Characterizing it as based on a "resolving site" makes it sound like some website or server that could go down at any moment. It's based on the Handle system and it is an incredibly robust, global infrastructure. So yes, that's where the "actionable" part of DOI being an actionable identifier comes from, but I can attest that from the point of view of scholarly publishing it has had a revolutionary impact, and it works.

Yes, it is more for human redirection than anything else; that too is part of its power. The flexibility it provides publishers is the key to why it has worked so well for them. True, it may or may not take you directly to the content; that's not its purpose. Its purpose is to provide a persistent link to whatever the party controlling the DOI wants the link to resolve to. And to enable that to change whenever necessary, without the DOI itself ever needing to change. That's a feature, not a bug! ;-)
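
A minimal sketch of that indirection, assuming a toy in-memory table rather than the real Handle system. The DoiRegistry class, the DOI, and the example URLs are all made up for illustration:

```python
class DoiRegistry:
    """Toy stand-in for a DOI resolver: maps a fixed identifier to a changeable URL."""

    def __init__(self):
        self._targets = {}

    def register(self, doi, url):
        # First registration of the identifier.
        self._targets[doi] = url

    def update(self, doi, url):
        # The controlling party repoints the DOI; the DOI string itself never changes.
        if doi not in self._targets:
            raise KeyError(f"unregistered DOI: {doi}")
        self._targets[doi] = url

    def resolve(self, doi):
        return self._targets[doi]


registry = DoiRegistry()
doi = "10.9999/example.123"  # hypothetical DOI, not a registered one
registry.register(doi, "https://old-publisher.example/articles/123")
first_target = registry.resolve(doi)

# The content moves (say, the journal changes publisher); only the mapping is edited.
registry.update(doi, "https://new-publisher.example/content/123")
second_target = registry.resolve(doi)
```

Every citation that carries the DOI keeps working after the update, because the citations never held the publisher's URL in the first place.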

It also enables each RA (Registration Agency) to provide whatever services are appropriate for its particular needs and functions. CrossRef is by far the biggest RA, but they are not the only RA. The EU Publications Office is an RA; they use the DOI in a completely different way. The DOI is used in the entertainment industry, again in a way that suits those needs. CrossRef provides, fundamentally, cross-publisher linking services, and related services like plagiarism detection and identification of the latest version of an article (including alerting people to articles that have been retracted), all of which are very valuable to scholarly publishing. That's all built on the nature and flexibility of the DOI.

So again, your points are all completely valid and correct, and I understand why you take the point of view that you do.

But I have seen this provide such incredible value to scholarly publishing that I can't help but be a big fan of it. For what it is designed to do, and for how CrossRef has implemented it, it has been and continues to be hugely successful and an enormous benefit to the scholarly publishing industry, publishers, libraries, and scholars/researchers alike.

--Bill K

From: Robert Sanderson [mailto:azaroth42@gmail.com]
Sent: Sunday, December 08, 2013 12:09 PM
To: Bill Kasdorf
Cc: Ivan Herman; Tim Clark; W3C Digital Publishing IG
Subject: Re: [metadata] FYI: BIBTEX Update at the LoC


Hi Bill, all,

I'd like to stress the *today* part of the *essential*ness.

DOIs solve, to a certain extent, the persistence of URIs in scholarly communication by introducing a central resolving site (dx.doi.org) that takes the DOI and redirects you to the publisher's site. Which is great ... until dx.doi.org goes down and we have gazillions of broken links. It's simply kicking the can down the road a bit, rather than providing a web platform level solution.

As Crossref clearly won't give out their cash cow (the rules and mappings for the resolution), when they do go away, we'll be left with an *awful* mess on our hands.

And I'd also like to stress that it takes you to /something/, not necessarily the content. This makes it worthless for (e.g.) use as the identity for an Annotation's target. You don't know whether you'll end up at an HTML splash page, a 403 Forbidden, a PDF or any other representation. There are also DOIs that no longer resolve to anything (404), and DOIs that publishers create that aren't recognized by the resolver (we ran into several at Wiley last week in this category). There's also no differentiation made between the content and metadata about the content -- you can (poorly) do content negotiation for various metadata formats on the same URI as you use to get the ... whatever it is that the DOI resolves to for you at that point in time.
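
To make the ambiguity concrete, here is a rough sketch that labels where a lookup ended up and builds (without sending) a content-negotiation request against the same URI. The function names, the made-up DOI, and the choice of metadata media type are assumptions for illustration, not anything the resolver prescribes:

```python
from urllib.request import Request

RESOLVER = "http://dx.doi.org/"  # the resolving site discussed in this thread


def classify_resolution(status, content_type=""):
    """Label the final response of a DOI lookup from its HTTP status and Content-Type."""
    if status == 404:
        return "unresolved"
    if status in (401, 403):
        return "access denied"
    if status != 200:
        return "other"
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/html":
        return "html page (possibly a splash page)"
    if media_type == "application/pdf":
        return "pdf"
    return "other representation"


def metadata_request(doi, accept="application/rdf+xml"):
    """Build a request for metadata on the very URI that otherwise yields 'the content'.

    The default media type is an assumed example; which formats a given
    registration agency actually serves has to be checked separately.
    """
    return Request(RESOLVER + doi, headers={"Accept": accept})


req = metadata_request("10.9999/example.123")  # hypothetical DOI
outcomes = [
    classify_resolution(200, "text/html; charset=utf-8"),
    classify_resolution(200, "application/pdf"),
    classify_resolution(403),
    classify_resolution(404),
]
```

The same identifier can produce any of these outcomes, and the metadata request differs from the content request only in its Accept header, which is the lack of differentiation described above.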

DOI is basically a slightly more refined bit.ly or other URL shortener and has exactly the same issues that relying on one of them would have. The "essentialness" is entirely political rather than technical -- publishers aren't good at persistent URIs and have less reason to invest in the web architecture itself as DOI is currently solving their problem for them.

I'm being very negative in this thread, I realize, but Bibframe and DOI are in my opinion two options to be avoided at all costs rather than embraced.

Rob


On Sat, Dec 7, 2013 at 5:09 AM, Bill Kasdorf <bkasdorf@apexcovantage.com> wrote:

Sure.



From the CrossRef website (http://www.crossref.org/):



>CrossRef is an association of scholarly publishers that develops shared infrastructure to support more effective scholarly communications. Our citation-linking network today covers over 60 million journal articles and other content items (book chapters, data, theses, technical reports) from thousands of scholarly and professional publishers around the globe.



>CrossRef has a critical mass of more than 2.8 million book DOIs registered. This means that there are about 138,000 titles from more than 90 publishers available for reference linking



CrossRef and the DOI are _totally essential_ to scholarly publishing today. Indispensable. Virtually universally depended on. I can't stress this strongly enough.



The CrossRef DOI is in virtually EVERY scholarly journal reference published today, and increasingly in book references. I wasn't kidding when I said gazillions. It's how the system works. The reason they are dominant in journal articles and not in books is that virtually all journal articles are online and few books are. Journal articles can be linked to directly, which is what the CrossRef DOI enables.



I'm a big advocate of DOIs for books. (I wrote a paper on this for CrossRef a couple of years ago.) BTW the DOI does not necessarily have to take you to the _content_ (as it typically does for journal articles); the publisher controls (and can change) the URL it points to, so book publishers can point to where you can _obtain_ the book (or chapter). That includes what is called "Multiple Resolution" (e.g. resolving to a menu that lets you choose to get the book from Apple, Amazon, B&N, Kobo, or the publisher directly; you pick).



--Bill Kasdorf



-----Original Message-----
From: Ivan Herman [mailto:ivan@w3.org]
Sent: Saturday, December 07, 2013 4:58 AM
To: Bill Kasdorf
Cc: Tim Clark; W3C Digital Publishing IG
Subject: Re: [metadata] FYI: BIBTEX Update at the LoC



Interesting. As a (former) scholarly researcher, i.e., publisher, I never came across CrossRef directly, nor is it usual to refer to CrossRef in scholarly references or in systems like Mendeley or Zotero. Can you send some references around?



Is this also relevant for scholarly book publishing?



Ivan



On 07 Dec 2013, at 10:53 , Bill Kasdorf <bkasdorf@apexcovantage.com> wrote:



> The key issue re bibliographic metadata for scientific journal
> publishing is CrossRef metadata and the DOI, which provide
> cross-publisher linking and other services (identification of most
> recent version via CrossMark, plagiarism detection, etc.). It is
> essential and ubiquitous in the scholarly journal space, and now
> increasingly used for scholarly books (CrossRef already has millions
> of book DOIs, at both the title and chapter level . . . and a
> gazillion, maybe a gazillion and a half, journal DOIs). These CrossRef
> DOIs appear in most citations of journal articles, and some publishers
> refresh their citations frequently to capture newly registered
> articles that are cited in already-published articles that didn't have
> DOIs when those articles were originally published. This is important
> for both the publishing and library worlds. CrossRef has a basic set
> of required metadata that enables DOI registration and link
> resolution, and accommodates much more metadata than the required
> minimum. --Bill Kasdorf

>

> From: Tim Clark [mailto:tim_clark@harvard.edu]

> Sent: Thursday, December 05, 2013 8:09 AM

> To: Ivan Herman

> Cc: W3C Digital Publishing IG

> Subject: Re: [metadata] FYI: BIBTEX Update at the LoC

>

> Agree this effort is entirely and importantly relevant, and there are others, such as CiTO, the citation ontology, as well. I actually don't see any particular separation - there is at a minimum an intersection.

>

> If you look at scientific journal publishing, what is the difference between bibliographic info at a publisher's website and at, for example, NLM (National Library of Medicine)?

>

> NLM has in addition to the "pure" bibliographic metadata, a lot of search-oriented stuff like MeSH terms; the abstracts; and interesting sort of "hidden" metadata like "most similar to what other publications".

>

> No doubt publishers have a lot of process-oriented metadata, and there is likely other stuff I know nothing about. But at least there is an important intersection set between libraries and publishers. Front matter of books always has ISBN, LOC or Brit Lib catalog number, etc. and you can expand out on common stuff from there.

>

> Tim Clark

>

> Director, Biomedical Informatics Core, Massachusetts General Hospital

> Instructor in Neurology, Harvard Medical School

>

>

>

> On Dec 5, 2013, at 7:44 AM, Ivan Herman <ivan@w3.org> wrote:

>

>

> I am not sure this is directly relevant to the Metadata Task Force discussion, but it may be of interest nevertheless:

>

> http://www.loc.gov/bibframe/media/updateforum-nov22-2013.html

>

> contains a fairly long video on LoC's BIBFRAME initiative. Yes, it is library metadata, not publishers' metadata, but I guess one of the challenges in general is how to bring those together.

>

> Eric Miller, who is one of the developers (and, actually, who led the Semantic Web Activity at W3C until 2007) makes a very high level case for the usage of a BIBFRAME-like structure (starting around 49:00 in the video). His talk lacks technical details for my taste, but I guess that was the nature of the audience...

>

> Ivan

>

> ----

> Ivan Herman, W3C

> Digital Publishing Activity Lead

> Home: http://www.w3.org/People/Ivan/

> mobile: +31-641044153

> GPG: 0x343F1A3D

> FOAF: http://www.ivan-herman.net/foaf






----

Ivan Herman, W3C

Digital Publishing Activity Lead

Home: http://www.w3.org/People/Ivan/

mobile: +31-641044153

GPG: 0x343F1A3D

FOAF: http://www.ivan-herman.net/foaf
Received on Sunday, 8 December 2013 21:42:11 UTC