Re: [metadata] FYI: BIBTEX Update at the LoC from Robert Sanderson on 2013-12-08 (public-digipub-ig@w3.org from December 2013)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Sun, 8 Dec 2013 10:08:36 -0700
To: Bill Kasdorf <bkasdorf@apexcovantage.com>
Cc: Ivan Herman <ivan@w3.org>, Tim Clark <tim_clark@harvard.edu>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <CABevsUEDpTJvtbnxOAQVPUuED8fX0Q5_CbHRr1JyqkBcX6nuWQ@mail.gmail.com>

Hi Bill, all,

I'd like to stress the *today* part of the *essential*ness.

DOIs solve, to a certain extent, the persistence of URIs in scholarly
communication by introducing a central resolving site (dx.doi.org) that
takes the DOI and redirects you to the publisher's site.  Which is great
... until dx.doi.org goes down and we have gazillions of broken links.
It's simply kicking the can down the road a bit, rather than providing a
web platform level solution.
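
To make the dependency concrete, here is a minimal Python sketch of how a
citation link is formed from a DOI. The helper names and the sample DOIs are
made up for illustration; only the resolver host (dx.doi.org) comes from the
paragraph above. Every link built this way shares that one host, so they all
break together if it does.

```python
from urllib.parse import quote, urlsplit

# Resolver host named in this thread; everything else here is illustrative.
RESOLVER = "http://dx.doi.org/"


def doi_to_resolver_url(doi: str) -> str:
    """Build the link a reader would follow for a DOI (hypothetical helper)."""
    # Keep "/" unescaped: it separates the DOI prefix from the suffix.
    return RESOLVER + quote(doi.strip(), safe="/")


def hosts_depended_on(dois):
    """Return the set of hosts that a list of DOI links relies on."""
    return {urlsplit(doi_to_resolver_url(d)).netloc for d in dois}


# Made-up DOIs, for illustration only.
sample = ["10.1234/example.1", "10.5678/another-example"]
print(hosts_depended_on(sample))
```

However many DOIs go in, the set of hosts that comes out has a single member,
which is the single point of failure being described.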

As Crossref clearly won't give out their cash cow (the rules and mappings
for the resolution), when they do go away, we'll be left with an *awful*
mess on our hands.

And I'd also like to stress that it takes you to /something/, not
necessarily the content.  This makes it worthless for (e.g.) use as the
identity for an Annotation's target.  You don't know whether you'll end up
at an HTML splash page, a 403 forbidden, a PDF or any other representation.
There are also DOIs that no longer resolve to anything (404), and DOIs
that publishers create that aren't recognized by the resolver (we ran into
several at Wiley last week in this category).  There's also no
differentiation made between the content and metadata about the content --
you can (poorly) do content negotiation for various metadata formats on the
same URI as you use to get the ... whatever it is that the DOI resolves to
for you at that point in time.
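
As a rough illustration of both problems, the following Python sketch builds
(but does not send) a content-negotiation request against the resolver and
labels the kinds of response that might come back. The helper names, the
category labels, the sample DOI, and the choice of application/rdf+xml as the
requested metadata format are assumptions made for illustration, not a
description of what any particular publisher returns.

```python
from urllib.request import Request


def metadata_request(doi: str, media_type: str = "application/rdf+xml") -> Request:
    """Prepare a request asking the resolver for metadata (not sent here)."""
    # Same URI as the one used to reach the content; only Accept differs.
    return Request("http://dx.doi.org/" + doi, headers={"Accept": media_type})


def classify(status: int, content_type: str) -> str:
    """Label a final response after redirects (hypothetical categories)."""
    if status == 404:
        return "dead DOI"
    if status in (401, 403):
        return "access denied"
    if status != 200:
        return "other failure"
    ctype = content_type.split(";")[0].strip().lower()
    if ctype == "application/pdf":
        return "PDF"
    if ctype in ("text/html", "application/xhtml+xml"):
        return "HTML (splash page or article, cannot tell)"
    return "some other representation"


req = metadata_request("10.1234/example.1")  # made-up DOI
print(req.full_url, req.get_header("Accept"))
print(classify(200, "text/html; charset=utf-8"))
print(classify(403, "text/html"))
```

Note that the URI in the request is identical whether you want the content or
the metadata, and that a successful HTML response cannot be told apart from a
splash page by its status and media type alone.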

DOI is basically a slightly more refined bit.ly or other URL shortener and
has exactly the same issues that relying on one of them would have.  The
"essentialness" is entirely political rather than technical -- publishers
aren't good at persistent URIs and have less reason to invest in the web
architecture itself as DOI is currently solving their problem for them.

I'm being very negative in this thread, I realize, but Bibframe and DOI are
in my opinion two options to be avoided at all costs rather than embraced.

Rob



On Sat, Dec 7, 2013 at 5:09 AM, Bill Kasdorf <bkasdorf@apexcovantage.com> wrote:

>  Sure.
>
>
>
> From the CrossRef website (http://www.crossref.org/):
>
>
>
> >CrossRef is an association of scholarly publishers that develops shared
> infrastructure to support more effective scholarly communications. Our
> citation-linking network today covers *over 60 million journal articles
> and other content items (books chapters, data, theses, technical reports)*
> from thousands of scholarly and professional publishers around the globe.
>
>
>
> >CrossRef has a critical mass of more than *2.8 million book DOIs* registered.
> This means that there are about 138,000 titles from more than 90 publishers
> available for reference linking
>
>
>
> CrossRef and the DOI are _*totally essential*_ to scholarly publishing
> today. Indispensable. Virtually universally depended on. I can’t stress
> this strongly enough.
>
>
>
> The CrossRef DOI is in virtually EVERY scholarly journal reference
> published today, and increasingly in book references. I wasn’t kidding when
> I said gazillions. It’s how the system works. The reason they are dominant
> in journal articles and not in books is that virtually all journal articles
> are online and few books are. Journal articles can be linked to directly,
> which is what the CrossRef DOI enables.
>
>
>
> I’m a big advocate of DOIs for books. (I wrote a paper on this for
> CrossRef a couple of years ago.) BTW the DOI does not necessarily have to
> take you to the _*content*_ (as it typically does for journal articles);
> the publisher controls (and can change) the URL it points to, so book
> publishers can point to where you can _*obtain*_ the book (or
> chapter)—including what is called “Multiple Resolution” (e.g. to resolve to
> a menu that lets you choose to get the book from Apple, Amazon, B&N, Kobo,
> or the publisher directly—you pick).
>
>
>
> --Bill Kasdorf
>
>
>
> -----Original Message-----
> From: Ivan Herman [mailto:ivan@w3.org]
> Sent: Saturday, December 07, 2013 4:58 AM
> To: Bill Kasdorf
> Cc: Tim Clark; W3C Digital Publishing IG
> Subject: Re: [metadata] FYI: BIBTEX Update at the LoC
>
>
>
> Interesting. As a (former) scholarly researcher, i.e., publisher, I have not
> come across CrossRef directly, nor is it usual to refer to CrossRef
> in scholarly references or in systems like Mendeley or Zotero. Can you send
> some references around?
>
>
>
> Is this also relevant for scholarly book publishing?
>
>
>
> Ivan
>
>
>
> On 07 Dec 2013, at 10:53, Bill Kasdorf <bkasdorf@apexcovantage.com> wrote:
>
>
>
> > The key issue re bibliographic metadata for scientific journal
> > publishing is CrossRef metadata and the DOI, which provide
> > cross-publisher linking and other services (identification of most
> > recent version via CrossMark, plagiarism detection, etc.). It is
> > essential and ubiquitous in the scholarly journal space, and now
> > increasingly used for scholarly books (CrossRef already has millions
> > of book DOIs, at both the title and chapter level . . . and a
> > gazillion, maybe a gazillion and a half, journal DOIs). These CrossRef
> > DOIs appear in most citations of journal articles, and some publishers
> > refresh their citations frequently to capture newly registered
> > articles that are cited in already-published articles that didn’t have
> > DOIs when those articles were originally published. This is important
> > for both the publishing and library worlds. CrossRef has a basic set
> > of required metadata that enables DOI registration and link
> > resolution, and accommodates much more metadata than the required
> > minimum.
> >
> > --Bill Kasdorf
>
> >
>
> > From: Tim Clark [mailto:tim_clark@harvard.edu]
> > Sent: Thursday, December 05, 2013 8:09 AM
> > To: Ivan Herman
> > Cc: W3C Digital Publishing IG
> > Subject: Re: [metadata] FYI: BIBTEX Update at the LoC
>
> >
>
> > Agree this effort is entirely and importantly relevant, and there are
> > others, such as CiTO, the citation ontology, as well. I actually don't see
> > any particular separation - there is at a minimum an intersection.
> >
> > If you look at scientific journal publishing, what is the difference
> > between bibliographic info at a publisher's website and at, for example,
> > NLM (National Library of Medicine)?
> >
> > NLM has, in addition to the "pure" bibliographic metadata, a lot of
> > search-oriented stuff like MeSH terms; the abstracts; and interesting sort
> > of "hidden" metadata like "most similar to what other publications".
> >
> > No doubt publishers have a lot of process-oriented metadata, and there
> > is likely other stuff I know nothing about. But at least there is an
> > important intersection set between libraries and publishers. Front matter
> > of books always has ISBN, LOC or Brit Lib catalog number, etc. and you can
> > expand out on common stuff from there.
>
> >
>
> > Tim Clark
> >
> > Director, Biomedical Informatics Core, Massachusetts General Hospital
> > Instructor in Neurology, Harvard Medical School
> >
> > On Dec 5, 2013, at 7:44 AM, Ivan Herman <ivan@w3.org> wrote:
>
> >
>
> >
>
> > I am not sure this is directly relevant to the Metadata Task Force
> > discussion, but it may be of interest nevertheless:
> >
> > http://www.loc.gov/bibframe/media/updateforum-nov22-2013.html
> >
> > contains a fairly long video on LoC's BIBFRAME initiative. Yes, it is
> > library metadata, not publishers' metadata, but I guess one of the
> > challenges in general is how to bring those together.
> >
> > Eric Miller, who is one of the developers (and, actually, who led the
> > Semantic Web Activity at W3C until 2007) makes a very high level case for
> > the usage of a BIBFRAME-like structure (starting around 49:00 in the video).
> > His talk lacks technical details for my taste, but I guess that was the
> > nature of the audience...
>
> >
>
> > Ivan
> >
> > ----
> > Ivan Herman, W3C
> > Digital Publishing Activity Lead
> > Home: http://www.w3.org/People/Ivan/
> > mobile: +31-641044153
> > GPG: 0x343F1A3D
> > FOAF: http://www.ivan-herman.net/foaf
>
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> FOAF: http://www.ivan-herman.net/foaf
>
Received on Sunday, 8 December 2013 17:09:06 UTC