Re: DOI and other identifiers from Robert Sanderson on 2016-05-09 (public-annotation@w3.org from May 2016)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Mon, 9 May 2016 08:54:00 -0700
To: Dan Whaley <dwhaley@hypothes.is>
Cc: Web Annotation <public-annotation@w3.org>
Message-ID: <CABevsUFKXqEVf=3_09oLsMZ8hgf-Dj_NqtBuEVc+md+EALNmSA@mail.gmail.com>
On Sat, May 7, 2016 at 11:04 AM, Dan Whaley <dwhaley@hypothes.is> wrote:

> On Fri, May 6, 2016 at 12:02 PM, Robert Sanderson <azaroth42@gmail.com>
> wrote:
>
>
>> * Is the DOI the canonical identifier for the Annotation?
>>
>
> So I may be off base here, but I think there are perhaps two different
> senses of the word "canonical" at play here.
>
> From an annotation systems perspective, it seems unlikely that the DOI is
> ever going to be canonical in the sense that it becomes the *primary
> identifier* replacing the one we minted originally.
>

Then can someone please explain the value proposition if the DOI is not the
primary/canonical identifier? What's wrong with just using the good old
HTTP URI?

> We'll want to use a consistent identifier for all our annotations
> internally, not different ones depending on whether a DOI was issued.
>  (What if someone captured the URL of the annotation *prior* to the DOI
> issuance?  We can't ourselves fail to resolve the "old" address of the
> annotation.)  I assume there may even be performance issues underlying
> this.  This is perhaps more true of annotation systems than regular
> publications because annotations would be born without DOIs and presumably
> get them later, and I'm not imagining that would change.  Otherwise we'd be
> issuing DOIs for every trivial annotation from inception, and that indeed
> would be massive.
>

Performance is definitely an issue, as with any short URL redirection
system.
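
To make the extra hop concrete, here is a minimal Python sketch that counts
requests for a direct link versus one that goes through a resolver first. The
lookup tables, the example.org URL, the DOI-style string and the function name
are all made up for illustration; nothing here touches the network or a real
resolver.

```python
# Hypothetical in-memory tables; not real services or real identifiers.
RESOLVER = {"doi:10.9999/example.1": "https://annotations.example.org/a/abc123"}
ORIGIN = {"https://annotations.example.org/a/abc123": "annotation body"}


def fetch(link):
    """Return (content, requests made), following resolver redirects first."""
    hops = 0
    while link in RESOLVER:  # each resolver entry costs one extra round trip
        link = RESOLVER[link]
        hops += 1
    return ORIGIN[link], hops + 1  # plus the final request to the origin


direct = fetch("https://annotations.example.org/a/abc123")
via_resolver = fetch("doi:10.9999/example.1")
```

In this toy setup the resolver path always costs one more request than the
direct link, which is the overhead being discussed.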


> Even assuming a DOI has been issued for an annotation, if someone else
> comes along and wants to tweet out the same annotation, and exposes our
> share dialog to get the link, are they going to care whether there is a
> DOI, and if there is one, is that the one that they necessarily want to
> use? (I'm presuming if a DOI was issued, we'd show both the original style,
> and also the DOI side by side) The tweeter doesn't really care about
> permanence, and they'd probably just opt for the link style they're
> familiar with (in our case, hyp.is/<TOKEN>).  That one will also be more
> performant since it doesn't have to go through a resolver first.
>

Indeed. DOIs are just a social construct, and if society doesn't see a need
for them, then forcing their use is a waste of everyone's time.



> I'll note that even for classic publishers, DOIs aren't even always
> canonical in the sense that the publisher themselves uses them internally.
> For instance, eLife doesn't use the DOI when they reference their articles,
> like this one:  http://elifesciences.org/content/5/e13273
>

Right. DOIs are a safeguard against publishers not being able to, indeed
not wanting to, claim that their URLs are persistent. They pay for the
perception that their content is stable by handing it off to the DOI
framework.  But in fact there are many, MANY DOIs that no longer resolve
to anything.  Nor are DOIs all scholarly or even valuable -- there are DOIs
for test pages, porn and whatever else exists on the web.


>> If it isn't, then why mint one at all? To me, it defeats the purpose to
>> have a DOI if it's not the canonical identifier for the resource.  The
>> value of DOIs is when the publisher of the content changes, the citations
>> and references remain the same.
>>
>
Which brings me back to this question...

>> From my perspective, no change is needed, but it would be good to discuss :)
>>
>
... and this perspective.

The community that needs them is very small, relatively speaking. The
chance of multiple communities doing the same thing differently is very
small. And the cost if that did happen is also practically zero. Even if
clients had the DOI, they wouldn't actually do anything with it.

All that said, the library community has already developed a model for
exactly this, across the many different types of identifier in use,
including DOIs.

  * http://id.loc.gov/ontologies/bibframe.html#c_Identifier
  * http://www.loc.gov/bibframe/docs/pdf/bf2-identifiers-apr2016.pdf
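
As a rough illustration of that kind of model (not the BIBFRAME vocabulary
itself), the Python sketch below keeps the originally minted HTTP URI as the
primary identifier and hangs any later identifier, such as a DOI, off the
annotation as an additional typed value. The class names, field names, scheme
labels and the example URI and DOI are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identifier:
    scheme: str  # hypothetical label such as "doi"
    value: str


@dataclass
class Annotation:
    uri: str  # the HTTP URI minted at creation; it never changes
    identified_by: list = field(default_factory=list)

    def add_identifier(self, scheme, value):
        """Attach an additional identifier without touching the primary URI."""
        ident = Identifier(scheme, value)
        if ident not in self.identified_by:
            self.identified_by.append(ident)

    def matches(self, scheme, value):
        """True if the given identifier refers to this annotation."""
        if scheme == "uri" and value == self.uri:
            return True
        return Identifier(scheme, value) in self.identified_by


# An annotation is born with only its URI and gains a DOI later.
anno = Annotation(uri="https://annotations.example.org/a/abc123")
anno.add_identifier("doi", "10.9999/example.1")
```

With this shape, a link captured before the DOI was issued still matches the
same annotation, and a DOI issued later is just one more entry in the list.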

Rob

-- 
Rob Sanderson
Semantic Architect
The Getty Trust
Los Angeles, CA 90049
Received on Monday, 9 May 2016 15:54:28 UTC