Re: Dereferencing, Re: Jotting down some discussion topics from Marcos Caceres on 2016-09-21 (public-digipub-ig@w3.org from September 2016)

From: Marcos Caceres <marcos@marcosc.com>
Date: Wed, 21 Sep 2016 03:23:55 -0400
To: Peter Krautzberger <peter.krautzberger@mathjax.org>, Leonard Rosenthol <lrosenth@adobe.com>, Ivan Herman <ivan@w3.org>
Cc: Michael Smith <mike@w3.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <CAAci2aD+uBggdXBeu+5_RVoEz8pHYMiq7gsKVAz4+FBBeymmyw@mail.gmail.com>

On September 21, 2016 at 4:30:17 PM, Leonard Rosenthol
(lrosenth@adobe.com) wrote:
> Also remember, Marcos, that the identifier for a PWP is _NOT_ always a URL.

I completely agree. Using URLs as identifiers is generally not a great
idea, because URLs are so volatile: domains can be lost, swapped,
abandoned, or deleted. There is also the "but what will it return?"
(dereferencing) problem, which is why I don't think we want to go
there... but here we are :)

Here is a real life example from one of my favorite books about HTML:

http://diveintohtml5.info/

There is a dramatic history around that book and its author (which I
won't go into, but it would make for a great book!). The book used to
be hosted at a different URL, until the original author rage-deleted
the domain along with all traces of their online persona.

The web dev community found a way to bring the book back to life,
thanks to its CC-BY-3.0 license and, IIRC, archive.org.

The book is also published in physical form as:
https://www.amazon.com/HTML5-Up-Running-Mark-Pilgrim/dp/0596806027

With identifiers:
ISBN-13: 978-0596806026
ISBN-10: 0596806027

Anyway, the point is... same book, different URL. URLs can't reliably
identify things, and when they are used that way, they do it badly
(e.g., XML namespaces).

> It could be
> w3id, a DOI or an ISBN. We need a term that works for all of those types of identifiers. (since
> we also have an “off the web” manifestation, that I know you hate).

I don't hate it (sorry if I came across that way).

Because URLs are not stable, it's desirable to separate the identifying
aspects from the protocol used to acquire a publication. That is,
http(s) is the protocol used to acquire a resource that self-identifies
by a w3id, a DOI, an ISBN, or whatever. In the container case, the
container holds resource(s) that together form a publication
identified by a w3id, a DOI, or an ISBN.
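
To make that separation concrete, here is a minimal sketch in Python.
It is only an illustration under assumptions: the names Identifier,
Acquired, and same_publication are made up for this example (they come
from no spec), and old-host.example is a placeholder for the book's
original URL, which is not given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """What a publication says it is (hypothetical model)."""
    scheme: str  # e.g. "isbn", "doi", "w3id"
    value: str


@dataclass(frozen=True)
class Acquired:
    """A copy of a publication plus where it happened to be fetched from."""
    identifier: Identifier  # self-declared by the resource or container
    source_url: str         # acquisition detail only, not identity


def same_publication(a: Acquired, b: Acquired) -> bool:
    # Identity is decided by the self-declared identifier, never by the URL.
    return a.identifier == b.identifier


isbn = Identifier("isbn", "978-0596806026")
# Placeholder URL standing in for the lost original location (assumption).
old = Acquired(isbn, "http://old-host.example/")
new = Acquired(isbn, "http://diveintohtml5.info/")

print(same_publication(old, new))  # True
```

The two copies differ as acquisitions (different source_url) but
compare as the same publication, which is the behavior the URL-only
approach cannot give you once a domain goes away.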

Received on Wednesday, 21 September 2016 07:24:26 UTC