RE: Dereferencing, Re: Jotting down some discussion topics from Bill Kasdorf on 2016-09-21 (public-digipub-ig@w3.org from September 2016)

From: Bill Kasdorf <bkasdorf@apexcovantage.com>
Date: Wed, 21 Sep 2016 09:15:54 +0000
To: Marcos Caceres <marcos@marcosc.com>, Peter Krautzberger <peter.krautzberger@mathjax.org>, Leonard Rosenthol <lrosenth@adobe.com>, Ivan Herman <ivan@w3.org>
CC: Michael Smith <mike@w3.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <CY1PR0601MB14228FDB9D798CE8FA7A8946DFF60@CY1PR0601MB1422.namprd06.prod.outlook.>

Just a reminder that in what was originally the "Identifiers TF" in DPUB, which later morphed into the "Locators TF", we quickly came to the conclusion that our task with (at that time) PWP and now (arguably) with WP was not to mandate a particular identifier; it was to provide a locator that could lead, through whatever means/routes necessary, to the essential content of the publication.

I still think that's our task.

Bill Kasdorf
VP and Principal Consultant | Apex CoVantage
p: 734-904-6252 | m: 734-904-6252
ISNI: http://isni.org/isni/0000000116490786

ORCiD: https://orcid.org/0000-0001-7002-4786

-----Original Message-----
From: Marcos Caceres [mailto:marcos@marcosc.com] 
Sent: Wednesday, September 21, 2016 3:24 AM
To: Peter Krautzberger; Leonard Rosenthol; Ivan Herman
Cc: Michael Smith; W3C Digital Publishing IG
Subject: Re: Dereferencing, Re: Jotting down some discussion topics

On September 21, 2016 at 4:30:17 PM, Leonard Rosenthol
(lrosenth@adobe.com) wrote:
> Also remember, Marcos, that the identifier for a PWP is _NOT_ always a URL.

I completely agree. Using URLs as identifiers is generally not a great idea, because URLs are so volatile: domains can be lost, swapped, abandoned, or deleted. There is also the "but what will it return?"
(dereferencing) problem, which is why I don't think we want to go there... but here we are :)

Here is a real life example from one of my favorite books about HTML:

http://diveintohtml5.info/

There is a dramatic history around that book and its author (which I won't go into, but it would make for a great book!). It used to be hosted at a different URL, until the original author rage-deleted the domain along with all traces of their online persona.

The web dev community found a way to bring the book back to life (thanks to its CC-BY-3.0 license) and, IIRC, archive.org.

The book is also published in physical form as:
https://www.amazon.com/HTML5-Up-Running-Mark-Pilgrim/dp/0596806027

With identifiers:
ISBN-13: 978-0596806026
ISBN-10: 0596806027

Anyway, the point is... same book, different URL. URLs can't identify things and when they do, they do it badly (e.g., XML namespaces).

> It could be
> w3id, a DOI or an ISBN. We need a term that works for all of those 
> types of identifiers. (since we also have an “off the web” manifestation, that I know you hate).

I don't hate it (sorry if I came across that way).

Because URLs are not stable, it's desirable to separate the identifying aspects from the protocol used to acquire a publication.
That is, http(s) is the protocol used to acquire a resource that self-identifies by a w3id, DOI, ISBN or whatever - or, in the container case, the container holds resource(s) that together form a publication identified by a w3id, a DOI, or an ISBN.
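
To make that separation concrete, here is a rough Python sketch. It is only an illustration, not a proposal for PWP: the Identifier and LocatorRegistry names are made up for this example, and "http://original-host.example/" is a placeholder standing in for the book's old (now gone) URL. The ISBN and the current URL are the ones given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """A stable name for a publication (hypothetical type for this sketch)."""
    scheme: str  # e.g. "isbn", "doi", "w3id"
    value: str


class LocatorRegistry:
    """Hypothetical table mapping an identifier to wherever it lives today."""

    def __init__(self):
        self._urls = {}

    def register(self, identifier, url):
        # Record one more place the publication can be acquired from.
        self._urls.setdefault(identifier, []).append(url)

    def retire(self, identifier, url):
        # Drop a URL that no longer works (domain lost, deleted, ...).
        self._urls[identifier].remove(url)

    def locate(self, identifier):
        # Return a copy so callers cannot mutate the registry's state.
        return list(self._urls.get(identifier, []))


# The identifier never changes...
book = Identifier("isbn", "9780596806026")

registry = LocatorRegistry()
# ...but the locators do. This first URL is a placeholder, not the real one.
registry.register(book, "http://original-host.example/")
registry.retire(book, "http://original-host.example/")
registry.register(book, "http://diveintohtml5.info/")

print(registry.locate(book))
```

The point of the sketch is just that the ISBN stays the key while the URLs behind it come and go; http(s) only shows up in the values, never in the identity.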

Received on Wednesday, 21 September 2016 09:16:27 UTC