Re: [urn] fragment identifiers from Juha Hakala on 2011-03-10 (uri@w3.org from March 2011)

From: Juha Hakala <juha.hakala@helsinki.fi>
Date: Thu, 10 Mar 2011 16:04:04 +0200
To: Julian Reschke <julian.reschke@gmx.de>
CC: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "uri@w3.org" <uri@w3.org>, urn@ietf.org
Message-ID: <4D78DA54.7070109@helsinki.fi>

Hello,

Julian Reschke wrote:
> On 10.03.2011 13:28, Juha Hakala wrote:
>> ...
>> Persistent identifiers will be used for multiple purposes, and by the
>> time we assign e.g. a URN to a resource, we have no idea which
>> resolution services will be needed in the (distant) future. Lifetime of
>> a PID may be centuries; applications and the functionality they offer
>> will change many times during such a period. And eventually even the
>> copyright protection of a document will expire ;-).
>> ...
> 
> I think that statement in itself rules out use of fragment identifiers. 
> At least if you want to stay in sync with the URI spec (RFC 3986).

Can you explain why this would be the case? Please see below why I find 
it difficult to agree.

>> Retrieving a representation is one of the key resolution services supplied
>> already. But there does not need to be a 1:1 relation between a URN (or
>> any other persistent identifier) and the URI (URL/URLs) it maps to via a
>> resolution service.
>> ...
> 
> Even if there *was* a one-to-one mapping, the representation could still 
> vary based on request header fields (content negotiation), and also over 
> time.

In the future, the applications preserving and delivering past digital 
resources will usually be long-term preservation systems (such as Ex 
Libris' Rosetta), hosted by national libraries / national archives or 
other organisations which are legally obliged to store certain types of 
documents (publications, radio and TV programs, government publications) 
for future generations.

Eventually, these systems will contain multiple versions of a resource, 
produced via migrations of successive versions of the resource. Each 
version (or manifestation, as we call them) must be kept to make roll-back 
possible, and will have its own identifier that will never change. When 
a new version is made, it will get a new identifier, even if the new and 
old documents have the same look and feel.

If a certain version of a resource has an internal structure, and the 
component parts have fragment-level persistent identifiers, then those 
identifiers will remain functional for this particular version of the 
resource. Earlier and later versions may not have a similar structure, 
but in that case they will not have a similar identifier architecture 
either.

From the national library's point of view I do accept the view that 
manifestations of works will change over time, but the links between 
identifiers and manifestations will not, at least in well-managed digital 
archives and URN namespaces. The URN given to the PDF version of Mr. Teppo 
Sarkamo's dissertation (http://urn.fi/URN:ISBN:978-952-10-6832-4) will 
never change. When a new version of the book is produced, it will get 
a different URN:ISBN.

One may of course argue that most systems in which URNs are to be used 
will not be built in this manner, and that therefore most identified 
resources will change in more or less subtle ways over time. My take 
on this is that different URN namespaces may / will have different 
policies, and this may have an impact on many things, including the 
usage of fragments. But there are namespaces where identifying fragments 
may make sense, even when done using the URI <fragment> functionality.

Juha
> 
> Best regards, Julian
> 

-- 

  Juha Hakala
  Senior advisor, standardisation and IT

  The National Library of Finland
  P.O.Box 15 (Unioninkatu 36, room 503), FIN-00014 Helsinki University
  Email juha.hakala@helsinki.fi, tel +358 50 382 7678

Received on Thursday, 10 March 2011 14:04:46 UTC