Re: fragment identifiers from Martin J. Dürst on 2011-03-10 (uri@w3.org from March 2011)

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Thu, 10 Mar 2011 19:14:12 +0900
To: Peter Saint-Andre <stpeter@stpeter.im>
CC: Andrew Newton <andy@hxr.us>, urn@ietf.org, "uri@w3.org" <uri@w3.org>
Message-ID: <4D78A474.5050002@it.aoyama.ac.jp>

Hello Peter,

I have cross-posted to the URI list, because I think it's important to 
get input from more experts. People on the URI list, this is about what 
to do (or not to do) about fragment identifiers in URNs, raised in the 
context of an update of RFC 2141.

On 2011/03/10 13:30, Peter Saint-Andre wrote:
> <hat type='individual'/>
>
> On 3/9/11 2:11 AM, "Martin J. Dürst" wrote:
>>
>> On 2011/03/09 13:51, Peter Saint-Andre wrote:

>> Anyway, from a higher-up view, RFC2141bis is defining the "urn:" URI
>> scheme, and URI scheme definitions in general are supposed to say
>> nothing (or just a little in some exceptional cases) on fragment
>> identifiers. The reason for this is that fragment identifiers are
>> defined per MIME Media Type, not per URI scheme.
>>
>> So if I have something like "urn:foo:bar:baz#here", then the urn spec
>> only has to say what "urn:foo:bar:baz" is supposed to mean, the meaning
>> of "here" is defined by whatever format I might get back when resolving
>> "urn:foo:bar:baz". If I have a browser that resolves (some) urns (I
>> don't know one, but there should be some), this is what already happens,
>> and it shouldn't and won't change. RFC2141bis doesn't have to say
>> anything for this to work.
>>
>> In case RFC2141bis tries to do anything else than the above, that would
>> be a very bad idea, and should be fixed quickly.
>
> Here is what RFC 3986 says:
>
>     The semantics of a fragment identifier are defined by the set of
>     representations that might result from a retrieval action on the
>     primary resource.  The fragment's format and resolution is therefore
>     dependent on the media type [RFC2046] of a potentially retrieved
>     representation, even though such a retrieval is only performed if the
>     URI is dereferenced.  If no such representation exists, then the
>     semantics of the fragment are considered unknown and are effectively
>     unconstrained.  Fragment identifier semantics are independent of the
>     URI scheme and thus cannot be redefined by scheme specifications.
>
> As far as I can see, the semantics of fragment identifiers in URNs would
> not be defined by media types because URNs are not generally resolved
> for the purpose of retrieving a representation.

"not generally" and "not" are not the same. Even for http: URIs, it's 
true that they are not always resolved. So in that sense, if I use
http://never_any_server_here.sw.it.aoyama.ac.jp/one/two/three
with some fragment identifier (I'm in control of sw.it.aoyama.ac.jp and 
make sure that there never is a server at 
never_any_server_here.sw.it.aoyama.ac.jp), then I'm indeed unconstrained.
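
A minimal sketch in Python of the point above, assuming only the standard
library's urllib.parse: the fragment separates from the rest of the URI in
the same way whatever the scheme. The helper name split_fragment and the
fragment "frag" on the http: URI are made up for illustration; the URN is
the example from the earlier message.

```python
from urllib.parse import urldefrag

def split_fragment(uri):
    """Return (URI of the primary resource, fragment) for any URI scheme.

    The split happens at the first '#'; the scheme is not consulted.
    """
    base, fragment = urldefrag(uri)
    return base, fragment

# An http: URI for which no server exists, and a urn: URI.
http_uri = "http://never_any_server_here.sw.it.aoyama.ac.jp/one/two/three#frag"
urn_uri = "urn:foo:bar:baz#here"

for uri in (http_uri, urn_uri):
    base, fragment = split_fragment(uri)
    print(f"primary resource: {base}  fragment: {fragment}")
```

Neither URI has to be dereferenced for the split; what the fragment
means is a separate question.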

On the other hand, for quite a few URNs, it would make a lot of sense to 
resolve them. Let's say I have set up some proxy or use some dedicated 
browser that helps me resolve some URNs. Then the paragraph from RFC 
3986 that you cite above clearly applies.
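
As a rough sketch of how that paragraph applies once a URN is resolved
(Python, standard library only): the fragment is interpreted according to
the media type of whatever representation comes back, never according to
the scheme. The table FRAGMENT_RULES, the function describe_fragment, and
the idea that a resolver hands back a media type string are all
assumptions made for illustration, not part of any URN resolver.

```python
from urllib.parse import urldefrag

# Hypothetical table: one-line description of fragment semantics per
# media type. Real definitions live in each media type's registration.
FRAGMENT_RULES = {
    "text/html": lambda frag: f"element identified by '{frag}'",
    "text/plain": lambda frag: f"position or range '{frag}' in the text",
}

def describe_fragment(uri, media_type=None):
    """Describe what the fragment of `uri` refers to.

    `media_type` is the type of the representation obtained by resolving
    the URI without its fragment, or None if nothing was retrieved.
    The URI scheme is deliberately not looked at.
    """
    _base, fragment = urldefrag(uri)
    if not fragment:
        return "no fragment"
    if media_type is None:
        # No representation: semantics unknown, effectively unconstrained.
        return "unconstrained"
    rule = FRAGMENT_RULES.get(media_type)
    if rule is None:
        return f"defined by the {media_type} registration"
    return rule(fragment)

# Same URN, three outcomes depending only on what resolution returns.
print(describe_fragment("urn:foo:bar:baz#here"))
print(describe_fragment("urn:foo:bar:baz#here", "text/html"))
print(describe_fragment("urn:foo:bar:baz#here", "application/pdf"))
```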

> Therefore, in the
> context of URNs, the semantics of the fragment would be considered
> unknown and would be effectively unconstrained (at least from the
> perspective of the 'urn:' URI scheme).

Non sequitur.

> 2141bis seems to imply that the semantics of the fragment identifier
> could be constrained by the definition of a particular URN namespace
> (despite the fact that they are not constrained by the 'urn:' URI scheme
> itself).

That would make at least some limited sense, if we could sort namespaces 
by whether they (maybe only occasionally) allow resolution, or whether 
they are absolutely and terminally never ever going to be used for 
resolution. But the last sentence from the paragraph you cite says:

                    Fragment identifier semantics are independent of the
    URI scheme and thus cannot be redefined by scheme specifications.

This means not only that the URN spec (which is just the definition of 
the 'urn:' URI scheme) cannot redefine fragment identifier semantics; it 
also seems to imply that scheme specifications (including the URN spec) 
cannot delegate such semantics to some subspaces of the scheme.

> I'm not sure what the use cases are here, but perhaps folks on
> the list could explain a bit more what they mean by reusing an
> identifier scheme that designates objects of such complexity that it is
> necessary to reference parts of the objects via fragment identifiers.

I'm looking forward to hearing from other people on this list, but 
essentially, even if there are very complex objects, there are always 
ways to identify components other than using a '#'.

Regards,   Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Thursday, 10 March 2011 10:15:16 UTC