RE: updating RFC 2718 (Guidelines for new URL schemes) from Hammond, Tony on 2004-09-06 (uri@w3.org from September 2004)

From: Hammond, Tony <T.Hammond@nature.com>
Date: Mon, 6 Sep 2004 11:06:48 +0100
To: 'Martin Duerst' <duerst@w3.org>, Larry Masinter <LMM@acm.org>, uri@w3.org
Cc: djz@corp.webtv.net, rpetke@wcom.net, 'Harald Tveit Alvestrand' <harald@alvestrand.no>, Tony Hansen <tony@att.com>, 'Paul Hoffman / IMC' <phoffman@imc.org>
Message-ID: <125F7834E11A5741A7D79412EE3504F90CE55882@UK1APPS2.nature.com>
> Yes. But URI schemes only describe the part before the "#". 
> Fragment identifiers are independent of URI schemes.
> 
> In other terms, the syntax that an URI scheme has to define 
> has to be a subset of absolute-URI in rfc2396bis.

I don't understand the basis for this assertion (that fragments are
independent of URI schemes). Just because schemes typically don't mention
fragments doesn't necessarily lead to this interpretation. My reading of
2396bis is that fragments are quite clearly integral components of URIs. I
note for example that URN expressly eschews the use of '#'. OTOH RFC2616 (as
far as I can see) entirely neglects to mention fragments in its URI syntax -
whoops! Anyway the actual wording in -bis is

	"Fragment identifier semantics are independent of the URI scheme and
thus cannot be redefined by scheme specifications."

It is thus the semantics _not_ the syntax that is scheme-independent. And I
believe that the original argument (re use or non-use of '#' in URIs) was
directed at a syntactic consideration.

One might also want to revisit some of the language in -bis, Section 3.5:
"representations that _might_ result from a retrieval", "media type
[RFC2046] of a _potentially_ retrieved representation", "use of a fragment
identifier component does _not_ imply that a retrieval action will take
place" [my emphases]. All sounds a bit iffy to me.


> > > - We also should recommend that any components of the generic
> > >    syntax (e.g. // for top level, / for hierarchy,...) are used
> > >    with the semantics defined in RFC 2396bis.
> >
> >@@@ I am also not sure that 2396bis goes so far as to actually require
> 
> It does not. That's why I used 'recommend' above, not 'require'.

You are right - my wrong. :)


> >"/" to be a hierarchical delimiter for a given scheme, but it rather
> >says that URI schemes that do not want to remain opaque must support
> >hierarchical processing. (I presume the "data" URI scheme would also
> >support hierarchical processing.:)
> 
> I think we will have to work on the details of the wording 
> here, to get it right.

Why? What is wrong with the present language? (And who is 'we'?) I still make the
point about the 'data' URI scheme - a very valuable and underused scheme
IMO.


> > > >I think it's useful if schemes are clear about whether
> > > >(or under what circumstance) the 'resource' might be something
> > > >that returns a (body/entity/...?) which has a Media Type, and can
> > > >be used with fragment identifiers in their conventional definition.
> >
> >@@@ What is the media type of a non-dereferenceable URI, as might be
> >minted in a typical RDF application? This all seems very metaphysical
> >to me. Or does the media type only come into play if a URI is
> >dereferenced?
> 
> The RDF spec has some language somewhere that says something 
> like that a fragment id is to be interpreted as if the entity 
> was of type application/rdf+xml in case no entity can be 
> retrieved. That doesn't really mean too much, it just means 
> "if you don't have a media type to figure out what a fragment 
> identifier means, it means whatever it means". Metaphysical 
> if you want to call it that, or practical "nothing known in 
> particular", if you prefer that.
> 
> In the general case, I think this is quite untread territory.

I only used RDF for illustrative purposes. I stand by my comments above
regarding the metaphysics of non-dereferenced fragments.


> > > >2.3 Demonstrated utility
> > > >
> > > >I'd like to suggest that we require something stronger: that new 
> > > >URI schemes have demonstratable, new, long-lived
> > > >utility:
> > > >
> > > >   Because URI schemes are a single, global namespace, the
> > > >   unrestricted registration of many new URI schemes can
> > > >   clutter implementation space, and possibly lead to
> > > >   contention for "short names". For this reason, new
> > > >   URI schemes should have a clear utility to the broad
> > > >   Internet community, and provide some means of identifying
> > > >   resources that is not already available with previously
> > > >   registered URI schemes.
> > > >
> > > >Perhaps this is controversial :)
> > >
> > > It seems to go into the opposite direction of what was discussed at
> > > the BOF, namely to relax the rules. But I guess this could work out
> > > by saying that the above is desirable, and there should be potential
> > > for it, rather than having it as a hard-and-fast rule.
> >
> >@@@ Would have to agree with this last comment. I see no problem with
> >cluttering of implementation space. An implementation should be able to
> >detect a string being used within a URI context, parse it out to see
> >that it conforms to the generic URI syntax, and then discover whether
> >it has any rules to dereference such a URI. Of course, most
> >implementations do not recognize generic URIs but only specific schemes
> >which are hard coded into them. I am not a little sceptical of there
> >being some kind of gold rush on URI scheme names.
> 
> Do you mean "not even a little sceptical", or "not just a 
> little, but a lot sceptical"? Or in other words, do you think 
> we will have a gold rush, or we won't?

My main point here is that it would be wonderful if web apps/toolkits/etc
could actually recognize a generic URI in the wild. Then figure out if they
could do anything further with it. Already the recognition of a global
identifier is of significant importance in according identity and comparing
with other identifiers (which, yes, means canonicalization, etc.) in
description languages.
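
As a rough illustration of that two-step idea (recognize first, then see
whether anything further can be done), here is a minimal Python sketch. It
assumes the generic-syntax regular expression given in Appendix B of RFC
2396. The function names, the result labels and the DEREFERENCEABLE set are
hypothetical, invented for this example only; nothing here is taken from an
existing toolkit.

```python
import re

# Generic-syntax regular expression from Appendix B of RFC 2396.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

# Scheme names: a letter followed by letters, digits, "+", "-" or ".".
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*$")

# Hypothetical: the schemes this imaginary toolkit can dereference.
DEREFERENCEABLE = {"http", "https", "ftp"}


def split_uri(text):
    """Split a string into the five generic URI components."""
    m = URI_RE.match(text)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def classify(text):
    """Step 1: recognize a generic URI. Step 2: check for a dereference rule."""
    scheme = split_uri(text)["scheme"]
    if scheme is None or not SCHEME_RE.match(scheme):
        return "not-a-uri"
    if scheme.lower() in DEREFERENCEABLE:
        return "dereferenceable"
    # Recognized as a global identifier even though no handler is known.
    return "identifier-only"
```

With this sketch, classify("urn:example:animal:ferret:nose") is recognized
as an identifier even though no handler exists for "urn", while a string
with no scheme at all is rejected. Canonicalization for comparison purposes
is deliberately left out.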

On the secondary question of any possible 'gold rush', I'm not that
convinced that there would necessarily be one, especially given the grief of
trying to steer a new URI scheme registration through the IETF process.

Cheers,

Tony




Received on Monday, 6 September 2004 10:07:22 UTC