Re: [URN] URI documents -- "# fragment"

Roy T. Fielding (fielding@kiwi.ics.uci.edu)
Thu, 22 Jan 1998 21:33:57 -0800


To: uri@Bunyip.Com, urn-ietf@Bunyip.Com
In-reply-to: Your message of "Thu, 08 Jan 1998 16:57:39 EST."
             <01bd1c80$6ff48790$29019784@ssun.CNRI.Reston.Va.US> 
Message-ID:  <9801222147.aa14624@paris.ics.uci.edu>

[reposted, since the uri lists were down last week]

Sam was saying:
>The point I wanted to show you is that "# fragment" doesn't work by itself.
>It's actually worked as a relative URL. And the generic URI parser may never
>get the "# fragment" alone. (ie, in your example, the <a href="#foo"> ... is
>a relative URL, not just a "# fragment".)

I seem to be having a hard time getting this point across.  The generic
URI parser *is* the thing that takes a string and does the handling
and interpretation needed to

   1) determine whether it is absolute or relative
   2) convert it to absolute form if needed
   3) give the resulting URI to the scheme-specific handler

There is no purpose for a generic URI syntax beyond that.  Likewise,
it is only that syntax which is needed by other protocols as a
Draft Standard reference.
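
As a rough illustration of steps 1 and 2 (not taken from any of the
drafts), here is a minimal Python sketch.  The names is_absolute and
to_absolute and the example.com base are made up for this message, and
urljoin from the standard library stands in for the relative-resolution
rules; it only resolves against base schemes that Python treats as
hierarchical, so take it as an approximation.  Step 3, the hand-off to
the scheme-specific handler, is left out.

```python
import re
from urllib.parse import urljoin

# A scheme is a letter followed by letters, digits, "+", "-" or ".",
# terminated by ":".  Anything not starting that way is a relative reference.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def is_absolute(ref):
    """Step 1: decide whether the string is absolute or relative."""
    return _SCHEME.match(ref) is not None


def to_absolute(ref, base):
    """Step 2: convert to absolute form if needed (hypothetical helper)."""
    if is_absolute(ref):
        return ref
    # urljoin is used here as a stand-in for the generic resolution rules.
    return urljoin(base, ref)


base = "http://example.com/docs/page.html"  # assumed base, for illustration
for ref in ("#foo", "other.html", "mailto:someone@example.com"):
    print(ref, "->", to_absolute(ref, base))
```

Note that "#foo" goes through exactly the same path as "other.html":
it is a relative reference, and the generic parser is what turns it
into an absolute one.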

>On the other hand, I don't see any usage of "# fragment" for "mailto" or
>"ldap" URLs as defined in the HTML document. So, if "# fragment" is not
>needed for all of the URI schemes, I wonder if we could drop it from the
>overall URI definition?

No, because you cannot do so and still produce an interoperable parser.

>Lastly, I'm wondering if the "# fragment" requirement is inherited from the
>earlier URL standards when there're few URL schemes defined. If we drop the
>requirement of "# fragment" from URI as a whole, it can still be defined by
>those URL schemes that need it, in their respective RFCs. And the only thing
>I see broken is that the generic URI parser can not catch the "#fragment",
>and decide what to do, which is not happening and I think really doesn't
>have to.

The "#fragment" is removed from the URI whether the URI is defined
to use it or not.  I cannot show you this using Netscape Navigator
because its parser is the only one I know of that is so hopelessly
broken that it uses a fixed set of scheme names.  Other applications
allow the user to pass unknown URI schemes to a proxy for resolution,
and on those systems you will find that the "#fragment" is stripped
before being sent to the proxy.  It is therefore IMPOSSIBLE for "#"
to be used as anything else in the URI syntax and still retain
interoperability between new and deployed systems.
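
The behaviour described above can be sketched in a few lines of Python.
This is only an illustration under my own assumptions: split_fragment,
dispatch, the "x-new" scheme and the proxy callable are hypothetical
names, not anything from a deployed client.  The point it shows is that
the split at "#" happens before anyone looks at the scheme, so an
unknown scheme handed to a proxy never carries its fragment along.

```python
def split_fragment(uri):
    """Split at the first "#", regardless of which scheme the URI uses."""
    main, _, fragment = uri.partition("#")
    return main, fragment


def dispatch(uri, handlers, proxy):
    """Strip the fragment, then pass the rest to a handler or the proxy.

    handlers maps lower-case scheme names to callables; proxy is the
    fallback callable for schemes the client does not know (assumed).
    """
    main, fragment = split_fragment(uri)
    scheme = main.split(":", 1)[0].lower()
    handler = handlers.get(scheme, proxy)
    # The handler or proxy only ever sees the part before "#".
    return handler(main), fragment


def example_proxy(uri):
    # Stand-in for sending the request to a proxy; just echoes what it got.
    return "proxy received " + uri


print(dispatch("x-new:some/thing#part", {}, example_proxy))
```

A scheme that tried to give "#" some other meaning would have that part
of its URI cut off by any client built this way.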

There is very little room for discussion of what is being defined by
the specification and in the syntax itself, since that is governed by
the most interoperable subset of what is implemented.  The only question
still to be determined is whether we call these things URI or URL,
and thus whether or not a URN should be referred to as a URI or a URL
when it is used by HTTP, HTML, XML, etc.

.....Roy