Re: QName URI Scheme Re-Visited, Revised, and Revealing

[Stephen Cranefield]

> Sean Palmer wrote:
> > I think you'll find that a) FragID syntax is independent of
> > URI scheme [...]
>
> I'm not convinced this is true.  The definition of a fragment
> identifier says (at http://www.ietf.org/rfc/rfc2396.txt):
>
>   When a URI reference is used to perform a retrieval action on the
>   identified resource, the optional fragment identifier, separated from
>   the URI by a crosshatch ("#") character, consists of additional
>   reference information to be interpreted by the user agent after the
>   retrieval action has been successfully completed.  As such, it is not
>   part of a URI, but is often used in conjunction with a URI.
>
> This specifically defines a fragment URI as information related
> to a retrieval action.  Therefore one could argue that it doesn't make
> sense to have a fragment identifier if the URI scheme is intended to
> denote names with no implied retrieval mechanism.
>

There's more of interest, too.  First, RFC 2396 excludes "#" from URIs:

   "The character "#" is excluded
   because it is used to delimit a URI from a fragment identifier in URI
   references (Section 4). The percent character "%" is excluded because
   it is used for the encoding of escaped characters."

But there is also something called a "URI reference", which can contain "#"
and also relative path steps like ".." and ".".  Any relative steps are
supposed to be resolved before the URI reference is used.  From the RFC:

   The term "URI-reference" is used here to denote the common usage of a
   resource identifier.  A URI reference may be absolute or relative,
   and may have additional information attached in the form of a
   fragment identifier.  However, "the URI" that results from such a
   reference includes only the absolute URI after the fragment
   identifier (if any) is removed and after any relative URI is resolved
   to its absolute form.
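
To make the distinction concrete, here is a minimal Python sketch of
getting from a URI reference to "the URI" in the sense just quoted.  The
helper name uri_of and the example.org addresses are illustrative
assumptions only, and Python's urllib.parse follows later revisions of the
URI syntax rules rather than RFC 2396 itself, although simple cases like
these should come out the same:

```python
from urllib.parse import urldefrag, urljoin


def uri_of(reference, base):
    """Split a URI reference into (absolute URI, fragment identifier).

    Hypothetical helper for illustration: resolve the reference against
    a base, then remove the fragment, as the quoted passage describes.
    """
    absolute = urljoin(base, reference)   # resolves "." and ".." steps
    uri, fragment = urldefrag(absolute)   # strips "#..." from the URI
    return uri, fragment


base = "http://example.org/a/b/c"         # assumed base, not from the RFC

print(uri_of("../d#sec2", base))  # ('http://example.org/a/d', 'sec2')
print(uri_of("#top", base))       # ('http://example.org/a/b/c', 'top')
```

In both cases the fragment comes back separately: it travels with the
reference, but it is not part of the URI that results from it.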

Finally, the namespace Rec says:

    "XML namespaces provide a simple method for qualifying element
     and attribute names used in Extensible Markup Language documents
     by associating them with namespaces identified by URI references."
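
As a rough illustration of why the wording "URI references" matters, take
the familiar RDF namespace name, which ends in "#".  The sketch below is
only an assumption-laden example: the function names expand and split_name
are made up here, and appending the local name to the namespace name is
the RDF convention, not something the namespace Rec itself prescribes.

```python
from urllib.parse import urldefrag

# The RDF syntax namespace name; note the trailing "#".
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand(namespace_name, local_name):
    """Hypothetical helper: build a full name by simple concatenation."""
    return namespace_name + local_name


def split_name(full_name):
    """Hypothetical helper: separate the URI part from the fragment."""
    uri, fragment = urldefrag(full_name)
    return uri, fragment


full = expand(RDF_NS, "type")
print(full)              # http://www.w3.org/1999/02/22-rdf-syntax-ns#type
print(split_name(full))  # ('http://www.w3.org/1999/02/22-rdf-syntax-ns', 'type')
```

The expanded name only makes sense as a URI reference: strip the fragment
and every name in the namespace collapses to the same URI.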

Now if the RDF Rec used URIs (as almost everyone writes), you couldn't
include "#" in the namespace.  But the Rec actually says "URI reference",
and we probably should too when we want to be precise.

With regard to Stephen's remarks about non-retrieval, RFC 2396 does not
actually define "retrieval", so far as I can see.  But it does define URI:

   "A Uniform Resource Identifier (URI) is a compact string of characters
   for identifying an abstract or physical resource."

It seems therefore that the term "retrieval" must also cover an abstract
retrieval - for an abstract resource - and so the use of a "#" would be
perfectly acceptable according to this reading of the RFC.  After all, if
you get the intended information, isn't that a "retrieval"?  You still got
the data you wanted even if you didn't go out on the network.

Cheers,

Tom P

Received on Friday, 24 August 2001 09:03:41 UTC