URI etymology from Pierre-Antoine CHAMPIN on 2001-06-11 (www-rdf-interest@w3.org from June 2001)

From: Pierre-Antoine CHAMPIN <champin@bat710.univ-lyon1.fr>
Date: 11 Jun 2001 16:05:48 +0200
To: Jonathan Borden <jborden@mediaone.net>
Cc: www-rdf-interest@w3.org
Message-Id: <992268351.513.4.camel@lisiperso3>
On 11 Jun 2001 06:31:52 -0400, Jonathan Borden wrote:
> Certainly an "http:" URI can identify a namespace, most namespace names in
> common usage are http: prefixed.
> (...)
> XML and XML applications see no "semantic ambiguity". The only ambiguity I
> see is that which is attached by humans trying to read the tea leaves.

This is only true because such URIs are used in a very precise context:
xmlns* attributes.
Any string used in those attributes is supposed to identify nothing but
a namespace, so there is indeed no ambiguity!

The problem is that SW applications may talk about NS in other,
unpredicted contexts. I argue that outside a precise context, http: URIs
identifying NS are indeed ambiguous (see below).

> This distinction between URNs and URLs is
> somewhat historical. A URI is more that the mere encapsulation of URLs and
> URNs into a single specification: a URI prefixed with the http: scheme may
> be used as a name. Similarly URIs prefixed with the "urn:" scheme may be
> resolved.

Funny you mention that: I didn't use the distinction between URLs and
URNs!
I only talked about "http: URIs", which may or may not be URLs; I don't
care.

What I think is that URIs are not opaque names,
although http://www.w3.org/DesignIssues/Axioms.html reads

  Axiom: Opacity of URIs
  The only thing you can use an identifier for is to refer to an object.
  When you are not dereferencing you should not look at the contents of
  the URI string to gain other information.

How would I know which URIs I may try to dereference, anyway, without
looking "inside" them?
URIs have a rather precise syntax described in RFC2396. That syntax may
be used not only to know how to dereference a URI, but also *if* I may
dereference it, and what kind of resource it identifies.
This is what I call "URI etymology".
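
To make looking "inside" a URI concrete, here is a minimal sketch in Python using the standard urllib.parse module. The function name split_uri is hypothetical, and urlsplit is not a strict RFC2396 parser, so take this as an approximation of the generic syntax rather than a reference implementation:

```python
from urllib.parse import urlsplit

def split_uri(uri):
    """Return the scheme, authority and path found in a URI string."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path

# The scheme is the first thing an agent can read off the string.
print(split_uri("http://www.w3.org/DesignIssues/Axioms.html"))
```

Everything the agent learns here comes from the string itself, which is exactly what the Opacity axiom asks it not to rely on.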

Here are 3 examples (where I put myself in the place of a SW agent):

- foo://bar/1/2/3
I can locate the scheme of this URI, which is "foo". I do not know that
scheme, so I cannot infer anything from the structure (even if I know
that it respects the "hier_part" production rule of RFC2396).
Of course, additional metadata about that resource can give me more
information.

- urn:ietf:rfc:2396
I know the scheme "urn": it is described in RFC2141, and is supposed to
be a unique persistent ID for a thing. I happen to know the subscheme
"ietf", so I know the identified thing is a document describing an
internet standard (a restriction on the kind of resource identified).
Maybe I even know a resolution service for that subscheme
(http://ietf.org/urn? for example).

- http://www.w3.org/1999/xlink/
I know the scheme "http": it means that
  o the URI describes a generic document
  o I may perform an HTTP GET request to the specified host, with the
specified path
  o if I get any data with that request, the data is an instance of the
generic document
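
The three cases above can be sketched as a small Python function. This is only an illustration under stated assumptions: the name etymology, the returned keys "kind" and "dereference", and the rules themselves are hypothetical, and no URN resolution service is assumed to exist.

```python
from urllib.parse import urlsplit

def etymology(uri):
    """Guess what a URI identifies, and how it may be dereferenced,
    from its syntax alone (illustrative rules only)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()

    if scheme == "http":
        # Known scheme: a generic document, retrievable with HTTP GET.
        return {
            "kind": "generic document",
            "dereference": f"HTTP GET {parts.path or '/'} on host {parts.netloc}",
        }

    if scheme == "urn":
        # For a URN, the "subscheme" is the text before the next colon.
        subscheme = parts.path.partition(":")[0].lower()
        if subscheme == "ietf":
            kind = "document describing an internet standard"
        else:
            kind = "persistent thing of unknown kind"
        # Dereferencing would need a resolution service for the
        # subscheme, which this sketch does not assume.
        return {"kind": kind, "dereference": None}

    # Unknown scheme: the structure tells the agent nothing.
    return {"kind": None, "dereference": None}

for uri in ("foo://bar/1/2/3", "urn:ietf:rfc:2396",
            "http://www.w3.org/1999/xlink/"):
    print(uri, "->", etymology(uri))
```

Note that the http: branch has no way to answer "namespace": by these rules, an http: URI always reads as a generic document.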


The last example demonstrates the ambiguity I see in identifying
namespaces with http: URIs: a NS is *not* a generic document, and I can
get no instance of it with HTTP. Hence the URI identifies 2 things,
which is wrong.

However, I cannot see what is wrong with using URI etymology to get
metadata about resources (first of all, "what kind of resource is
it?").
If I'm wrong, what are the advantages of the Opacity axiom?
If I'm right, then not just any URI can be used to identify any
resource... including namespaces.

  Pierre-Antoine
Received on Monday, 11 June 2001 10:04:38 UTC