RE: What is the URI of Truth? from Patrick.Stickler@nokia.com on 2001-06-08 (www-rdf-interest@w3.org from June 2001)

From: <Patrick.Stickler@nokia.com>
Date: Fri, 8 Jun 2001 14:01:48 +0300
To: champin@bat710.univ-lyon1.fr, www-rdf-interest@w3.org
Message-ID: <6D1A8E7871B9D211B3B00008C7490AA50795875D@treis03nok>
> There are 3 common propositions for identifying Truth
>  (a) http://mydomain.com/path/Truth
>  (b) http://mydomain.com/path#Truth
>
> ...
> According to [1], "An http: URI (without fragment identifier)
> necessarily identifies a generic document".
> Hence this makes proposition (a) incompatible with the semantics of
> http: URIs.
>
> ...
> 
> A solution to that would be to improve the syntax of URI reference,
> so as to be able to constrain the resource. E.g. (just an instant
> suggestion)
>  
>  (b') http://mydomain.com/path#[type=text/rdf]Truth
> 
> Note that this works well with RDF fragments *only*.
> An HTML fragment does not, AFAIK, identify an abstract thing, but a
> point in an HTML document.
> An XPATH fragment identifies, AFAIK, a well-formed piece of an XML
> document.

The main problem with propositions (a) and (b) is not whether
some syntactic tricks can help improve upon the problem but
that there is IMO a clear and essential property of URLs,
that they define a location (or access mechanism) by which
to obtain content, which makes them unsuitable, for philosophical
(and practical) reasons as universal identifiers/names.

One name can be mapped to many locations, each location possibly
associated with a particular significance. To blur that important
distinction is to invite confusion in the minds of less clever
users who do not grok the SW as well as we might -- as exemplified
by the commonly uttered complaint "Why can't I get anything from
this &$*@*#$ namespace URL?!"

A namespace is an abstract resource. Period. It might have definition
in various concrete resources, accessible via URLs, but those are
not the namespace. A namespace should be identified *only* with
a URN. IMMHO. To use a URL as a namespace URI is to abuse not only
the intended semantics of the URL URI scheme but to abuse the 
users (both content producers and consumers) who use the SW.

>  (c) other_scheme:path/Truth
> ...

I.e., a URN. Exactly. Abstract resources need names, not locations. A
URN is a name. That it *might* be mapped to some location, or be
resolved by some agent to a MIME stream, does *not* make it a URL.
In that regard, the intersection shown between URL and URN in the
illustration in RFC 2396 depending on whether a URN "acts" as
a URL is incorrect and misleading. Again IMMHO ;-)
 
> We still have proposition (c) which still seems the cleanest one :
> why bother with http: URIs and fragments when RFC 2396 provides us
> with an extensible mechanism for URIs? Let's design a specific URI
> scheme.

Having a single URN scheme for defining names of abstract resources
would be a definite advantage, and I hope to see such a scheme
come into being and gain popular use in the SW, but that still does
not solve the fundamental problem at the root of the URI vs URI ref
problem -- and that is that there is no formally defined mapping
from namespace URI + QName to full URI of the named resource within
the scope of that namespace. That mapping, regardless of what kind
of URI is used for the namespace, URL or URN, is dependent on
the syntax of the URI scheme and/or the fragment syntax of a
(possibly arbitrary) MIME content type.

So long as namespace names can be any arbitrary URI, this mapping
problem will be exceptionally complex to solve.
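
To make the mapping problem concrete, here is a minimal sketch of the
plain-concatenation convention that RDF/XML applies to QNames. The
function name expand() and the urn:example: namespace are hypothetical
and used only for illustration; the http: URIs are the ones from the
quoted message.

```python
def expand(namespace_uri: str, local_name: str) -> str:
    """Plain concatenation of namespace URI and local name (illustrative only)."""
    return namespace_uri + local_name


# The shape of the result depends entirely on how the namespace URI ends,
# which in turn depends on the URI scheme and on fragment conventions.
print(expand("http://mydomain.com/path#", "Truth"))  # fragment-style
print(expand("http://mydomain.com/path/", "Truth"))  # path-style
print(expand("urn:example:path:", "Truth"))          # hypothetical URN namespace

# Two different (namespace, local name) pairs collapse to one URI, so the
# original pair cannot be recovered from the full URI without extra rules.
a = expand("http://mydomain.com/path#", "Truth")
b = expand("http://mydomain.com/path#Tr", "uth")
print(a == b)
```

Nothing in the concatenated string records where the namespace ends and
the local name begins, which is why a defined mapping per URI scheme (or
a single scheme) is needed.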

One possible solution is to (a) define a single URN scheme that
has a well defined syntax suitable for all anticipated forms
of abstract taxonomies (i.e. "Name Spaces"), and (b) require that 
namespace URIs be instances of that specific URN scheme and only 
that URN scheme. I.e. a "namespace" becomes a URI scheme.

Then, regardless of context, there would be a known, explicit,
and reliable mapping of QName to full URI. *And* it would be
clear that the abstract resource was indicated and not either
its definition in some schema or its occurrence in some RDF
instance, etc. which would be indicated with URI refs.

Unfortunately, I don't see the imposition of a single URN scheme
for namespace-related vocabularies and taxonomies as very likely
to succeed, not unless there is very broad adoption and proactive
effort on the part of the W3C, IETF, and several other key
organizations (but hey, it could happen...).

The only other alternative is to define a mechanism (a schema)
by which the mapping from QName to full URI for each URI scheme
is defined, to be used by all parsers which must perform that
mapping. Cumbersome, but nevertheless *mandatory* for consistent
identity of resources between disparate systems that use
namespace-based XML serialization for interchange.
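
As a rough sketch of what such a per-scheme mapping might look like, the
following uses a small table keyed on URI scheme. Everything in it is an
assumption for illustration: the rule functions, the table SCHEME_RULES,
the choice of '#' for http: and ':' for urn:, and the urn:example:
namespace. It is not a proposal for the actual rules.

```python
from urllib.parse import urlsplit


def map_http(namespace_uri: str, local_name: str) -> str:
    # Assumed rule: the local name becomes the fragment of the namespace URI.
    return namespace_uri.rstrip("#") + "#" + local_name


def map_urn(namespace_uri: str, local_name: str) -> str:
    # Assumed rule: the local name is appended as one more ':' component.
    return namespace_uri.rstrip(":") + ":" + local_name


# Hypothetical registry: one explicit mapping rule per URI scheme.
SCHEME_RULES = {"http": map_http, "urn": map_urn}


def qname_to_uri(namespace_uri: str, local_name: str) -> str:
    """Map (namespace URI, local name) to a full URI via the scheme's rule."""
    scheme = urlsplit(namespace_uri).scheme
    if scheme not in SCHEME_RULES:
        raise ValueError(f"no QName mapping defined for scheme {scheme!r}")
    return SCHEME_RULES[scheme](namespace_uri, local_name)


print(qname_to_uri("http://mydomain.com/path", "Truth"))
print(qname_to_uri("urn:example:path", "Truth"))
```

A parser that meets a namespace URI in a scheme with no registered rule
has to refuse the mapping rather than guess, which is the cumbersome
part.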

> Many people in the RDF community are not fond of proposition (c).
> Their point is : it is very convenient to put some data on the HTTP
> space, and people willing to publish a new URI will rather use http:
> URIs -- hence proposition (b).

I think the reason that people use HTTP URIs as names is because
they've become so saturated with HTML and HTTP URLs that they
equate URI = http://...  Even though folks see from time to time
a mailto: or ftp: URL, they still think that URI = URL and 
URL = http://...  -- and then once a few folks have used URLs as
universal names of abstract resources, others do so simply by (ignorantly)
following their example. After all, if "..." at the W3C is doing
it, it must be right, eh?

People in general aren't used to thinking in terms of URNs because
there are so few of them in relation to HTTP URLs.

It's a matter of education. Far more folks in the world crap in a hole
than in a toilet, so why do we use a toilet? Bad habits, however
pervasive, are still bad habits! It's up to those who know better
to set the example, not adopt the same bad habits and perpetuate
them. No?

> That does not bother me too much... as long as
> - the fragment bug is solved, and they do use an *RDF fragment* to
>   do so
> - there is a way to distinguish between the concept Truth in the RDF
>   version *and* the paragraph about Truth in the HTML version, using
>   the constraining mechanism sketched above.

The QName to full URI problem will be solved when there is a 
consistent and explicit mapping from QName to full URI for each URI
scheme used as a namespace, and/or a single URI scheme used
for namespace names.

The fragment problem will be solved when URI refs are not used
to identify abstract resources and when folks stop abusing
URL semantics by forcing URLs and URI refs to serve as universal
names of abstract resources.

And both of the above problems should bother everyone a lot! It
surely bothers me (in case that wasn't clear from my posts ;-)
 
> My questions are:
> - does the constraining mechanism of (b') look good?

It still follows the "bad habit" of using a URI ref to identify
an abstract resource, where that fragment is dependent on a MIME
content type (specified or not per your constraining mechanism)
when neither the namespace nor the abstract concept has anything to
do with any particular MIME content type. It's using a hammer
to drive in a screw. Changing the shape of the hammer doesn't
make the practice any more "correct" (though a heavier hammer
might give that illusion by driving the screw in faster ;-)

Cheers,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Friday, 8 June 2001 07:02:03 UTC