URIs don't force behavior [was: Why are relative NS identifiers used?] from Tim Berners-Lee on 2000-05-20 (xml-uri@w3.org from May 2000)

From: Tim Berners-Lee <timbl@w3.org>
Date: Sat, 20 May 2000 03:45:01 -0400
To: <keshlam@us.ibm.com>, <xml-uri@w3.org>
Message-ID: <000001bfc258$88455920$58a55c8b@ridge.w3.org>
-----Original Message-----
From: keshlam@us.ibm.com <keshlam@us.ibm.com>
To: xml-uri@w3.org <xml-uri@w3.org>
Date: Friday, May 19, 2000 10:50 AM
Subject: Re: Why are relative NS identifiers used?


>>For example, it might be in order to look up the namespace
>>(by its URI) in an RDF database and determine its "xmlSchema" property.
>>That can't be done unless the namespace is a resource named by a true URI,
>>because RDF is about properties of resources.
>
>I'm sorry about sounding like a scratched record (now there's a metaphor
>that's dying out!) ... but please consider my xmlns-binding: proposal. It
>is possible to define a way for a namespace to be associated with a true
>URI without the namespace name having to _be_ that URI.


It seems that people will go to any lengths to avoid namespace names
actually _being_ a URI.  But being a URI is where the power and reuse is.


And it seems, if I understand it correctly (or at least vaguely), that in some
of the Microsoft uses of namespaces a database
is exposed to the web as a set of virtual pages including a virtual schema.
All the pages are produced in a mutually referential set,
with relative URIs between them. This includes the namespace URI
which points to the schema.  Simple, consistent,
and robust in that the generating code doesn't have to know what its own
host name is. This is a case of relative URIs being used in a natural way.

It fortunately doesn't break because in validation no one is comparing two
relative URIs with different base addresses.  And of course it would work fine
if the validator is fixed to compare absolute URIs.
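
A minimal sketch of that fix in Python (standard library only): resolve each
relative namespace reference against the base address of its document, then
compare the absolute results.  The host names, paths and the relative
reference "schema.xml" are made up for illustration.

```python
from urllib.parse import urljoin

def absolute_ns(base_uri, ns_ref):
    """Resolve a possibly relative namespace reference against the document base."""
    return urljoin(base_uri, ns_ref)

# Hypothetical documents; each one declares its namespace as "schema.xml".
doc_a = "http://a.example.org/db/page1.xml"
doc_a2 = "http://a.example.org/db/page2.xml"
doc_b = "http://b.example.org/db/page1.xml"

ns_a = absolute_ns(doc_a, "schema.xml")
ns_a2 = absolute_ns(doc_a2, "schema.xml")
ns_b = absolute_ns(doc_b, "schema.xml")

print(ns_a == ns_a2)  # same base directory, so the same namespace
print(ns_a == ns_b)   # different hosts, so different namespaces
```

Comparing the raw string "schema.xml" would have treated all three as equal,
which is exactly the case that breaks when base addresses differ.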

>I really think that's the point we're getting hung up on. Some folks are
>insisting that the namespace name itself must serve this role... and by
>assuming that specific solution are locking themselves into a set of
>semantics that conflicts with the needs of the Namespace spec.

I would like to explain the philosophy of the URI a little as I see it.
I am fed up with people suggesting that I insist that everyone download
schemata.

The namespace name IDENTIFIES the namespace.  The namespace is
an abstract resource.  The declaration which relates the prefix to the
namespace by quoting an identifier is a piece of information.

It is not an instruction.

If we could for a moment get out of the mindset of requiring that a
processor should do this or that, and talk about information, this would
be easier.

This is simply a fact of identification: a simple link from the digital
representation of one resource (the document) to another, abstract
resource (the namespace).

A processor can use that information as it likes.  There are certain things
it can deduce and certain things it can't.  You can deduce that it is the
same as another resource with the same identifier.

So here are a few ideas for things your application could do.

1. Just compare it (the absolute URI) with other URIs for the purposes of
building a DOM tree

2. Compare it with a hardwired namespace name you know and stop if it is
anything else.

3. Decide to validate the document, which needs a schema.  You access the
document by URI.  You have set up network access to run through a
local proxy on your machine, which has a cache containing things you have
explicitly asked it to keep as well as things the proxy guesses you might
need.  (Such products exist - you can do this now.  You do not need a
hand-managed catalog.)

4. Decide that you want to understand the meaning of the document as
intended by its publisher.
Search the web for a Java class which will support this namespace in your
current execution environment.
Go to the server and ask for a signed copy and a trust chain which will
convince you of its safety.
Load it and register it with the namespace handling dispatcher in your XML
software.
(I haven't heard of this existing)

5. Send an email around all your friends asking whether they know about the
namespace, and
whether it is a useful one, and whether they know any cool apps to handle it

6. And so on.
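
Options 1 and 2 in the list above need no network access at all: the
namespace name is only compared as an identifier.  A minimal sketch in Python,
assuming a hypothetical hardwired namespace name (KNOWN_NS) and made-up sample
documents:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name that this application is hardwired to accept.
KNOWN_NS = "http://example.org/ns/invoice"

def root_namespace(xml_text):
    """Return the namespace name of the root element, or None if it has none."""
    tag = ET.fromstring(xml_text).tag  # ElementTree spells qualified tags "{ns}local"
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None

def is_supported(xml_text):
    """Option 2: compare with the hardwired name and do nothing else."""
    return root_namespace(xml_text) == KNOWN_NS

good = '<inv:invoice xmlns:inv="http://example.org/ns/invoice"/>'
other = '<invoice xmlns="http://example.org/ns/other"/>'

print(is_supported(good))
print(is_supported(other))
```

Nothing here dereferences the URI.  The parser and the application treat it
purely as a name.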

The point is that by using a URI you are not insisting on any behavior.
Are you insisting on a fixed semantics, as Joe Keshlam worries?

Hardly.  The variety of URI schemes available covers a great many choices.

- uuid:   Globally unique, not dereferencable. Abstract

- mid: Globally unique, hierarchical delegation, traditionally indexed by
those who have seen the document. Corresponds to single message (i.e. no
versions, content negotiation, etc, time variance)

- md5: Globally unique, corresponds to precisely one  (ignoring unimaginable
coincidences) document. Verifiable, generatable, not dereferencable.

- http: Globally unique, hierarchical delegation, persistence and reuse
policy defined by publisher.  A generic abstract resource may be represented
by many alternative representations depending on common languages between
publisher and client.

and so on.
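
To make the "verifiable, generatable" property of the md5: case concrete,
here is a small Python sketch.  The exact spelling "md5:" followed by a hex
digest is an assumption made for illustration, as are the helper names
scheme_of, md5_name and verify.

```python
import hashlib
import string

_SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")

def scheme_of(uri):
    """Return the lowercased scheme, or None if the text has no scheme (relative)."""
    head, sep, _ = uri.partition(":")
    if sep and head and head[0] in string.ascii_letters and set(head) <= _SCHEME_CHARS:
        return head.lower()
    return None

def md5_name(content):
    """Generate an identifier from the bytes of the document (assumed syntax)."""
    return "md5:" + hashlib.md5(content).hexdigest()

def verify(uri, content):
    """Check that the content in hand is the document the identifier names."""
    return scheme_of(uri) == "md5" and uri.lower() == md5_name(content)

doc = b"<schema/>"
name = md5_name(doc)

print(scheme_of(name))
print(verify(name, doc))
print(verify(name, doc + b" "))  # any change to the bytes fails the check
```

Anyone holding the bytes can generate or check the name, but the name alone
gives no way to fetch the document, which is the trade-off the list describes.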

The HTTP space is really quite flexible, as it can be used for example
simply as a name server.   The persistence properties can be defined by the
publisher.  HTTP for example is used for PURLs (Persistent URLs) at
purl.org run by OCLC.

The design is that the URI stands for an abstract resource.  In the case of
HTTP, the client can ask for a representation of the resource.  At the
moment, if the resource is a namespace, we have xHTML and xml-schema as two
useful languages for providing information about a resource.  There has been
some complaint recently that the W3C server used to give only a human-readable
version and now gives only a schema.  In the future, more languages
will exist, and so it may be that those who care to (you are not forced!)
go to the W3C server and ask about a namespace will get a document
containing syntactic and semantic information in many languages.
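
As a rough sketch of how a client might ask for one representation rather
than another, the Python below builds (but does not send) an HTTP request
whose Accept header lists the languages the client understands.  The
namespace URI and the media types are placeholders; which media types a given
server would actually offer for a namespace is an assumption.

```python
from urllib.request import Request

def representation_request(ns_uri, media_types):
    """Build a GET request stating which representations the client prefers."""
    return Request(ns_uri, headers={"Accept": ", ".join(media_types)}, method="GET")

# Hypothetical namespace URI; prefer XML, fall back to HTML.
req = representation_request(
    "http://example.org/ns/thing",
    ["application/xml", "text/html;q=0.8"],
)

print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Nothing goes over the network until the request is passed to
urllib.request.urlopen, and a client that only wants to compare names never
needs to do so.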

Because the W3C server is run by the publisher - owner - of the resource,
the information which is retrieved may be said to be definitive, in the
sense that it comes from the horse's mouth.  That does not mean that other
sources won't have lots more interesting and useful information about the
same abstract resource.

Once a concept has a URI, there is a huge open-ended set of things which can
be done with it.  The URI ties them all together.  To use different strings
for comparing it against other namespaces, loading an xml-schema, loading an
rdf-schema, loading a human readable specification, pointing to it from
annotations, or whatever, breaks this up.  It is an obstacle to the reuse of
the concept.

To give something a URI is not to force anything.  It is the minimum
requirement for that thing to be part of the web.  Much else follows.
Received on Saturday, 20 May 2000 08:38:12 UTC