Re: from Ray Whitmer on 2000-06-02 (xml-uri@w3.org from June 2000)

From: Ray Whitmer <ray@xmission.com>
Date: Fri, 2 Jun 2000 08:49:48 -0600 (MDT)
To: Tim Berners-Lee <timbl@w3.org>
cc: "\"Clark C. Evans\"" <cce@clarkevans.com>, xml-uri@w3.org
Message-ID: <Pine.GSO.4.10.10006020749100.12633-100000@xmission.xmission.com>
On Thu, 1 Jun 2000, Tim Berners-Lee wrote:

> The people who worry about the stability of 
> HTTP names of course should have the same
> worry in principle about java class names.
> ("sun.com could forget to pay their $100 to NSI
> and lose the name and then someone else could
> get it and issue classes in com.sun").
> While I think persistence of domain names is 
> an issue, we don't need to trip up over it now in either case.

I think any credible supplier of software that depends upon
such identities has to officially track every publicly released
identifier, since the identity of such a name can never be
reused.  Owners of a second-hand domain who are not willing
or able to avoid all such prior names, whether by referring
to a registry or by limiting themselves to a scheme known
never to produce collisions, should be considered unreliable.
Acquiring a polluted namespace can require difficult cleanup.

Additional difficulty arises only if you expect to be able
to actually retrieve content at that URI.  That is never a
problem with Java package naming, where the name is clearly
only an identifier and not a location.

> >The java package naming scheme also has the 
> >injective property... if someone changes the
> >package name, even slightly, people expect that,
> >in some manner, the code identified is different.
> 
> 
> That works because the java development world is
> small and tree-structured.
> 
> In fact, what happens when I make a play version of
> an existing java source tree?  I give it name 
> in my own space.  Most of the classes are in fact
> exactly the same in every way - I haven't modified them
> yet. But I will. Or I might. So the copies have difefrent names.
 
The devil here is in the details.  In the Java case, that
works quite well, because I only base my names upon identity,
not upon the location of the files.  Also, it is quite
likely that I will change some, but not all, of the names
of a set.  (To make a play version, because the name
is an identifier rather than a location, it is often not 
necessary to change the name at all.)

The problem arises in the XML namespaces case if I try to
use the location of the document as part of the name resolution.
There is no single unique location identifier.  The document 
exists simultaneously in a variety of mappings that may all give it
different functionally-equivalent names.  Different parts of my
system need to know when identity matches, yet may easily generate 
different URIs to access the same path, document, or entities 
within the document because the emphasis is on locating the 
document, not identifying it.  If the base URI of a document were 
about identity, it would be part of the document, not generated in 
some unspecified, nonstandard way by surrounding software.
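
A minimal Python sketch of this mismatch, not taken from the original
discussion: the two base URIs and the relative reference "schema/foo"
are invented for illustration. It resolves one relative namespace
reference against two locations through which the same document might
be reached, using the standard urllib.parse.urljoin.

```python
from urllib.parse import urljoin

# Hypothetical: the same document reached through two different mappings.
BASES = [
    "http://example.com/docs/order.xml",
    "file:///srv/www/docs/order.xml",
]

# Hypothetical relative namespace reference, as in xmlns:foo="schema/foo".
RELATIVE_NS = "schema/foo"


def resolved_names(bases, relative_ns):
    """Resolve one relative reference against each base URI."""
    return [urljoin(base, relative_ns) for base in bases]


names = resolved_names(BASES, RELATIVE_NS)
for name in names:
    print(name)

# A literal string comparison treats these as two different namespaces,
# although the document text that produced them is identical.
print(names[0] == names[1])
```

Which of the resulting names a given component sees depends entirely
on how that component happened to locate the document.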

Without resorting to relative paths, XML already possesses far
more elegant ways to remap parts.  For example:

<!ENTITY FOO 'urn://example.com/path/foo/'>
[...]
<element xmlns:foobar="&FOO;bar">

If I want to play with FOO-related classes, I change FOO.

This possesses none of the flaws of relative URI references in
namespaces.  It does not force the conflict and confusion between
content location and namespace identity.  It permits as many
separately-relative domains as you need, that can also be easily
structured relative to each other, so I can choose to localize 
the parts I want to play with while holding constant the ones I 
want the same.
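
As a rough check of the entity approach, the following Python sketch
parses the fragment above with the standard xml.etree.ElementTree
module, whose underlying parser expands internally declared entities
in attribute values. The DOCTYPE wrapper, the <foobar:item/> child,
and the second "play" URN are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# The entity and xmlns attribute follow the fragment above; the DOCTYPE
# wrapper and the <foobar:item/> child are hypothetical additions.
TEMPLATE = """<!DOCTYPE element [
<!ENTITY FOO '{foo}'>
]>
<element xmlns:foobar="&FOO;bar"><foobar:item/></element>"""


def item_namespace(foo_value):
    """Parse the document with FOO set to foo_value and return the
    namespace name that the foobar prefix resolves to."""
    root = ET.fromstring(TEMPLATE.format(foo=foo_value))
    tag = root[0].tag  # ElementTree spells qualified names "{namespace}local"
    return tag[1:].split("}", 1)[0]


print(item_namespace("urn://example.com/path/foo/"))
print(item_namespace("urn://example.com/play/foo/"))
```

Only the entity declaration differs between the two calls, yet every
name built on &FOO; moves with it, independent of where the document
itself is stored.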

Local namespaces merely need to be based upon a local file
identifier entity, call it DOCID.
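
One way this might look, again as a hedged Python sketch: the DOCID
value "urn:x-example:doc-1234/", the "local" prefix, and the child
elements are all hypothetical. FOO is declared relative to DOCID to
show entities structured relative to each other.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: DOCID identifies this file, and FOO is
# declared in terms of DOCID.
DOC = """<!DOCTYPE element [
<!ENTITY DOCID 'urn:x-example:doc-1234/'>
<!ENTITY FOO '&DOCID;foo/'>
]>
<element xmlns:local="&DOCID;" xmlns:foobar="&FOO;bar">
  <local:note/><foobar:item/>
</element>"""

root = ET.fromstring(DOC)
tags = [child.tag for child in root]
for tag in tags:
    print(tag)
```

Changing the single DOCID declaration would relocate both namespaces
together, while an entity declared with an absolute value would stay
fixed.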

I would argue that this is also a superior way to locate related
content resources in an XML document, recognizing that different 
types of files may be located using different schemes.  Where the 
resources involved are all content, such that identity is less
important than retrieval location, it does not hurt to additionally
be able to base things relative to the base location of the entity
itself, but even in this case, entities still extend that capability
well.

This also presents no real added difficulty for DOM.  Current versions
prevent entity changes during editing.  The proposed namespace support
leaves xmlns attributes in the tree, so that there is still a record
of whether they were composed of entity references, allowing the
serializer to preserve that.

In future contexts where entity modification is allowed, it is quite
likely that this cannot be done without regenerating the document
anyway.

Ray Whitmer
rayw@xmission.com
Received on Friday, 2 June 2000 10:49:51 UTC