Having your strings and identity too.... from Jonathan Robie on 2000-05-15 (xml-uri@w3.org from May 2000)

From: Jonathan Robie <Jonathan.Robie@SoftwareAG-USA.com>
Date: Mon, 15 May 2000 17:09:18 -0500
To: xml-uri@w3.org
Message-Id: <4.3.1.0.20000515161300.00c0a8e0@127.0.0.1>

Hi Tim,

As I understand it, you see namespace URIs as central to your concept of 
the Semantic Web.

I was fairly swamped during most of the discussion on the Plenary, so I am 
not certain whether my concerns are the same ones others in the XML 
activity have raised. I also do not think that my goals are completely at 
odds with what you are looking for. That said, here are some design 
criteria that I suspect may be important:

Desiderata: Fast and Simple
===========================

1. Finding an XML name must be fast. Whatever we decide, it must be 
possible to find the name of an element or an attribute without incurring 
network latency.

2. A name is just a name, and namespaces are used to disambiguate names. In 
programming languages that use namespaces, a namespace does not define 
semantics, it merely avoids name clashes. The Semantic Web and namespaces 
should be able to peacefully coexist, but they are not the same thing, and 
a name does not define semantics (though semantics can be associated with a 
name). Note that namespaces can exist in the absence of DTDs or Schemas, 
too - a name tells you neither the structure nor the semantics, though 
either can be defined using names that have been defined with namespaces. I 
think this is important because forcing a name to be a combination of a 
name and something else leads to designs that I find awkward and overloaded 
for too many purposes, and such designs tend to be neither fast (see #1) 
nor elegant.

3. There should be only one answer to the question, "What is the name of 
this element?" Names should not change when I copy a file from one 
subdirectory to another. Suppose I make an identical copy of a file that 
uses relative URIs as namespace URIs. Does that mean that the names depend 
on the resolution of the relative URI, and the names in the two documents, 
which are byte-for-byte identical, are completely different? The current 
InfoSet draft seems to suggest that this is one legitimate way to interpret 
the names in these documents - one of two! Imagine the surprise of a user 
who issues a query against three different data sources containing 
identical data, and obtains different results depending on the physical 
location of the data source, and which of the two valid InfoSet 
interpretations is used. This does not seem to be tenable.
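
A minimal sketch of the concern in point 3, using only Python's standard 
library. The relative namespace URI "schemas/order", the local name, and 
the two document locations are made up for illustration; the two functions 
stand in for the two interpretations discussed above and are not taken 
from the InfoSet draft.

```python
from urllib.parse import urljoin

# Hypothetical values: the same namespace declaration appears in two
# byte-for-byte identical copies of a document.
NAMESPACE_ATTR = "schemas/order"   # a relative namespace URI (made up)
LOCAL_NAME = "invoice"

# Hypothetical locations of the two identical copies.
BASE_A = "http://example.org/a/doc.xml"
BASE_B = "http://example.org/b/doc.xml"


def literal_name(ns, local):
    """Interpretation 1: the namespace name is the string as written."""
    return (ns, local)


def resolved_name(base, ns, local):
    """Interpretation 2: the namespace name is the URI obtained by
    resolving the written string against the document's location."""
    return (urljoin(base, ns), local)


literal_a = literal_name(NAMESPACE_ATTR, LOCAL_NAME)
literal_b = literal_name(NAMESPACE_ATTR, LOCAL_NAME)
resolved_a = resolved_name(BASE_A, NAMESPACE_ATTR, LOCAL_NAME)
resolved_b = resolved_name(BASE_B, NAMESPACE_ATTR, LOCAL_NAME)

print(literal_a == literal_b)    # prints True: one name for both copies
print(resolved_a == resolved_b)  # prints False: the copies now disagree
```

Under the second interpretation the element's name depends on where the 
file happens to live, which is exactly the surprise described above.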


Keeping URIs Fast and Simple
============================

In your message, you say you want namespace URIs to be referenceable, e.g. 
by XLink or RDF, allowing endorsements, digital signatures, semantic tips 
to allow external processing, etc. This raises two important questions:

Q1: Are relative URIs allowed?
Q2: Must URIs be resolved for system integrity; i.e., before comparing two 
names, must I resolve their namespace URIs to determine if they identify 
the same resource?

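I believe that answering "yes" to either Q1 or Q2 leads to a violation of 
principles 1-3: relative URIs undermine the single answer required by 
principle 3, and mandatory resolution incurs the network latency ruled out 
by principle 1. Either way, we must give up either the integrity of names 
or fast resolution of names.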
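
As a rough illustration of what answering "no" to both questions buys, the 
sketch below compares names as plain (namespace URI, local name) pairs. It 
relies on Python's xml.etree.ElementTree, which treats the namespace URI as 
an opaque string and never dereferences it. The sample documents and the 
helper expanded_name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: same absolute namespace URI under different
# prefixes (DOC_1, DOC_2), and a different namespace URI (DOC_3).
DOC_1 = '<a:order xmlns:a="http://example.org/orders"/>'
DOC_2 = '<b:order xmlns:b="http://example.org/orders"/>'
DOC_3 = '<a:order xmlns:a="http://example.org/other"/>'


def expanded_name(xml_text):
    """Return (namespace URI, local name) for the root element.

    ElementTree reports namespaced tags as "{uri}local"; no network
    access is involved in producing or comparing them.
    """
    tag = ET.fromstring(xml_text).tag
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return (uri, local)
    return ("", tag)  # element is in no namespace


name_1 = expanded_name(DOC_1)
name_2 = expanded_name(DOC_2)
name_3 = expanded_name(DOC_3)

print(name_1 == name_2)  # prints True: prefixes differ, names do not
print(name_1 == name_3)  # prints False: namespace strings differ
```

Comparison here is a string comparison, so it is fast (principle 1) and 
gives one answer regardless of where the documents are stored 
(principle 3).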

In your post, you suggest that:

   There are those who would maintain that a
   namespace should have no semantics, but I
   would say that then documents will have no
   semantics, and will be useless to man or
   machine. [You can go through the philosophical
   process of defining all semantics in terms of
   syntactic operations, of course....]

I do not follow this argument. To me, the Semantic Web does not require 
that the association of semantics with names identifiable within a 
namespace be part of the namespace mechanism per se, as long as namespaces 
establish unique names which can then be used by systems that make 
assertions to build semantic networks. In fact, I believe that the creator 
of a namespace is unlikely to know the semantics that other people may 
later choose to apply to a name, so I think that we should keep names and 
semantics orthogonal. A name, by itself, has no semantics, but it does have 
identity, and this identity can be used to build systems that *do* have 
semantics.

You mention that "we all as engineers stand a chance of ending up in court 
giving testimony as to what an electronic transaction did or did not mean". 
That makes me suspect that I would like that meaning to be carefully 
defined by a lawyer working with business executives from various 
companies, not by me, though I might well define the namespace and the set 
of names to which they will refer when adding these semantics. Of course, 
our internal 
accountants might define a rather different set of semantics for the same 
namespace, and a governmental regulator may have a third set of semantics. 
Some parties may not wish to disclose their complete semantics to other 
parties using the same namespaces. Being able to use an absolute URI to 
look up the various semantics that are defined for a namespace seems to 
make sense, but I do not yet see the need for more complex solutions.
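
One way to picture this separation is a lookup keyed by name, where each 
party records its own semantics without touching the name itself. The 
sketch below is only an illustration under that assumption; the class 
SemanticsRegistry, the party labels, and the example namespace URI are all 
hypothetical.

```python
from collections import defaultdict


class SemanticsRegistry:
    """Associates semantics with names from the outside.

    A name is a (namespace URI, local name) pair; its identity does not
    depend on anything stored here.
    """

    def __init__(self):
        self._by_name = defaultdict(dict)

    def assert_meaning(self, party, name, meaning):
        """Record what `party` says `name` means."""
        self._by_name[name][party] = meaning

    def meanings(self, name):
        """Return a copy of all recorded meanings for `name`."""
        return dict(self._by_name.get(name, {}))


# Hypothetical name defined by whoever created the namespace.
TOTAL = ("http://example.org/orders", "total")

registry = SemanticsRegistry()
registry.assert_meaning("legal", TOTAL, "amount contractually owed")
registry.assert_meaning("accounting", TOTAL, "revenue before tax")
registry.assert_meaning("regulator", TOTAL, "reportable transaction value")

found = registry.meanings(TOTAL)
print(sorted(found))  # prints ['accounting', 'legal', 'regulator']
```

The name stays the same pair of strings throughout; only the assertions 
made about it differ from party to party.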


Jonathan
Received on Monday, 15 May 2000 17:07:30 UTC