Re: Significant W3C Confusion over Namespace Meaning and Policy from Patrick Stickler on 2005-02-18 (www-tag@w3.org from February 2005)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Fri, 18 Feb 2005 09:59:58 +0200
To: "ext Elliotte Harold" <elharo@metalab.unc.edu>
Cc: paul.downey@bt.com, ext John Boyer <JBoyer@PureEdge.com>, "'ht@inf.ed.ac.uk'" <ht@inf.ed.ac.uk>, "Bullard, Claude L (Len)" <len.bullard@intergraph.com>, ext Harry Halpin <hhalpin@ibiblio.org>, derhoermi@gmx.net, www-tag@w3.org
Message-Id: <601e5457313342acde766426c251a034@nokia.com>
On Feb 17, 2005, at 13:33, ext Elliotte Harold wrote:

>
> Patrick Stickler wrote:
>
>> Rather RDDL, or rather, the idea of namespace documents, should just
>> be dropped.
>> Getting back a namespace document which identifies all of the
>> versions of all of the vocabularies, schemas, ontologies, etc.
>> employing terms from that namespace is useless.
>> How is an application to know *which* version of *which* model to
>> apply in order to interpret the term in question? It can't, because
>> a given namespace is not tied to a specific version of a specific
>> model -- insofar as the specifications are concerned.
>
> You're so focused on ontologies and models and machine understanding
> that you've managed to completely miss the real point of RDDL.

I don't think so. See comments below.

> It has relatively little to do with machine understanding (though I 
> don't think it's quite as useless in that arena as you do), but even 
> if we grant that RDDL has no purpose for machine comprehension and 
> processing, it still serves two important purposes very nicely:
>
> 1. It keeps the error logs from filling up with 404s.
>
> 2. It lets *people* learn something about the namespace, including the 
> various versions of various vocabularies, schemas, ontologies, etc. 
> employing terms from that namespace.
>
> It's not all about machines. In fact, RDDL was invented primarily 
> because humans were having trouble with this stuff. Machine processing 
> was an afterthought.
>

I'm all for folks finding the information they need (heck, that's the
focus of most of my work -- the machines are just intermediary tools
for humans to accomplish more meaningful tasks).

My issues with RDDL and namespace documents are in the context of
web architecture. IMO namespace documents, and technologies such as
RDDL, can be useful tools to help with information discovery *BUT*
they should not be pushed as globally scalable parts of the
fundamental web architecture.

Namespace documents are not backwards compatible with existing
use of namespace names -- since one cannot impose a reinterpretation
of what those namespace name URIs identify such that they would
identify namespace documents.

Also, from a management perspective, it's not scalable. Each time
a new model or a new version of a model is defined which uses some
term, one would have to go back and modify the namespace document
for the namespace name of that term -- but what if the owner of
the model is not the same as the owner of the namespace document?
And for namespaces containing very widely used terms, the namespace
document could become unmanageably complex.

E.g. I use terms such as rdfs:label and dc:identifier in various
application specific models -- but I would not expect the W3C
or DCMI to update their namespace documents to refer to my models!

And my models may assert information about those resources (e.g.
presentation labels in some new language, or tighter range
constraints) which is specific to my application: compatible with
the assertions made by the term owners, but not expected to be
relevant (or even valid) for all applications using those terms as
defined by the term owners. Yet that information is critical to
the proper, intended interpretation of data instances produced
per my specific models.

Namespace documents seem like a good idea in contexts where there
is a 1:1 relationship between namespace and model or namespace
and vocabulary, but they will not scale as would be needed for
a truly global solution.
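
To make the scaling point concrete, here is a minimal sketch in Python.
The model URIs and their term lists are hypothetical, invented only for
illustration. It inverts a model-to-terms table into a term-to-models
index, showing that a widely used term maps to many candidate models
while a declared model URI names exactly one.

```python
from collections import defaultdict

# Hypothetical models (not real resources), each listing the terms it employs.
MODELS = {
    "http://example.org/models/catalog/1.0": {"rdfs:label", "dc:identifier"},
    "http://example.org/models/catalog/2.0": {"rdfs:label", "dc:identifier", "dc:title"},
    "http://example.org/models/inventory/1.0": {"dc:identifier"},
}


def models_by_term(models):
    """Invert model -> terms into term -> set of models using that term."""
    index = defaultdict(set)
    for model_uri, terms in models.items():
        for term in terms:
            index[term].add(model_uri)
    return index


index = models_by_term(MODELS)
for term in sorted(index):
    # Starting from a term alone, every model in the set is a candidate.
    print(term, "->", len(index[term]), "candidate model(s)")
```

Every new model or version adds another entry under each term it uses,
which is the list a namespace document would have to keep current.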

The solution to the versioning problem (both for humans and machines)
is to focus on the models, not the terms, and to provide consistent
mechanisms for agents producing content to communicate the model(s)
which should be used to interpret that content -- and whatever solution
emerges should have a much stronger opportunity to become globally
ubiquitous than one which forces reinterpretation of existing URIs
or relies on sparsely consistent practice (which is the case with
namespace documents).

E.g. something analogous to <?xml-stylesheet ...?> is an option:

<?xml-model href="http://some.org/some/model/some/version"?>

And yes, RDDL could be used for one of the representations
of such a specified model -- as well as other representations
expressed in e.g. RDF, OWL, XML Schema, RELAX NG, etc.

This should do just fine for XML, RDF, OWL, and any other XML
serialized application.
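
As a rough sketch of how a receiving agent might read such a
declaration, the following Python uses the standard library's
xml.dom.minidom to collect the href of each top-level xml-model
processing instruction. The instruction itself is the hypothetical one
proposed above; the function name, the sample document, and the
regex-based parsing of the href pseudo-attribute are assumptions made
for illustration.

```python
import re
from xml.dom import minidom


def model_uris(xml_text):
    """Return the href of every top-level <?xml-model ...?> instruction."""
    doc = minidom.parseString(xml_text)
    uris = []
    for node in doc.childNodes:
        is_pi = node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        if is_pi and node.target == "xml-model":
            # PI data is free text, so the pseudo-attribute is parsed by hand.
            match = re.search(r'href\s*=\s*(["\'])(.*?)\1', node.data)
            if match:
                uris.append(match.group(2))
    return uris


# Sample instance; the document content is made up for this example.
sample = (
    '<?xml version="1.0"?>\n'
    '<?xml-model href="http://some.org/some/model/some/version"?>\n'
    '<doc xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<dc:identifier>x</dc:identifier></doc>'
)
print(model_uris(sample))
```

The namespace declaration on the root element plays no part in the
lookup; only the declared model URI is consulted.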

It also helps the "islands of XML" problem where different
XML languages are intermixed (e.g. XHTML+MathML+SVG) such
that the model identified constitutes a particular application
of those XML languages where the rules for conflict resolution
and other integration issues are defined for the model, and
applications supporting the model will then know how to
properly interpret the data instance.

If there emerges a consistent framework for identifying
conflicts of interpretation between multiple XML languages
in a single instance and resolution instructions/policies
can be expressed in a formal manner, then identifying the
model explicitly by URI also allows for publishing formally
expressed resolution instructions via that URI for XML
processors to employ when interpreting data instances
produced according to that model.

And it is far better for users (and machines) to dereference the URI
of the model which actually governs the interpretation of the data
instance than any number of namespace URIs, which serve merely as
syntactic components of terms and force the user or agent to guess
which version(s) of which model(s), among those that might be
described by a namespace document (if they're lucky), is most
appropriate.

It also allows versioning to be easily (yes, easily) addressed
by simply using distinct URIs for each version of each model,
and then, yes, good practice would be for each owner of each
model to clearly document versioning policies for their model
and how backwards compatibility relates to version changes.

And to the receiving agent, it is crystal clear which version
of which model to employ to interpret the data instance,
and namespaces can be left at the syntax layer where they
belong -- and where they do a good job of avoiding naming
collisions.
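
A minimal sketch of that dispatch, in Python. The two versioned model
URIs and the interpretation rules attached to them are hypothetical,
chosen only to show that a distinct URI per version lets the receiver
select an interpreter by exact match, with no guessing.

```python
# Hypothetical versioned model URIs, each bound to its own interpretation rule.
INTERPRETERS = {
    "http://some.org/some/model/1.0": lambda record: record["label"],
    "http://some.org/some/model/2.0": lambda record: record["label"].strip().title(),
}


def interpret(model_uri, record):
    """Interpret a record under the exact model version it declares."""
    try:
        handler = INTERPRETERS[model_uri]
    except KeyError:
        # An unknown model is reported, never silently guessed at.
        raise LookupError(f"no interpreter registered for model {model_uri!r}") from None
    return handler(record)


record = {"label": " red car "}
print(repr(interpret("http://some.org/some/model/1.0", record)))
print(repr(interpret("http://some.org/some/model/2.0", record)))
```

The same record yields different results under the two versions, which
is why the producer needs to state the model rather than leave the
receiver to infer it from the terms.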

So, in short, namespace documents and RDDL are IMO searching
for one's lost keys under the street light because that's where
one can see. They miss the essential points that (a) namespaces
really are just syntax, (b) from a management perspective,
trying to manage references to related resources in a single
document is highly impractical and non-scalable, and
(c) applications need to know the higher level models by which
to interpret data instances, and the one-to-many mapping from
term to model makes discovery via the term impractical at
best -- and certainly not a solution to be considered part of the
fundamental web architecture.

Regards,

Patrick
Received on Friday, 18 February 2005 08:02:14 UTC