RE: Significant W3C Confusion over Namespace Meaning and Policy from noah_mendelsohn@us.ibm.com on 2005-02-16 (www-tag@w3.org from February 2005)

From: <noah_mendelsohn@us.ibm.com>
Date: Tue, 15 Feb 2005 22:25:41 -0500
To: "Dare Obasanjo" <dareo@microsoft.com>
Cc: "ext Norman Walsh" <Norman.Walsh@Sun.COM>, "Patrick Stickler" <patrick.stickler@nokia.com>, www-tag@w3.org
Message-ID: <OFBEDA30E8.7607F7DF-ON85256FAA.00111F57@lotus.com>

Dare Obasanjo asks:

>> So how does one identify a vocabulary?

This is a very important open problem, I think.  I also think that the 
answers only occasionally have to do with namespaces.  My intuition is 
that we can start seeing the dim outline of an answer in the distinction 
that W3C XML Schema makes between schemas [1] and schema documents [2]. 

Schema documents describe at most one namespace, and are what people often 
think of when they consider the schema language.  They serve the same role 
for XML Schema that Java source files serve for the Java source language. 
Note that a .java file contributes to at most one Java package.

A schema [1] is the collection of definitions used for validating an 
instance. It is completely flat with respect to namespaces, and it 
effectively defines (or bounds) the legal instances of a vocabulary.  The 
element and attribute declarations employed are each separately labeled 
with the namespace of which they are a part.  The analogy with Java is 
again moderately good:  while you can collect your source and class 
definitions into packages for a variety of good reasons, once a program 
runs they all participate symmetrically;  the link to base types or 
methods across packages is the same as within a package.

The rough analogies are (regarding namespaces, Java, and Schema):

Namespace<=>Package<=>Schema Doc  (all are packages of related definitions 
for use in larger structures)

Vocabulary<=>Java Program<=>Schema  (the net assembly used for a 
particular purpose)
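
To make the distinction concrete, here is a rough sketch in Python.  It is 
an illustration only, not the XML Schema component model:  the names 
SchemaDocument, Declaration and assemble_schema, and the example.org 
namespaces, are all made up for the example, and rejecting duplicate 
declarations is a simplifying assumption.  Each schema document carries at 
most one target namespace, while the assembled schema is a single flat 
table in which every declaration is labeled with its own namespace.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Declaration:
    """One element or attribute declaration (hypothetical model)."""
    namespace: str  # namespace the declaration is labeled with
    name: str       # local name
    kind: str       # "element" or "attribute"


@dataclass
class SchemaDocument:
    """A schema document: contributes to at most one namespace."""
    target_namespace: str
    declarations: list = field(default_factory=list)

    def declare(self, name, kind="element"):
        # Every declaration inherits the document's single target namespace.
        decl = Declaration(self.target_namespace, name, kind)
        self.declarations.append(decl)
        return decl


def assemble_schema(documents):
    """Flatten schema documents into one table keyed by (kind, namespace, name)."""
    schema = {}
    for doc in documents:
        for decl in doc.declarations:
            key = (decl.kind, decl.namespace, decl.name)
            if key in schema:
                # Simplifying assumption: treat any duplicate as an error.
                raise ValueError(f"duplicate declaration: {key}")
            schema[key] = decl
    return schema


# Two schema documents, one namespace each (hypothetical namespaces).
orders = SchemaDocument("http://example.org/orders")
orders.declare("purchaseOrder")

parts = SchemaDocument("http://example.org/parts")
parts.declare("part")
parts.declare("sku", kind="attribute")

# The schema is flat: declarations from both namespaces sit side by side.
schema = assemble_schema([orders, parts])
```

Once assembled, a lookup in schema does not care which document a 
declaration came from, much as a running Java program does not care which 
package a class was compiled in.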

As you know Dare, some of the versioning proposals I've made [3] are based 
on this notion of vocabulary;  they discuss evolution of schemas as 
distinct from schema documents.

There is still a big piece of the puzzle missing:  right now a schema is 
an artifact that can often be inferred from a suitable collection of Web 
Resources (schema docs) and sometimes from other additional information 
(you don't have to put your schema definitions in a schema document any 
more than a Java class loader has to get its class definitions from the 
typical .class files).  What we don't have is a good way to assign a 
single URI that would properly make a web resource for each XML schema. 
This would be like creating a URI for each whole Java program, regardless 
of which dynamic linking policies were used to resolve inter-class 
references.  It's much easier to make URIs for the pieces than for the 
whole, unless the whole is statically linked (it isn't).

The Schema WG has been aware of this problem for a long time.  It's been 
actively studied in the course of our work on Schema Component 
Identifiers.  Many of us think that some sort of RDDL-like collection 
document may be the answer, but I don't know anyone who yet thinks they 
have the details right.  When we get such a thing, I suspect it will be a 
significant step toward answering your question:  how does one identify a 
vocabulary?  My guess is that the answer will be:  with a lot of 
documentation, some important piece of which may be in the form of an XML 
Schema (or RelaxNG Schema or whatever).

Noah

[1] http://www.w3.org/TR/2004/PER-xmlschema-1-20040318/#key-schema
[2] http://www.w3.org/TR/2004/PER-xmlschema-1-20040318/#key-schemaDoc
[3] http://lists.w3.org/Archives/Public/www-tag/2004Aug/0010.html

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Wednesday, 16 February 2005 03:28:50 UTC