- From: <noah_mendelsohn@us.ibm.com>
- Date: Tue, 15 Feb 2005 22:25:41 -0500
- To: "Dare Obasanjo" <dareo@microsoft.com>
- Cc: "ext Norman Walsh" <Norman.Walsh@Sun.COM>, "Patrick Stickler" <patrick.stickler@nokia.com>, www-tag@w3.org
Dare Obasanjo asks:

>> So how does one identify a vocabulary?

This is a very important open problem, I think. I also think that the answers only occasionally have to do with namespaces. My intuition is that we can start seeing the dim outline of an answer in the distinction that W3C XML Schema makes between schemas [1] and schema documents [2].

Schema documents describe at most one namespace, and are what people often think of when they consider the schema language. They serve the same role for XML Schema that Java source files serve for the Java source language. Note that a .java file contributes to at most one Java package.

A schema [1] is the collection of definitions used for validating an instance. It is completely flat with respect to namespaces, and it effectively defines (or bounds) the legal instances of a vocabulary. The element and attribute declarations employed are each separately labeled with the namespace of which they are a part. The analogy with Java is again moderately good: while you can collect your source and class definitions into packages for a variety of good reasons, once a program runs they all participate symmetrically; the link to base types or methods across packages is the same as within a package.

The rough analogies are (regarding namespaces, Java, and Schema):

Namespace <=> Package <=> Schema Doc
   (all are packages of related definitions for use in larger structures)

Vocabulary <=> Java Program <=> Schema
   (the net assembly used for a particular purpose)

As you know, Dare, some of the versioning proposals I've made [3] are based on this notion of vocabulary; they discuss evolution of schemas as distinct from schema documents.
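
To make the schema vs. schema document distinction a bit more concrete, here is a rough sketch in Python. It is only an illustration, not anything taken from the XML Schema Recommendation: the SchemaDocument class, the assemble_schema function, and the example.org namespaces are all invented for this example, and real schema assembly involves far more than element names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaDocument:
    """Hypothetical stand-in for a schema document: in this sketch it
    contributes element declarations to exactly one target namespace."""
    target_namespace: str
    element_names: tuple


def assemble_schema(documents):
    """Flatten several schema documents into one 'schema': a single map
    keyed by (namespace, local name), with no nesting by namespace."""
    schema = {}
    for doc in documents:
        for name in doc.element_names:
            key = (doc.target_namespace, name)
            if key in schema:
                raise ValueError(f"duplicate declaration: {key}")
            # Remember which document supplied the declaration.
            schema[key] = doc
    return schema


# Invented namespaces and element names, for illustration only.
po = SchemaDocument("http://example.org/po", ("purchaseOrder", "item"))
addr = SchemaDocument("http://example.org/address", ("address",))

# The assembled schema (the "vocabulary") spans both namespaces, yet each
# declaration still carries the namespace it belongs to.
schema = assemble_schema([po, addr])
namespaces = sorted({ns for ns, _ in schema})
```

Each SchemaDocument plays the role of a package of related definitions, while the dictionary returned by assemble_schema plays the role of the flat assembly actually used for validation.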

There is still a big piece of the puzzle missing: right now a schema is an artifact that can often be inferred from a suitable collection of Web resources (schema docs) and sometimes from other additional information (you don't have to put your schema definitions in a schema document any more than a Java class loader has to get its class definitions from the typical .class files). What we don't have is a good way to assign a single URI that would properly make a web resource for each XML schema. This would be like creating a URI for each whole Java program, regardless of which dynamic linking policies were used to resolve inter-class references. It's much easier to make URIs for the pieces than for the whole, unless the whole is statically linked (it isn't).

The Schema WG has been aware of this problem for a long time. It's been actively studied in the course of our work on Schema Component Identifiers. Many of us think that some sort of RDDL-like collection document may be the answer, but I don't know anyone who yet thinks they have the details right. When we get such a thing, I suspect it will be a significant step toward answering your question: how does one identify a vocabulary? My guess is that the answer will be: with a lot of documentation, some important piece of which may be in the form of an XML Schema (or RelaxNG Schema or whatever).

Noah

[1] http://www.w3.org/TR/2004/PER-xmlschema-1-20040318/#key-schema
[2] http://www.w3.org/TR/2004/PER-xmlschema-1-20040318/#key-schemaDoc
[3] http://lists.w3.org/Archives/Public/www-tag/2004Aug/0010.html

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Wednesday, 16 February 2005 03:28:50 UTC