RE: Question about xs:import from noah_mendelsohn@us.ibm.com on 2002-09-25 (xmlschema-dev@w3.org from September 2002)

From: <noah_mendelsohn@us.ibm.com>
Date: Wed, 25 Sep 2002 10:56:11 -0400
To: Mark Feblowitz <mfeblowitz@frictionless.com>
Cc: jeni@jenitennison.com, mrowell@openapplications.org, Ryan.Barr@ejgallo.com, xmlschema-dev@w3.org
Message-ID: <OF4E91E099.097EBAD2-ON85256C3F.00507397@lotus.com>

Mark Feblowitz writes:

>> I assume that there's a reason (rooted in 
>> theory) for import not to be treated as an 
>> include from another namespace. 

Well, I can point you in the right general direction, but this is surely a 
subtle area with many tradeoffs.  The schema WG spent months debating 
different approaches to, and priorities among, the various use cases for 
location and composition of schemas.  The bottom line is that we came down 
to one unifying principle:  the choice of specific schemas (and thus schema 
documents, which are the XML document form of a schema) is ALWAYS at the 
complete discretion of the application and/or processor doing the 
validation.  In the end, validation is providing a service to that 
application. 

The most obvious first step is to understand why schemaLocation in an 
instance document is only a hint.  Consider an eCommerce scenario.  I'm 
validating your instance in part because I don't trust you.  I want to 
make sure you wrote a purchase order the way I want it.  What would it 
mean to trust you to choose the schema to use?

Similarly, though perhaps less obviously, you will be picking up schema 
documents for various namespaces from various places.  Some may suggest 
schemaLocations for other namespaces, and indeed they are often useful; 
that's why we provided the mechanism.  On the other hand, there are both 
trust and version control issues.  Maybe you are pulling together 5 schema 
documents all of which for one reason or another reference a schema for, 
say, HTML.  Maybe you have a version of the HTML schema that has a bug 
fix.  Should you have to edit those 5 other schemas?  Not necessarily. You 
can just tell your processor which version of the HTML schema to use.
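
As a rough illustration of "telling your processor which version to use", 
here is a minimal sketch in Python.  It is not taken from any real schema 
processor:  the function name, the override table and the file paths are 
all hypothetical.  The only idea it shows is that a location chosen by the 
application wins over whatever schemaLocation an importing schema suggests.

```python
# Hypothetical sketch: the application, not the importing schema documents,
# decides which schema document is loaded for each namespace.

XHTML_NS = "http://www.w3.org/1999/xhtml"


def choose_schema_location(namespace, hint, overrides):
    """Return the location to load for `namespace`.

    `hint` is the schemaLocation suggested by an xs:import (or None).
    `overrides` maps namespace URIs to locations the application trusts.
    An override always wins; otherwise the hint is used as given.
    """
    if namespace in overrides:
        return overrides[namespace]
    return hint


# The application registers its bug-fixed copy once (assumed path).
overrides = {XHTML_NS: "file:///schemas/xhtml-fixed.xsd"}

# Five schema documents each suggest their own location for the same namespace.
hints = ["http://example.org/schemas/xhtml-%d.xsd" % i for i in range(5)]
chosen = {choose_schema_location(XHTML_NS, h, overrides) for h in hints}
```

None of the five importing schemas has to be edited:  every hint for that 
namespace resolves to the single copy the application picked.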

So, what do you do if you really do want the schemaLocations honored? 
Simple:  make sure to use processors that allow you to say "always follow 
the hint, and if it doesn't resolve, fault."  If in some community with 
which you work you want that to be the rule you can say:  "The XXX 
standard uses XML schema, but requires that schemaLocation hints be 
followed."  Another model that I expect will be common will be:  "I will 
give the processor an expected list of schemaLocations.  Make sure that 
the documents and other schemas have schemaLocations that match my 
expectations:  if they do, we all are signalling that we agree on the 
schema to be used.  If not, fail." 
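
The two policies just described could be sketched along these lines.  This 
is only an illustration under stated assumptions:  `SchemaLocationError`, 
both function names and the `fetch` callback are invented for the example 
and do not correspond to any particular processor's API.

```python
class SchemaLocationError(Exception):
    """Raised when a schemaLocation hint violates the chosen policy."""


def follow_hint_or_fault(namespace, hint, fetch):
    """Policy 1: always follow the hint, and fault if it does not resolve.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that returns the schema document for a location or raises OSError.
    """
    if hint is None:
        raise SchemaLocationError("no schemaLocation hint for %s" % namespace)
    try:
        return fetch(hint)
    except OSError as exc:
        raise SchemaLocationError(
            "hint %r for %s did not resolve" % (hint, namespace)
        ) from exc


def check_against_expected(namespace, hint, expected):
    """Policy 2: the hint must match the location the application expects.

    `expected` maps namespace URIs to the schemaLocations the application
    was told to accept.  A match means both sides agree on the schema;
    anything else is treated as a failure.
    """
    if namespace not in expected or expected[namespace] != hint:
        raise SchemaLocationError(
            "hint %r for %s does not match expectations" % (hint, namespace)
        )
    return hint
```

Either function can be called at the point where a processor encounters a 
schemaLocation, whether in an instance document or on an xs:import.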

So, the spec allows you and communities of your friends to use software 
that does what you want.  This is like saying that the C or Java 
programming languages allow you to agree on implementations that pull 
source or .class files out of Unix filesystem directories, which means you 
can send around tar images, but nothing prevents you from building 
language implementations that keep the source and class files in, say, a 
relational database.  So, we let our users set rules, but we don't require 
that they all choose the same rules.

Everyone agrees this is an area with many subtleties, and that any choice 
we made would involve compromises.  I personally remain convinced that in 
this area we have chosen a reasonable 80/20 point, on top of which one can 
implement the sorts of behavior that you are looking for.  Thank you.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

Received on Wednesday, 25 September 2002 11:00:08 UTC