RE: XSD feature check-lists from noah_mendelsohn@us.ibm.com on 2005-08-04 (xmlschema-dev@w3.org from August 2005)

From: <noah_mendelsohn@us.ibm.com>
Date: Thu, 4 Aug 2005 11:45:44 -0400
To: "Michael Kay" <mike@saxonica.com>
Cc: "'Pete Cordell'" <petexmldev@tech-know-ware.com>, xmlschema-dev@w3.org
Message-ID: <OFF6412390.FC34C6E2-ON85257053.0054C77E-85257053.0056B71A@lotus.com>

Pete Cordell writes:

> While I see your point, I feel that XML schema is
> about 10 to 20 times more complex than an XML
> parser (a rough estimate based on implementation
> experience). 

I'm not defending the complexity of XML Schema.  A number of features were 
put in that I thought were intended as experimental during the CR period, 
and we were so far behind schedule that time was never made to trim it 
down (my personal opinion only).  It's more complex and thus more 
expensive to implement than it should be.  The spec. is ultimately quite 
precise on most points, but takes a lot of energy and patience to learn to 
read (unfortunate). 

> Therefore, it isn't necessarily
> appropriate to extrapolate what has 
> worked well for XML to XML schema.

I'm not worried so much about the implementers as the users.  XML Schema 
is no more complex to implement well than languages like, say, Java. Would 
you feel good if JavaSoft promoted a checklist so that vendors could say, 
I don't support:

        _ for loops
        _ interfaces
        _ casts
        _ protected

etc.?  There would be pandemonium.  Nobody could write a Java program 
that would work in more than one place.  Of course, in a limited community 
of those who are part way along in their implementations, such lists are 
very useful for interop testing.  They are the last thing that users want 
to see vendors promoting in commercial products. 

It's not that schema is as simple as XML, it's that the goal is the same: 
both are used for interoperation across dynamically changing sets of 
organizations using whatever software those organizations use in the 
moment.  For that to work, each schema must mean the same thing wherever 
it's used.  We have made a few (controversial) exceptions mostly having to 
do with latitude as to where schema documents are located, and implicitly, 
which ones are trusted.  Otherwise, a given schema must mean the same 
thing in all implementations, just as the same Java program must mean the 
same thing in all conforming Java VMs.   I think that's very important.

Mike Kay writes:

> I think there are a number of processors 
> that are very close to complete
> conformance, 

which suggests to me that it's now reasonable to expect such nearly 
complete conformance, at least from implementations marketed on a large 
scale by major suppliers (commercial or open source) for general purpose 
use.  No doubt, if the spec were simpler, there would be more such 
implementations built by smaller teams with less funding, and that would 
have been a very good thing.  I have very little sympathy for the well 
funded organizations that do schema as their day jobs and who, after a 
number of years, are failing to provide the levels of conformance that 
Mike and others are showing to be practical.  As Mike says, a fairly high 
degree of conformance is now being shown to be practical through multiple 
implementations in multiple organizations. 

It's not the W3C's business to bless or diss particular implementations, 
but I think it would be really good if the user community could find ways 
of publicizing which implementations are complete and correct and which 
aren't.  I'm sure the W3C will be glad to help arbitrate factual questions 
as to what behavior is correct.  I think the energy spent in that 
direction will be far more beneficial than energy spent providing standard 
documentation for the errors in broken implementations.  As Mike says, the 
remaining issues in the more respectable implementations tend to be rather 
tricky (though sometimes significant) edge cases that if listed would 
result in a table that few novices would understand. 

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Thursday, 4 August 2005 15:46:03 UTC