Re: determine root element in the xml from schema from noah_mendelsohn@us.ibm.com on 2004-03-09 (xmlschema-dev@w3.org from March 2004)

From: <noah_mendelsohn@us.ibm.com>
Date: Mon, 8 Mar 2004 19:06:19 -0500
To: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>
Cc: "'Lingzhi Zhang'" <lzhang@cse.ogi.edu>, Michael Kay <mhk@mhk.me.uk>, Mik Lernout <mik@futurestreet.org>, "'dev xmlschema'" <xmlschema-dev@w3.org>
Message-ID: <OFF993A876.2D8AC68C-ON85256E51.0082C792@lotus.com>

I'd like to add one other thing to this discussion:

Even when you expect a complete document to have a predetermined root 
element, it seems very important to allow for incremental validation of 
portions of such a document.  So, even if we had an <xsd:root 
name="elementName"/>, or some such, we would still have had to allow a 
mode for processors in which they ignored the root designation and began 
validation with some other element or complex type. 

Once you get that far in the design space, you realize that processors 
need the ability to ignore the root designation in certain cases anyway 
(i.e., for the incremental or partial validations).  The question then 
becomes:  when the application does want to enforce a check of the root, 
should we add extra mechanism allowing the root to be specified in the 
schema language, or should we rely on the application to check the root 
itself (or pass the name into the schema processor, which boils down to 
the same thing)?

As Michael Sperberg-McQueen points out, most any application that consumes 
a particular vocabulary knows the root it's expecting perfectly well, and 
in the alternate design that application is still going to have to tell 
the processor to enable root checking.  Passing the known name of the root 
into the processor, or checking it in the application, seems trivial. 
Thus, the only applications we would help with the <xsd:root> mechanism 
would be generalized containers, such as document managers, which would 
then know that they were indeed storing whole documents and not fragments. 
We made the decision that adding mechanism to deal with this use case 
didn't quite make an 80/20 cut, given the resulting complexity of having 
to run a processor in two modes, one of which would ignore the new 
mechanism anyway.  I continue to see that as a close call, and certainly 
not the sort of big security flaw implied in this thread.  It is true that 
we have lost the ability to use schemas in a standard way to ensure that 
document management containers are storing only whole documents and not 
fragments.  Then again, we've gained the ability for query and other 
systems to validate elements out of context.  I can easily live with this 
tradeoff.
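
To illustrate how little is involved in checking the root in the 
application, here is a rough sketch in Python using only the standard 
library.  The function name root_matches, the element name "order" and 
the namespace "urn:example" are all made up for illustration; a real 
application would do this check alongside whatever schema processor it 
uses, and nothing here is tied to any particular processor's API.

```python
import xml.etree.ElementTree as ET


def root_matches(xml_text, expected_local_name, expected_namespace=None):
    """Return True if the document's root element has the expected name.

    Hypothetical helper: the application, not the schema, supplies the
    root name it expects.
    """
    root = ET.fromstring(xml_text)
    # ElementTree writes qualified names as "{namespace}local".
    if root.tag.startswith("{"):
        namespace, local = root.tag[1:].split("}", 1)
    else:
        namespace, local = None, root.tag
    return local == expected_local_name and namespace == expected_namespace


# A whole document with the expected root passes the check.
whole = '<o:order xmlns:o="urn:example"><o:item/></o:order>'
# A fragment may be valid against some element declaration in the schema,
# but it is not the root this application is expecting.
fragment = '<o:item xmlns:o="urn:example"/>'

whole_ok = root_matches(whole, "order", "urn:example")
fragment_ok = root_matches(fragment, "order", "urn:example")
```

An application that wants fragment validation simply skips the call, 
which is the "two modes" choice made outside the schema language.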

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Monday, 8 March 2004 19:07:55 UTC