Re: Major problem with schema needs immediate attention.

John,

Your use cases are exactly why, in our implementation, we added 
extension attributes that define how a particular instance must be 
validated. We simply could not accept that most instances in 
the model would be invalid. And as you point out, there is also a 
performance cost to validating instances that do not 
require validation at all.

Clearly, in my opinion, things are very broken at the moment in this 
respect.

However, I am not sure we should start specifying this in a way that is 
not compatible with what XSLT 2.0 does. What you are proposing below is 
a sort of pseudo-lax validation, and I may be wrong, but I don't think 
this is something that others have done.

The XSLT guys did all the thinking already, and I think we should 
leverage that thinking correctly or defer the issue.

-Erik

John Boyer wrote:
> 
> We have only three last call comments for which no resolution has been 
> made.
> One of them is a bit hard, and so we really need you guys to have a 
> close look at the issues in the next few days.
> 
> Please see issue 87 
> http://htmlwg.mn.aptest.com/cgi-bin/xforms-issues/Model?id=87;user=guest;statetype=1;upostype=-1;changetype=-1;restype=-1 
> 
> 
> I added some additional comments.
> 
> The problem happens because we say that we validate nodes against all 
> "applicable" schema declarations, but we do not rigorously define what 
> "applicable" means.  Many of us think we have an idea, but the idea I 
> have heard expressed does not actually work out very well in the context 
> of the XML schema algorithm itself, over which we have no control.  You 
> can get all the "applicable" schema declarations, to be sure, but the 
> only way to get them is to run validations that would invalidate most 
> forms that attempt to use schema.
> 
> The usual claim we make is that "all schemas are available to every 
> instance" because any instance could use elements from the namespace of 
> any schema. Unfortunately, XML Schema makes "strict" the default 
> mode, and I see nothing that overrides that unless the schema itself 
> declares a lax or skip mode for a particular element, OR the schema 
> engine exercises its option (it *may*) to lax validate an element's 
> content after strict validation has failed for lack of an appropriate 
> declaration for the element.
> 
> So here is a trivial example:
> 
> schema targetNamespace="A"
> schema targetNamespace="B"
> 
> instance <e xmlns="A"/>
> instance <f xmlns="B"/>
> 
> If we claim that all schemas apply to all instances, then because schema 
> validation is strict, you will find that instance A is invalid by schema 
> B and instance B is invalid by schema A.  
> Hence, you will not be able to submit any data simply because you 
> decided to use two instances in the same form!
> In both cases, the errors occur because element declarations cannot be 
> found.
> Yet, this type of error cannot be ignored because we do expect that if 
> element e must have children A and B only, then you will get an error if 
> you put in some element C for which the schema has no declaration.
> 
> We hit an even simpler case in the field quite some time ago involving 
> just one schema!
> schema targetNamespace="A"
> instance <e xmlns="A"/>
> instance <f xmlns=""/>
> 
> The second instance is always invalid even though there is really no 
> schema that one might reasonably conclude is "applicable".
> 
> So, cards on the table time.
> 
> We handle this issue by defining "applicable" as follows:  the namespace 
> of the root element of the instance must match the target namespace of 
> the schema.  By only applying that one schema, the processor is faster, 
> but it also doesn't step on the above landmines.  We took the view that 
> an instance of the form <A:a><B:b/></A:a> would correspond to a schema 
> for A that *included* the schema for B.  I think the screw case there is 
> the SOAP envelope for a web service, but I can't remember for sure and 
> it's 2:30am right now, so I'll have to get back to you later on that.
> 
> Meanwhile, maybe the above solution is satisfactory or maybe it isn't. 
> Please take the time to weigh in on this issue in the next day or two 
> so we can have the right kind of discussion on the list that will allow 
> us to close this on the next telecon.
> 
> Thank you,
> John M. Boyer, Ph.D.
> STSM: Lotus Forms Architect and Researcher
> Chair, W3C Forms Working Group
> Workplace, Portal and Collaboration Software
> IBM Victoria Software Lab
> E-Mail: boyerj@ca.ibm.com  
> 
> Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
> 
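For reference, the root-namespace rule John describes above ("the namespace 
of the root element of the instance must match the target namespace of the 
schema") can be sketched roughly as follows. This is only an illustration, 
not anyone's actual implementation: the function names are made up, schemas 
are represented by their target namespace strings alone, and no real schema 
validation is performed.

```python
import xml.etree.ElementTree as ET


def root_namespace(instance_xml):
    """Return the namespace URI of the instance's root element ('' if none)."""
    tag = ET.fromstring(instance_xml).tag
    # ElementTree writes qualified names as '{namespace-uri}localname'.
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


def applicable_schemas(instance_xml, target_namespaces):
    """Hypothetical helper: keep only the schemas whose target namespace
    equals the namespace of the instance's root element."""
    ns = root_namespace(instance_xml)
    return [t for t in target_namespaces if t == ns]


if __name__ == "__main__":
    schemas = ["A", "B"]  # target namespaces of the loaded schemas
    instances = [
        '<e xmlns="A"/>',
        '<f xmlns="B"/>',
        '<f xmlns=""/>',
        '<A:a xmlns:A="A" xmlns:B="B"><B:b/></A:a>',
    ]
    for xml in instances:
        print(xml, "->", applicable_schemas(xml, schemas))
```

Under this rule each instance is checked against at most the schema for its 
own root namespace, the no-namespace instance is checked against nothing, and 
the mixed A/B instance is handed to schema A alone, which is then expected to 
include or import the declarations for B.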


-- 
Orbeon Forms - Web Forms for the Enterprise Done the Right Way
http://www.orbeon.com/

Received on Thursday, 18 October 2007 19:02:09 UTC