Re: Major problem with schema needs immediate attention.

Hi Erik,

Michael Sperberg-McQueen gave a talk at XML 2005 in which he did indeed 
describe the appropriateness of deciding to use different schema on the 
same data for different purposes. It is absolutely under the control of 
the application to decide whether and which schema to apply to *any* XML, 
and it has nothing at all to do with having a "pseudo-lax" schema method. 
What I believe we are talking about is how the application called XForms 
should decide whether and which schema to apply to pieces of XML that it 
manages.  Once the schema are selected, the validation is strict.

For my own part, I presented the method we currently use to work around 
the problem using a "common sense works pretty well but not perfectly" 
approach that does not happen to require extension attributes unsupported 
by the XForms schema. 
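
A minimal sketch of that workaround, as described in the quoted message 
below: a schema is treated as applicable to an instance only when its 
target namespace matches the namespace of the instance's root element. 
The function names and the use of Python's xml.etree.ElementTree are 
assumptions for illustration only, not how any actual XForms processor 
implements the rule.

```python
import xml.etree.ElementTree as ET


def target_namespace(schema_xml: str) -> str:
    """Return the targetNamespace of an XSD document ('' if absent)."""
    return ET.fromstring(schema_xml).get("targetNamespace", "")


def root_namespace(instance_xml: str) -> str:
    """Return the namespace of the instance's root element ('' if none)."""
    tag = ET.fromstring(instance_xml).tag
    # ElementTree spells qualified names as "{namespace}local".
    if tag.startswith("{"):
        return tag[1:tag.index("}")]
    return ""


def applicable_schemas(instance_xml: str, schemas: dict) -> list:
    """Names of schemas whose target namespace matches the root element."""
    ns = root_namespace(instance_xml)
    return [name for name, xml in schemas.items()
            if target_namespace(xml) == ns]


# Hypothetical schemas mirroring the two-namespace example quoted below.
XS = 'xmlns:xs="http://www.w3.org/2001/XMLSchema"'
schemas = {
    "A.xsd": f'<xs:schema {XS} targetNamespace="A"/>',
    "B.xsd": f'<xs:schema {XS} targetNamespace="B"/>',
}

for instance in ('<e xmlns="A"/>', '<f xmlns="B"/>', '<f xmlns=""/>'):
    print(instance, applicable_schemas(instance, schemas))
```

Under this rule the no-namespace instance gets an empty list, so it is 
simply not validated rather than being reported invalid.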

You've proposed that perhaps we should use attributes like those in XSLT 
2.0.  Based on what I've seen in your last call comment and in a brief 
look at Section 19 of XSLT2, they don't look appropriate for XForms 1.1. 
The XSLT2 attributes seem to solve a different problem.  Here are more 
specific issues:

1) One problem is that XML Schema 1.0 does not mandate that 
implementations provide an execution mode other than strict, and I know of 
at least one mainstream schema engine that could not support the XSLT 2.0 
attributes as applied to XForms, which may not even have the same 
semantics as in XSLT2 in any case. 

2) The validate attribute seems to address the wrong problem: the modality 
of the schema engine, as opposed to the applicability of schema to the 
instance data.  I have long felt that XForms 1.0 has the design flaw that 
the schema attribute is attached to the model and not to the instance 
element.  I think this may be left over from the good old days when a 
model only had one instance.  If we did go with a solution for XForms 1.1 
that added markup, I would rather see this:

<instance id="X" schema="X.xsd Y.xsd #inline-schema"> ...

If you don't include a schema attribute on an instance, then I think no 
schema should be applicable to it.  The schema attribute on instance 
would make the selection of schema for the instance direct and explicit, 
and it would make processing most efficient. 
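
As a rough illustration of how a processor might read such an attribute, 
here is a small Python sketch.  It assumes the value is a 
whitespace-separated list in which tokens starting with "#" point at 
inline schemas and all other tokens are external schema locations.  The 
function name schema_refs is hypothetical, and the markup omits the 
XForms namespace for brevity.

```python
import xml.etree.ElementTree as ET


def schema_refs(instance_element):
    """Split a proposed per-instance schema attribute into two lists.

    Returns (external_locations, inline_ids).  An instance without the
    attribute yields two empty lists, i.e. no schema applies to it.
    """
    external, inline = [], []
    for token in instance_element.get("schema", "").split():
        if token.startswith("#"):
            inline.append(token[1:])   # same-document reference
        else:
            external.append(token)     # external schema location
    return external, inline


# Simplified, un-namespaced model markup (an assumption for this sketch).
model = ET.fromstring(
    '<model>'
    '<instance id="X" schema="X.xsd Y.xsd #inline-schema"/>'
    '<instance id="Z"/>'
    '</model>'
)

for inst in model.findall("instance"):
    print(inst.get("id"), schema_refs(inst))
```

Instance X selects two external schemas and one inline schema, while 
instance Z selects none and would be skipped by validation entirely.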

3) I also didn't like the validate attribute because I don't feel that 
adding it to XForms is really "futureproof".  We have long felt that we 
need to make the schema engine a pluggable component of XForms.  The 
validate attribute and its values are very XML Schema centric, i.e. they 
configure the processing model of the XML schema engine, so the attribute 
would be useless and possibly even confusing when another schema engine 
is being used.  By comparison, a schema attribute on instance is just a 
schema selector to indicate which schema are applicable to the instance, 
so it is schema engine neutral.

In conclusion, if we could settle on a method a little more like what I 
first proposed, it might help us to provide guidance for XForms 1.0 
processors today, not just XForms 1.1.  But if a new attribute is 
required, it seems to be schema and not validate that is needed.

Best regards,
John M. Boyer, Ph.D.
STSM: Lotus Forms Architect and Researcher
Chair, W3C Forms Working Group
Workplace, Portal and Collaboration Software
IBM Victoria Software Lab
E-Mail: boyerj@ca.ibm.com 

Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer





Erik Bruchez <ebruchez@orbeon.com> 
Sent by: public-forms-request@w3.org
10/18/2007 12:01 PM
Please respond to
ebruchez@orbeon.com


To
"Forms WG (new)" <public-forms@w3.org>
cc

Subject
Re: Major problem with schema needs immediate attention.







John,

Your use cases are exactly why in our implementation we have added 
extension attributes to define how a particular instance must be 
validated. We simply could not find it acceptable that most instances in 
the model would be invalid. And as you point out, there is also a 
negative performance impact to start validating instances that do not 
require validation at all.

Clearly, in my opinion, things are very broken at the moment in this 
respect.

However I am not sure we should start specifying this in a way that is 
not compatible with what XSLT 2.0 does. What you are proposing below is 
a sort of pseudo-lax validation, and I may be wrong but I don't think 
that this is something that has been done by others.

The XSLT guys did all the thinking already, and I think we should 
leverage that thinking correctly or defer the issue.

-Erik

John Boyer wrote:
> 
> We have only three last call comments for which no resolution has been 
> made.
> One of them is a bit hard, and so we really need you guys to have a 
> close look at the issues in the next few days.
> 
> Please see issue 87 
> 
> http://htmlwg.mn.aptest.com/cgi-bin/xforms-issues/Model?id=87;user=guest;statetype=1;upostype=-1;changetype=-1;restype=-1 
> 
> I added some additional comments.
> 
> The problem happens because we say that we validate nodes against all 
> "applicable" schema declarations, but we do not rigorously define what 
> "applicable" means.  Many of us think we have an idea, but the idea I 
> have heard expressed does not actually work out very well in the context 
> of the XML schema algorithm itself, over which we have no control.  You 
> can get all the "applicable" schema declarations, to be sure, but the 
> only way to get them is to run validations that would invalidate most 
> forms that attempt to use schema.
> 
> The usual claim we make is that "all schema are available to every 
> instance" because any instance could use elements from the namespace of 
> any schema. But unfortunately, XML schema defines the fact that "strict" 
> is the default mode, and I see nothing that overrides that unless the 
> schema itself declares a lax or skip mode for a particular element OR 
> schema engines *may* lax validate an element's content after strict 
> validation has failed due to not finding an appropriate declaration for 
> the element.
> 
> So here is a trivial example:
> 
> schema targetnamespace="A"
> schema targetnamespace="B"
> 
> instance <e xmlns="A"/>
> instance <f xmlns="B"/>
> 
> If we claim that all schemas apply to all instances, then because schema 
> validation is strict, you will find that instance A is invalid by schema 
> B and instance B is invalid by schema A. 
> Hence, you will not be able to submit any data simply because you 
> decided to use two instances in the same form!
> In both cases, the errors occur because element declarations cannot be 
> found.
> Yet, this type of error cannot be ignored because we do expect that if 
> element e must have children A and B only, then you will get an error if 
> you put in some element C for which the schema has no declaration.
> 
> We hit an even simpler case in the field quite some time ago involving 
> just one schema!
> schema targetnamespace="A"
> instance <e xmlns="A"/>
> instance <f xmlns=""/>
> 
> The second instance is always invalid even though there are really no 
> schema that one might reasonably conclude are "applicable".
> 
> So, cards on the table time.
> 
> We handle this issue by defining "applicable" as follows:  the namespace 
> of the root element of the instance must match the target namespace of 
> the schema.  By only applying that one schema, the processor is faster, 
> but it also doesn't step on the above landmines.  We took the view that 
> an instance of the form <A:a><B:b/></A:a> would correspond to a schema 
> for A that *included* the schema for B.  I think the screw case there is 
> the soap envelope for a web service, but I can't remember for sure and 
> it's 2:30am right now, so I'll have to get back to you later on that.
> 
> Meanwhile, maybe the above solution is satisfactory or maybe it isn't. 
>  Please take the time to weigh in on this issue in the next day or two 
> so we can have the right kind of discussion on the list that will allow 
> us to close this on the next telecon.
> 
> Thank you,
> John M. Boyer, Ph.D.
> STSM: Lotus Forms Architect and Researcher
> Chair, W3C Forms Working Group
> Workplace, Portal and Collaboration Software
> IBM Victoria Software Lab
> E-Mail: boyerj@ca.ibm.com 
> 
> Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
> 


-- 
Orbeon Forms - Web Forms for the Enterprise Done the Right Way
http://www.orbeon.com/

Received on Thursday, 18 October 2007 20:56:47 UTC