Re: Major problem with schema needs immediate attention. from Erik Bruchez on 2007-10-19 (public-forms@w3.org from October 2007)

From: Erik Bruchez <ebruchez@orbeon.com>
Date: Fri, 19 Oct 2007 16:24:52 -0700
To: "Forms WG (new)" <public-forms@w3.org>
Message-ID: <47193CC4.3060703@orbeon.com>
It seems that "being a type lib" is not the same as "no top-level 
xs:element", because you can have top-level attributes defined 
(xs:attribute). Am I right?

In a lax validation process, these attribute definitions will be used 
for validation.
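
To make this concrete, here is a minimal sketch of such a schema. The 
namespace URI and the names "Percent" and "unit" are made up for 
illustration. It has no top-level xs:element, so it would pass for a type 
library under the proposed test, yet it still contributes a global 
attribute declaration:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/types">

  <!-- A named type, as expected in a type library -->
  <xs:simpleType name="Percent">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- A top-level attribute declaration: there is still no xs:element,
       but lax assessment can validate a matching attribute against it -->
  <xs:attribute name="unit">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="cm"/>
        <xs:enumeration value="in"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>
</xs:schema>
```

As I understand lax assessment, an instance element carrying that 
attribute (in the http://example.org/types namespace) with a value such 
as "furlong" would be flagged, even though no element declaration exists 
for the element itself.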

-Erik

Klotz, Leigh wrote:
> Just cutting to the chase here:
> > Are you proposing that XForms 1.1 should inspect each schema to 
> > determine if it is a type lib (e.g. no top-level xs:element), and use 
> > something other than strict validation in that case?
> Yes. That appears to be what existing implementations do and if our goal 
> with 1.1 is to fix these inconsistencies then we can do it with what I 
> believe are the two rules that are being used:
>  
> 1. If there are no elements in the Schema for a namespace, then it 
> cannot be used for structural validation and *should not try*.
> 2. If there are elements defined in the schema, then it is presumed to 
> define them all (the one schema file per namespace rule).
> These two rules seem sufficient to me and are allowed by case two of 
> http://www.w3.org/TR/xmlschema-1/#validation_outcome and correspond to 
> the fine-grained schema checking that I believe we are talking about.
> The question of what to do when you have a Schema for a namespace and 
> encounter elements in that namespace that are not in the schema is the 
> strict/lax one, and the Schema people say it's up to the application to 
> say this as well, but since we have the one-schema-per-namespace rule, 
> we can defensibly say that we do not allow lax processing so that should 
> be invalid.
> I suppose a reasonable person might say that we allow schemas for the "" 
> namespace so we should allow lax processing there, and if you want that 
> I won't disagree:
> 2.5 If you encounter an element in the "" namespace and the element is 
> not mentioned in any schema for that namespace, you cannot prove it invalid.
>  
> Test cases (note these are not additional rules but tests of the above):
> If you encounter an element in a namespace, if you have a schema with 
> that target namespace and it defines elements, then you can validate 
> that element.
> If you encounter an element in a namespace, if you have a schema with 
> that target namespace and it does not define elements, then you cannot 
> validate that element (except via bind/@type or xsi:type, but that's a 
> separate issue).
> If you have no schema for that namespace, then you cannot validate it 
> (though you may be able to validate its insides recursively, or do 
> bind/@type validation, but that's a separate matter), so it's not invalid.
> If the root element XYZ in the instance is in a namespace and you have a 
> schema for that namespace, then either you can validate every child by 
> that schema, or it's invalid.
> If you had wanted lax processing of child elements of an element (even 
> if said element is our root XYZ) you should have specified such in your 
> schema, which you wrote.
> If you had wanted elements in some other namespace to appear as child 
> elements of said element, even if you went to the trouble to mention a 
> schema for that other namespace in the model/@schema attribute, you 
> still won't get validity because the schema defining the XYZ element 
> doesn't allow it.
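
The test case above about wanting lax processing of child elements can be 
sketched with an XML Schema wildcard. This is only an illustration: the 
namespace URI and the "title" child are assumptions, and XYZ is the 
placeholder root name used in the message.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/xyz"
           elementFormDefault="qualified">

  <xs:element name="XYZ">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <!-- Any number of children from other namespaces; each one is
             validated only if a declaration for it is available (lax) -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Without the xs:any particle, a child element from another namespace makes 
XYZ invalid no matter which other schemas are listed in model/@schema.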
>  
> If our goal is to give more control to authors about strict vs lax then 
> we need to re-vamp things for 1.2, though I hope we do it in a way that 
> deals with multiple Schema languages.
>  
> Leigh.
> ------------------------------------------------------------------------
> *From:* John Boyer [mailto:boyerj@ca.ibm.com]
> *Sent:* Friday, October 19, 2007 2:29 PM
> *To:* Klotz, Leigh
> *Cc:* ebruchez@orbeon.com; Forms WG (new); public-forms-request@w3.org
> *Subject:* RE: Major problem with schema needs immediate attention.
> 
> 
> I agree with Erik's last call comment and his most immediate response 
> below saying that the spec has a problem because it does not say enough 
> about how to apply the schemas (which I have mutated slightly into the 
> conclusion that it does not say which schemas to apply to an instance 
> nor how to apply them).
> 
> I agree with Leigh (and Noah) that strict vs. lax is more fine-grained 
> than instance level.  In fact, I pointed this out in the prior email by 
> indicating that the XSLT 2 "validation" attribute would be better as a 
> MIP because this would map more closely to the design of XSLT 2, which 
> allows validation to appear on element and attribute declarations, not 
> just on result tree declarations.
> 
> I agree with Leigh that it would be better for XForms 1.1 to say 
> something more about the method without adding more "knobs and dials". 
>  This is why my first email on this pointed out a method we have been 
> using to select schema.  It is not an ideal method but could be refined 
> to our needs.
> 
> It remains the case that the method I mentioned *does not do* what Erik 
> describes. Or, at least, I don't see how it does :-)  It is not a method 
> for "one-level lax validation".  It is a method for *selecting* the 
> schema that will be applied.  Those that are applied are strict.  So the 
> choice of lax or strict is a separate issue.  XSLT 2 "validation" 
> describes how to apply *all* of the schema.  
> 
> However, let's go back to what Leigh just said:
> 
> "Each xf:model/@schema contributes either a type library or a schema 
> definition that applies only to nodes in its targetNamespace."
> 
> Leigh, the problem we are having is that the above statement is 
> insufficient.  Could you tell me *how* each schema definition is applied 
> to nodes in its target namespace?
> 
> The easiest example is that if you have a "type library" schema for a 
> target namespace, then an instance in that namespace will fail a strict 
> validation, preventing submission.  It would be necessary to use lax 
> validation for the type lib.  On the other hand, if the schema contains 
> a structure definition for the target namespace, then lax validation 
> will fail to enforce some of the expected structure rules.
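
The type library case described above might look like the following 
XForms fragment. It is a sketch under assumptions: types.xsd, the 
http://example.org/order namespace, and the Quantity type are all 
hypothetical.

```xml
<xf:model xmlns:xf="http://www.w3.org/2002/xforms"
          schema="types.xsd">
  <!-- Assumed: types.xsd has targetNamespace="http://example.org/order"
       and declares only named types, with no top-level xs:element -->
  <xf:instance>
    <order xmlns="http://example.org/order">
      <quantity>3</quantity>
    </order>
  </xf:instance>
  <!-- The type library is reached only through a type MIP (or xsi:type) -->
  <xf:bind xmlns:o="http://example.org/order"
           nodeset="o:quantity" type="o:Quantity"/>
</xf:model>
```

Strict validation of the order element against types.xsd finds no element 
declaration and fails, which is the submission-blocking outcome described 
above. Lax validation lets it through while still honoring the bind.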
> 
> Are you proposing that XForms 1.1 should inspect each schema to 
> determine if it is a type lib (e.g. no top-level xs:element), and use 
> something other than strict validation in that case?  
> 
> Cheers,
> John M. Boyer, Ph.D.
> STSM: Lotus Forms Architect and Researcher
> Chair, W3C Forms Working Group
> Workplace, Portal and Collaboration Software
> IBM Victoria Software Lab
> E-Mail: boyerj@ca.ibm.com  
> 
> Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
> 
> 
> 
> 
> *"Klotz, Leigh" <Leigh.Klotz@xerox.com>*
> Sent by: public-forms-request@w3.org
> 
> 10/19/2007 01:50 PM
> 
> To: <ebruchez@orbeon.com>, "Forms WG (new)" <public-forms@w3.org>
> Subject: RE: Major problem with schema needs immediate attention.
> 
> I have no quibbles with trying to do a better job in XForms 1.2, and in 
> further alignment with XSLT 2.0 and XPath 2.0.
> Such changes can only help us, but they are as you say, most assuredly 
> not for XForms 1.1.
> 
> But it's not clear to me that this is a major problem that needs 
> immediate attention, at least not in terms of making last-minute changes 
> to add new knobs to twiddle on xf:model and xf:instance.
> 
> If anything is needed, it's verbiage about how the processor decides to 
> use lax validation.
> 
> 1. Each xf:model/@schema contributes either a type library or a schema 
> definition that applies only to nodes in its targetNamespace.
> 2. Verbiage, if needed, could take the form of a template for an 
> <xsd:schema> that expresses the above, given a list of Schemas to import 
> or include.
> 
> Some XForms processors do just that.  Others (I believe the Novell 
> engine was reported to do this at the SAP F2F in Palo Alto) combine the 
> Schemas in an additive form not expressible using xsd:import or xsd:include.
> 
> And again, lax or strict is not necessarily a decision to be made on the 
> document or instance level; as you saw from Noah Mendelsohn's 
> discussion, it's possible to do this on a more fine-grained basis.  
> 
> Leigh.
> 
> 
> 
> -----Original Message-----
> From: public-forms-request@w3.org [mailto:public-forms-request@w3.org] 
> On Behalf Of Erik Bruchez
> Sent: Friday, October 19, 2007 1:23 PM
> To: Forms WG (new)
> Subject: Re: Major problem with schema needs immediate attention.
> 
> 
> Leigh,
> 
> I think (John please correct me if I am wrong) that John and I both
> recognize that XForms is in charge of deciding how to apply the
> schema.
> 
> The issue is that the current spec doesn't say how to apply the
> schema, and I think that we must say how, otherwise different
> implementations will validate differently, right?
> 
> Picking a default (i.e. strict, or John's "one-level-lax") resolves
> the problem in a way, but not in a very satisfactory way IMO, because
> there are use cases for "strict", use cases for "lax", and use cases
> for "just skip" - three concepts already present in the Schema 1.0
> specification.
> 
> This is why I find the XSLT 2.0 solution compelling: it does say, and
> in great detail, "how" to validate, and it allows the stylesheet
> author to better control how schema definitions are applied.
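
For readers who have not looked at the XSLT 2.0 mechanism mentioned above, 
here is a rough sketch of the validation attribute at two levels of 
granularity. It assumes a schema-aware XSLT 2.0 processor; order.xsd, 
out.xml, and the order element are made-up names.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed: order.xsd has no target namespace and declares "order" -->
  <xsl:import-schema schema-location="order.xsd"/>

  <xsl:template match="/">
    <!-- Whole result document: validate what has a declaration,
         leave the rest alone -->
    <xsl:result-document href="out.xml" validation="lax">
      <xsl:apply-templates/>
    </xsl:result-document>
  </xsl:template>

  <xsl:template match="order">
    <!-- One constructed element: a matching declaration is required -->
    <xsl:element name="order" validation="strict">
      <!-- Copied children drop any existing type annotations -->
      <xsl:copy-of select="*" validation="strip"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

The permitted values are strict, lax, preserve, and strip, which is how 
the stylesheet author chooses between the behaviors discussed above.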
> 
> -Erik
> 
> Klotz, Leigh wrote:
>  > I don't see the need for any changes.  The XML schema processor doesn't
>  > say what interfaces must be provided by the XML schema validation
>  > software.
>  > This issue is assuming that the policy is strict validation and that the
>  > presence of even a single type library with no element declarations
>  > would invalidate all instances.
>  > That's simply not the case; the application itself (in this case,
>  > XForms) is in charge of deciding how to apply XML Schema validation.
>  > Granted, if you use Xerces in Java and just say "go" it will try to
>  > validate everything, but others, (I believe Saxon-SA) won't, and offer
>  > more advanced interfaces.
>  > But any issues I see here are about using OTS open source software with
>  > insufficient interfaces to implement what's clearly allowed by the XML
>  > Schema standard.
>  >
>  > Take a look at what Noah has to say on this issue:
>  >
>  >  From http://www.schemavalid.com/faq/xml-schema.html#d4
>  >
>  >
>  >     Can schemas validate parts of an instance document?
>  >
>  > Yes. XSV <http://www.w3.org/2000/09/webdata/xsv>, for
>  > example, will use "strict" mode if every element from the root down is
>  > schema-validatable, but "lax" mode if the root node - or any other
>  > element which is allowed to appear in some context - cannot itself be
>  > schema-validated.
>  >
>  > [Noah Mendelsohn] From xmlschema-1
>  > <http://www.w3.org/TR/xmlschema-1/#validation_outcome>: "With a schema
>  > which satisfies the conditions expressed in Errors in Schema
>  > Construction and Structure (§7.1) above, the schema-validity of an
>  > element information item can be assessed.". It then goes on to say
>  > exactly how and against which declarations. Note that it says you can
>  > validate an "element", not necessarily the root element of a document.
>  >
>  > Net answer to your question: conforming processors can be written to
>  > validate any element you like. Not all processors need provide this
>  > service: buy or use processors that validate the information you need
>  > validated. By the way, the detailed rules give the processor a choice of
>  > validating the element against some particular identified element
>  > declaration, some particular identified complex type, or to use the
>  > mechanisms of strict, lax etc. to determine what to validate based on
>  > what declarations happen to be available. All of this is explained at
>  > xmlschema-1 <http://www.w3.org/TR/xmlschema-1/#validation_outcome>.
>  >
>  >
>  >
>  > ------------------------------------------------------------------------
>  > *From:* public-forms-request@w3.org [mailto:public-forms-request@w3.org]
>  > *On Behalf Of *John Boyer
>  > *Sent:* Thursday, October 18, 2007 5:51 PM
>  > *To:* ebruchez@orbeon.com
>  > *Cc:* Forms WG (new); public-forms-request@w3.org
>  > *Subject:* Re: Major problem with schema needs immediate attention.
>  >
>  >
>  > Hi Erik,
>  >
>  > Regarding validate vs. schema attribute on instance, are you saying that
>  > if you have
>  >
>  > A.xsd: schema targetnamespace="A"
>  > B.xsd: schema targetnamespace="B"
>  >
>  > <instance validation="lax">  <e xmlns="A"/>
>  >
>  > Then both schema A and B will still apply to the instance but both will
>  > be applied with lax validation?
>  >
>  > This compared to
>  >
>  > <instance schema="A.xsd"> <e xmlns="A"/>
>  >
>  > To me, the latter is more compelling.  It directly says what schema are
>  > applicable, not how to apply the schema.  It does get even more
>  > compelling the more instances (and hence schema) become involved.
>  >
>  > Frankly, I do actually think there is also a use for the validation
>  > attribute you advocate, which is interesting because it is another
>  > datapoint to suggest that the two are separate things.  Still a third
>  > point would be that XSLT's validation attribute is designed much more
>  > like our type MIP.  It is actually applicable anywhere in the result
>  > tree, so putting it on instance may be the *wrong* choice.  I could
>  > easily see schema on instance and validation as a MIP.
>  >
>  > Anyway, regarding 'not invented here' syndrome, I'd have to say the
>  > Forms team has done a pretty good job of demonstrating that we don't
>  > have the problem.  Proof points would be XPath and XML Schema.  Even
>  > though both create a few rough edges for us, they solved many many more
>  > problems than they created.  I think the same will be true of things
>  > like XSLT2's validation attribute (only that's on a much smaller scale).
>  > If we find we need it, it'll get pulled in, but if I had to guess then,
>  > as I said above, it would probably be as a MIP.
>  >
>  > That still leaves us with selecting schemas to apply.  To that end, I
>  > would say that we should not be so worried about 'not invented here'
>  > syndrome that we refuse to adopt new ideas *because* another group
>  > didn't think of it first, even when faced with a problem in the same
>  > domain.
>  >
>  > John M. Boyer, Ph.D.
>  > STSM: Lotus Forms Architect and Researcher
>  > Chair, W3C Forms Working Group
>  > Workplace, Portal and Collaboration Software
>  > IBM Victoria Software Lab
>  > E-Mail: boyerj@ca.ibm.com
>  >
>  > Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
>  >
>  >
>  >
>  >
>  > *Erik Bruchez <ebruchez@orbeon.com>*
>  > Sent by: public-forms-request@w3.org
>  >
>  > 10/18/2007 04:08 PM
>  > Please respond to
>  > ebruchez@orbeon.com
>  >
>  > To: "Forms WG (new)" <public-forms@w3.org>
>  > Subject: Re: Major problem with schema needs immediate attention.
>  >
>  > John & all,
>  >
>  > Sorry for the over-long reply below.
>  >
>  >  > Michael Sperberg-McQueen gave a talk at XML 2005 in which he did
>  >  > indeed describe the appropriateness of deciding to use different
>  >  > schema on the same data for different purposes. It is absolutely
>  >  > under the control of the application to decide whether and which
>  >  > schema to apply to *any* XML, and it has nothing at all to do with
>  >  > having a "pseudo-lax" schema method.  What I believe we are talking
>  >  > about is how the application called XForms should decide whether and
>  >  > which schema to apply to pieces of XML that it manages.  Once the
>  >  > schema are selected, the validation is strict.
>  >
>  > Agreed, XForms has 100% control over how it decides to apply schema
>  > validation.
>  >
>  > However, a design principle I believe is good consists in leveraging
>  > or reusing as much as possible from existing (good) work. This has many
>  > benefits, including less time spent devising new solutions to existing
>  > problems (the XForms WG really doesn't have the bandwidth to lose time
>  > on such things), and consistency across specifications, in this case
>  > between XSLT 2.0 and XForms.
>  >
>  > This is the exact same reason I am very much against us defining new
>  > XPath functions to solve problems that already have been solved by
>  > XPath 2.0. I really don't want to duplicate the work or, even worse,
>  > propose solutions that are not better than existing ones but that are
>  > simply different.
>  >
>  > So yes, we can devise our own way of applying schemas to instance
>  > data, but if XSLT 2.0 has done something that can work for us, then I
>  > think that we should do our utmost to adopt that. And I believe at the
>  > moment that XSLT 2.0 does solve our issues so I am pushing things in
>  > that direction.
>  >
>  > The bottom line: I would like to make sure we are not having a case of
>  > "not invented here" syndrome.
>  >
>  >  > For my own part, I presented the method we currently use to work
>  >  > around the problem using a "common sense works pretty well but not
>  >  > perfectly" approach that does not happen to require extension
>  >  > attributes unsupported by the XForms schema.
>  >
>  > The reason I used "pseudo-lax" was BTW not to be derogatory, but
>  > because the "lax" validation algorithm does what you propose, except
>  > it recurses down the XML document to test all the elements and
>  > attributes. So your solution is a sort of "one-level lax" validation
>  > mode, if you prefer ;-)
>  >
>  >  > You've proposed that perhaps we should use attributes like those in
>  >  > XSLT 2.0.  Based on what I've seen in your last call comment and in
>  >  > a brief look at Section 19 of XSLT2, it doesn't look appropriate for
>  >  > XForms 1.1.  It seems like the XSLT2 attributes solve a different
>  >  > problem.
>  >
>  > I don't think so, and that's what I have been trying to argue!
>  >
>  >  > 1) One problem is that XML Schema 1.0 does not mandate that
>  >  > implementations provide an execution mode other than strict, and I
>  >  > know at least one mainstream schema engine that could support the
>  >  > XSLT 2.0 attributes as applied to XForms, which may not even have
>  >  > the same semantic as XSLT2 in any case.
>  >
>  > Are you saying that it would be an issue to have to require tighter
>  > integration with the schema validator, like XSLT mandates?
>  >
>  > If so I can relate to that. For example, at our last f2f, I expressed
>  > surprise at the (very late) realization that we cannot simply use a
>  > stock XPath validator to implement the dependency system!
>  >
>  > However, my experience is that you can implement lax validation fairly
>  > easily with some existing validators, in our case MSV. Lax validation,
>  > if not directly supported by your validator, requires that you
>  > obtain from the validator a list of top-level types, and then the
>  > capability of validating a sub-tree according to a type when you find
>  > a matching element or attribute.
>  >
>  >  > 2) The validate attribute seems to hit the wrong problem, the
>  >  > modality of the schema engine as opposed to the applicability of
>  >  > schema to the instance data.  I have long felt that XForms 1.0 has
>  >  > the design flaw that the schema attribute is attached to the model
>  >  > and not to the instance element.
>  >
>  >  > I think this may be left over from the good old days when a model
>  >  > only had one instance.  If we did go with a solution for XForms 1.1
>  >  > that added markup, I would rather see this:
>  >  >
>  >  > <instance id="X" schema="X.xsd Y.xsd #inline-schema"> ...
>  >
>  >  > If you don't include a schema attribute on an instance, then I
>  >  > think no schema should be applicable to it.
>  >  > The schema attribute on instance would make the selection of
>  >  > schema for the instance direct and explicit, and it makes
>  >  > processing most efficient.
>  >
>  > I agree partly with this.
>  >
>  > However it is interesting to note that XSLT 2.0 actually does things
>  > very much the XForms way by allowing you to import a number of schemas
>  > into a stylesheet! Then XSLT defines with attributes how those schemas
>  > apply to the resulting XML documents. We have a very similar situation
>  > in XForms, except our resulting documents are instances. Really, the
>  > parallel is striking to me!
>  >
>  > Given what XSLT 2.0 has done, I now think that importing schemas at
>  > the top-level in an XForms model is perfectly acceptable, as long as
>  > we add to instances attributes similar to what was done in XSLT.
>  >
>  > Also note that XSLT allows you to also specify a @type attribute, if
>  > you really want to mandate a particular type for a document. This
>  > would do the job of selecting which exact schema definition, from the
>  > list of imported schemas, must apply to the root element of the
>  > instance.
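
A sketch of the XSLT 2.0 pattern described above: schemas are imported at 
the top level, and a type attribute pins the result document to one 
definition. It assumes a schema-aware processor; order.xsd, the namespace 
URI, and OrderType are hypothetical names.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:o="http://example.org/order">

  <!-- Schemas are imported once, at the top level of the stylesheet -->
  <xsl:import-schema namespace="http://example.org/order"
                     schema-location="order.xsd"/>

  <xsl:template match="/">
    <!-- type selects one definition from the imported schemas; the
         document's single element child is validated against it.
         type and validation cannot both appear on one instruction. -->
    <xsl:result-document href="out.xml" type="o:OrderType">
      <xsl:copy-of select="*"/>
    </xsl:result-document>
  </xsl:template>
</xsl:stylesheet>
```

The XForms parallel would be schemas listed on the model, with a 
per-instance attribute choosing how (or against which definition) each 
instance is validated. No such instance attribute exists in XForms 1.1; 
that is the proposal under discussion.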
>  >
>  > In addition, just adding a schema or list of schemas on an instance
>  > does seem less powerful than what XSLT 2.0 allows you to do.
>  >
>  >  > 3) I also didn't like the validation attribute because I feel that
>  >  > adding it to XForms is not really "futureproof".  We have long felt
>  >  > that we need to make the schema engine a pluggable component of
>  >  > XForms.  The validation attribute and its values are very XML Schema
>  >  > centric, i.e.  they configure the processing model of the XML schema
>  >  > engine, so the attribute would be useless and possibly even confusing
>  >  > when another schema engine is being used.  By comparison, a schema
>  >  > attribute on instance is just a schema selector to indicate which
>  >  > schema are applicable to the instance, so it is schema engine
>  >  > neutral.
>  >
>  > This is a good point.
>  >
>  > I agree we need to make sure we can be as schema-neutral as possible,
>  > and at least not close doors. This may unfortunately be, as Leigh
>  > suggested during the last call, something we must work on after 1.1
>  > though.
>  >
>  > We now have schemas imported on the xforms:model element. We can't
>  > really get rid of this feature easily I think. And we have xsi:type
>  > processing taking place.
>  >
>  > Still, I can see how such attributes could have a defined meaning only
>  > with certain schema languages. But even with Relax NG, a @type
>  > attribute can have meaning.
>  >
>  >  > In conclusion, if we could settle on a method a little more like
>  >  > what I first proposed, it might help us to provide guidance for
>  >  > XForms 1.0 processors today, not just XForms 1.1.
>  >
>  > I get this point. I don't dislike your solution, and it is simple, but
>  > I really dislike the fact that it doesn't seem to match something done
>  > in other specs, specifically XSLT 2.0. It is also not just a subset of
>  > what XSLT 2.0 does, i.e. if you then decide that lax validation is
>  > what you want by default (as we do now in Orbeon Forms), then the
>  > outcome may be different from just checking the root element.
>  >
>  >  > But if a new attribute is required, it seems to be schema and not
>  >  > validate that is needed.
>  >
>  > Given my blah-blah above, at the moment I don't agree with this last
>  > statement.
>  >
>  > -Erik
>  >
>  > --
>  > Orbeon Forms - Web Forms for the Enterprise Done the Right Way
>  > http://www.orbeon.com/
>  >
>  >
>  >


-- 
Orbeon Forms - Web Forms for the Enterprise Done the Right Way
http://www.orbeon.com/
Received on Friday, 19 October 2007 23:25:27 UTC