Strict and lax schema validation

All,

XForms 1.1 mentions in 4.3.5:

    "the node satisfies all applicable XML schema definitions
    (including those associated by [...] an external or an inline
    schema [...])"

I would like clarification on what "applicable" means for schema
definitions when no type is assigned with xforms:bind or xsi:type. If
my analysis below is correct, then you will see that things are
currently not clear.

Assume a schema defining types in the:

   xmlns:foo="http://example.org/schema/foo"

namespace (that schema does not import other schemas). I import the
schema into a model with three instances:

   <xforms:model schema="foo.xsd">

     <xforms:instance id="instance-1">
       <foo:form>
         ...
       </foo:form>
     </xforms:instance>

     <xforms:instance id="instance-2">
       <bar:form>
         ...
       </bar:form>
     </xforms:instance>

     <xforms:instance id="instance-3">
       <foo:undefined>
         ...
       </foo:undefined>
     </xforms:instance>

   </xforms:model>

Assume foo.xsd defines only a complex type for foo:form, but no type
for foo:undefined, and no type for bar:form (bar maps to a different
namespace, and there is no schema for "bar").
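
For concreteness, a minimal foo.xsd matching these assumptions might
look like the sketch below. It reads "a complex type for foo:form" as
a top-level element declaration with an anonymous complex type. The
child element foo:name is purely hypothetical, since the content of
the form is not given here.

   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
              xmlns:foo="http://example.org/schema/foo"
              targetNamespace="http://example.org/schema/foo"
              elementFormDefault="qualified">

     <!-- The only top-level declaration: foo:form -->
     <xs:element name="form">
       <xs:complexType>
         <xs:sequence>
           <!-- Hypothetical content, for illustration only -->
           <xs:element name="name" type="xs:string"/>
         </xs:sequence>
       </xs:complexType>
     </xs:element>

     <!-- No declaration for foo:undefined, and no xs:import -->

   </xs:schema>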

It seems to me that foo:form for sure has an "applicable" schema
definition. So no problem here.

Now what's the algorithm for bar:form? It certainly doesn't have any
schema definition: there is no schema for namespace "bar". So I guess
this means there is no "applicable" definition for bar:form. However,
a processor could still either stop there or recurse into the subtree
and validate further.

Now what about foo:undefined? There is a schema for namespace "foo",
but no definition in it for foo:undefined. Does this mean that there
is an applicable definition or not? Here too, a processor could
either stop there or recurse into the subtree and validate further.

I shouldn't even have to discuss this in these terms, because XML
Schema already provides options for this; see "3.10.1 The Wildcard
Schema Component" in [1].

There are three ways to process a subtree in XML Schema:

   *strict*

     There must be a top-level declaration for the item available, or
     the item must have an xsi:type, and the item must be ·valid· as
     appropriate.

   *skip*

     No constraints at all: the item must simply be well-formed XML.

   *lax*

     If the item has a uniquely determined declaration available, it
     must be ·valid· with respect to that definition, that is,
     ·validate· if you can, don't worry if you can't.
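
In a schema document, these modes are selected with the
processContents attribute of a wildcard (the default is "strict").
As an illustration only, with an element name "container" that is
made up for this example:

   <xs:element name="container"
               xmlns:xs="http://www.w3.org/2001/XMLSchema">
     <xs:complexType>
       <xs:sequence>
         <!-- Any element from any namespace: validate it if a
              declaration can be found, otherwise accept it as is -->
         <xs:any namespace="##any" processContents="lax"
                 minOccurs="0" maxOccurs="unbounded"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>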

So it seems to me that here the question is whether we process
instances with "strict" or "lax" processing (and possibly "skip" as
well).

XSLT 2.0 solves the problem by providing a "validation" attribute [2]
which can take the values "strict" and "lax", plus two others
("preserve" and "strip") which are probably not relevant here.

XSLT 2.0 also provides a "type" attribute, exclusive with
"validation". In XForms, we have a similar situation: we can
explicitly bind a type to the root element of an instance with
xforms:bind or with xsi:type. If we do, everything is fine. If we
don't, THEN we want to specify whether we want strict or lax
validation.
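
To illustrate the XSLT 2.0 side, a schema-aware stylesheet could
request lax validation roughly as follows. This is only a sketch,
reusing the foo.xsd of the example above:

   <xsl:stylesheet version="2.0"
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

     <!-- Make the foo schema available to the processor -->
     <xsl:import-schema namespace="http://example.org/schema/foo"
                        schema-location="foo.xsd"/>

     <xsl:template match="/">
       <!-- Validate what can be validated, accept the rest -->
       <xsl:copy-of select="*" validation="lax"/>
     </xsl:template>

   </xsl:stylesheet>

On the XForms side, the two explicit ways of assigning a type could
look like this. The type name foo:form-type is hypothetical: it
assumes foo.xsd defines a named complex type, which the schema
described above does not necessarily do.

   <!-- Option 1: type model item property on the root element -->
   <xforms:bind nodeset="instance('instance-2')"
                type="foo:form-type"/>

   <!-- Option 2: xsi:type in the instance data itself -->
   <bar:form xsi:type="foo:form-type">
     ...
   </bar:form>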

It seems to me then, if my understanding is correct, that there is an
omission in XForms at this point regarding how validation is
performed. We should:

1. Explicitly specify whether instance validation is performed in
    "strict" or "lax" mode when no type is assigned to the root element
    of the instance.

2. Ideally, provide an option to specify whether "strict" or "lax"
    mode is applied. This could be done with a simple attribute on
    xforms:instance:

      <xforms:instance validation="lax">

3. Possibly, also provide the option for "skip" mode, to specifically
    exclude an instance from validation.

Comments on this are of course welcome.

-Erik

[1] http://www.w3.org/TR/xmlschema-1/
[2] http://www.w3.org/TR/xslt20/#validation

-- 
Orbeon Forms - Web Forms for the Enterprise Done the Right Way
http://www.orbeon.com/

Received on Thursday, 31 May 2007 22:56:39 UTC