Re: Major problem with schema needs immediate attention. from Erik Bruchez on 2007-11-02 (public-forms@w3.org from November 2007)

From: Erik Bruchez <ebruchez@orbeon.com>
Date: Thu, 01 Nov 2007 18:53:34 -0700
To: "Forms WG (new)" <public-forms@w3.org>
Message-ID: <472A831E.70703@orbeon.com>
John,

 > The use case for "strict when possible" is really straightforward.

One question: am I missing something, or is Leigh proposing that this
decision take place only for the *root element* of an instance? My
understanding was that it was recursive, hence my comparison with lax
processing.

 > Schemas are coming from data architects, who expect that the process
 > contents default *defined* in XML Schema 1.0 will be observed.  That
 > default is strict.  In other words, when the data architect writes a
 > schema, he expects strict unless *he* says otherwise.  He also
 > expects, despite classification as optional to implement, that lax
 > validation will occur within content once a strict validation
 > failure occurs.

I am not disputing that your use case is perfectly valid. But it seems
to me that your particular use case calls for the ability to tell that
your instance must be validated strictly, and not pseudo-lax-strict as
suggested here.

As a note, whatever we do for 1.1, you can already work around the
problem by explicitly specifying a bind, which has to be obeyed:

   <xforms:bind nodeset="/*" type="my:employee"/>

So if your instance has a typo, e.g. <my:employe>, it will be invalid,
while this would not be caught with lax processing.

 > All prior versions of the spec do not say what kind of validation is
 > performed, they simply reference XML Schema 1.0, which by my read
 > means that strict processing is what occurs in the absence of a
 > processContents declaration to the contrary in the schema.

How sure of this are we? If I say:

   ./my-xsd-validator data.xml schema.xml

with data.xml:

   <foo:baaaar>
     ...
   </foo:baaaar>

and schema.xml specifying a target namespace for foo, but no type for
foo:baaaar, what will happen? And would all validators even agree on
what to do?
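
To make the scenario concrete, here is a minimal sketch of what the two
files could look like. The namespace URI and the declared foo:bar
element are assumptions made up for illustration only:

   <!-- schema.xml: has a target namespace, but declares only foo:bar -->
   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
              targetNamespace="http://example.org/foo"
              elementFormDefault="qualified">
     <xs:element name="bar" type="xs:string"/>
   </xs:schema>

   <!-- data.xml: root is in the same namespace, with no declaration -->
   <foo:baaaar xmlns:foo="http://example.org/foo">
     ...
   </foo:baaaar>

The open question is whether a given validator reports foo:baaaar as
invalid or simply leaves it unassessed.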

 > In the wild, we have had some implementations that have dropped down
 > to lax in obvious cases, like a type lib, that have arisen over time
 > in practice.

And at least one, ours, which reached the conclusion that the default
should be lax and that an extension attribute was necessary ;-)

 > Sadly, none of these implementation experiences landed in the spec,
 > and the feature demand did not percolate up until your LC comment
 > (i.e. nobody has taken the action item to fix the problem in years
 > now).  We have to address the LC comment, which could be done by
 > deferring it or by doing something about it.  If we choose other
 > than defer, we have to be sure that no big objections are going to
 > happen, and removing the strict validation will assuredly cause that
 > to happen.  Maybe fixing it the way we have been discussing it will
 > also cause an objection, e.g.  from Orbeon.  In that case, we're
 > pretty much stuck with deferral.  Frankly, I'd rather defer and lose
 > official, interoperable support of typelibs over going from strict
 > to lax.

 > FWIW, I do like best the approach Leigh is taking because it's clear
 > there are multiple ways to implement and it seems it can be
 > overridden by XSLT 2.0 style attributes later, so we're not getting
 > boxed in now.

But it sounds to me like a hack that we are implementing just because
we don't feel we have the ability to address the real problem in 1.1.

I am not sure XForms users actually have a very strong requirement
for the validation scheme discussed, whether that scheme would
actually be desirable in the big picture of schema validation or
not. Instead, it seems to me that the requirement is rather to allow
for strictly validating certain instances, and skipping some other
instances (which we don't completely tackle with the proposed solution
here, I think, if Leigh's proposal is indeed recursive).

It seems to me that if we could tell users and implementors that they
have a choice, via an attribute, of strict, lax, and skip, this would
cover most use cases, with the benefit of not defining anything new in
addition to XML Schema or XSLT 2.0.
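
As a rough illustration only, such an attribute might look like the
following. The attribute name, its namespace, its placement on the
instance element, and the file name are all hypothetical and are not
taken from any spec:

   <xforms:instance xmlns:xforms="http://www.w3.org/2002/xforms"
                    xmlns:ext="http://example.org/ext"
                    src="employee.xml"
                    ext:validation="strict"/>

The other two values would be "lax" and "skip". In XML Schema terms,
strict requires a declaration to be found and the content to be valid,
lax validates only where a declaration is found, and skip performs no
validation at all.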

I fully understand that we may not be able to do this in XForms 1.1.

 > So the question would be whether you could live with it... :-)

I hate to do this, but how about we do a pass of information
gathering before deciding? What I mean here is:

1. Maybe our understanding of lax is incorrect. Are we 100% sure that
    lax will not treat as invalid those elements and attributes that
    are in the namespace of an in-scope schema but have no matching
    declaration? My reading of the specs suggests that our
    understanding is correct, and it seems that some tools we use do
    things this way as well, but who knows. I would still say that a
    misunderstanding here is unlikely.

2. It would be great to hear from Schema or XSLT 2.0 people:

    * why lax was defined this way

    * what people do to address use cases where you want to make sure
      that every element in a given namespace is declared

    * what they think of our current problem, i.e. whether there are
      good arguments in favor of defining a new scheme, or in favor of
      not doing so

Obvious candidates here are Mike Kay and Henry Thompson.

-Erik

-- 
Orbeon Forms - Web Forms for the Enterprise Done the Right Way
http://www.orbeon.com/
Received on Friday, 2 November 2007 01:54:04 UTC