XForms Basic and Schema Validation

Hello all,

This is my action item from the telecon: to write up what I described as
the problem with schemas in XForms Basic.


CONTEXT

In section 3, Conformance, the second bullet indicates that an XFB processor
'may' support a subset of XML Schema that only deals with simple types. A
typical use-case might be for a registration form to define the pattern for
an email address and set some minimum and maximum lengths on a password:

  <xf:model schema="#s">
    <xf:instance>
      <subscriber xmlns="">
        <name />
        <emailaddress />
        <password />
      </subscriber>
    </xf:instance>

    <xf:bind nodeset="name" required="true()" />
    <xf:bind nodeset="emailaddress" type="email" />
    <xf:bind nodeset="password" type="pass" />

    <xf:submission
     id="sub-register"
     action="servlet/RegisterUserServlet"
     method="post"
     replace="instance"
    />
  </xf:model>

  <xs:schema id="s" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:simpleType name="email">
      <xs:restriction base="xs:string">
        <xs:pattern value="..." />
      </xs:restriction>
    </xs:simpleType>

    <xs:simpleType name="pass">
      <xs:restriction base="xs:string">
        <xs:minLength value="5" />
        <xs:maxLength value="10" />
        <xs:pattern value="[A-Za-z0-9]*" />
      </xs:restriction>
    </xs:simpleType>
  </xs:schema>

As you can see in this simple example, the submission will only be carried
out if:

  * the name is present;
  * the email address matches the pattern (omitted in the
    example);
  * the password is at least 5 characters long, but no more
    than 10.

Although XML Schema mark-up is used to define the rules, this is really only
leveraging the language, since the actual processing of these rules could
easily be implemented with a simple regular expression evaluator, i.e.,
without needing to implement a full XML Schema processor. This is the
motivation for having at least *some* support for XML Schema in XForms
Basic, because even small mobile devices will be able to do basic pattern
matching.
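
To illustrate, here is a minimal Python sketch of how the facets used in the
'pass' type could be checked with nothing more than a regular-expression
engine. The names SimpleType and is_valid are made up for this illustration
and come from no XForms implementation. Python's 're' syntax also only
approximates the XML Schema regular-expression dialect, which is close
enough for a pattern like this one.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimpleType:
    """Hypothetical holder for the facets of an xs:string restriction."""
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    pattern: Optional[str] = None


def is_valid(value: str, simple_type: SimpleType) -> bool:
    """Check a string against the length and pattern facets."""
    if simple_type.min_length is not None and len(value) < simple_type.min_length:
        return False
    if simple_type.max_length is not None and len(value) > simple_type.max_length:
        return False
    # XML Schema patterns must match the whole value, hence fullmatch.
    if simple_type.pattern is not None:
        if re.fullmatch(simple_type.pattern, value) is None:
            return False
    return True


# The 'pass' type from the schema above.
PASS = SimpleType(min_length=5, max_length=10, pattern="[A-Za-z0-9]*")
```

With this, is_valid("abc12", PASS) holds, while "abc" (too short) and
"abc 12" (contains a space) are rejected.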


CONFORMANCE

If this snippet is passed to *any* XForms processor--whether Basic or
Full--it will behave in the same way, meaning that:

  * all invalid data is correctly *not* sent;
  * all valid data is correctly sent.


THE PROBLEM

If we now modify the schema to include a complex type:

  <xs:complexType name="person">
    <xs:sequence>
      <xs:element name="name" type="xs:string" />
      <xs:element name="emailaddress" type="email" />
      <xs:element name="password" type="pass" />
    </xs:sequence>
  </xs:complexType>

and change our instance data as follows:

    <xf:instance>
      <subscriber
       xmlns=""
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:type="person"
      >
        <name />
        <emailaddress />
        <password />
      </subscriber>
    </xf:instance>

We will get different behaviour on different processors:

  * an XForms Full processor will still submit only valid
    instances and not invalid ones;
  * an XForms Basic processor that supports all of XML Schema
    will also submit only valid instances, and not invalid
    ones;
  * an XForms Basic processor that supports the 'subset' of
    XML Schema will either fail to load the form, or fail to
    submit *valid* data.

This means that, at least from the standpoint of the XForms Basic
specification, you cannot guarantee how your form will behave.


UNDEFINED TYPES

The reason that an XFB processor supporting a subset of XML Schema will fail
when @xsi:type="person" is used is that the processor won't find a definition
for 'person', since all complex types have been ignored. Unfortunately, the
spec is not clear on what exactly happens here--whether a binding exception
should be thrown, or the data should just be marked as invalid (although
there won't be any way to make it valid again!).

However, one thing is certain--the error (using an undefined type) cannot be
ignored. On the telecon earlier today, people kept saying that if a
non-existent type was referenced then the processor should interpret the
node as being of type xsd:string; I don't see how this interpretation was
arrived at, and it's certainly not in the specification. This is critical
because ignoring non-existent types means converting all sorts of mark-up
errors into valid forms. For example, if I say that one of my nodes is
xsd:dec, instead of xsd:decimal, automatically converting this to xsd:string
means that nothing will tell me that there's no such type, and the form will
load, the user will type "hamster", and the instance will be regarded as
completely valid!

So, we have to flag undefined types as errors or mark the instances as
invalid.

(The confusion may be that the spec *does* say that a node with *no* type
information defaults to xsd:string, but that's a different thing
altogether.)
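
The distinction between "no type" and "undefined type" can be sketched in a
few lines of Python. This is only an illustration of the behaviour I am
arguing for, not anything taken from the specification: resolve_type,
UndefinedTypeError and the abbreviated BUILT_IN_TYPES set are all
hypothetical names, and whether the error becomes a binding exception or an
invalid node is exactly the open question.

```python
from typing import Optional, Set

# Assumed, abbreviated list of built-in types; a real processor knows more.
BUILT_IN_TYPES = {"xsd:string", "xsd:decimal", "xsd:integer", "xsd:boolean"}


class UndefinedTypeError(Exception):
    """Hypothetical error raised when a referenced type has no definition."""


def resolve_type(declared: Optional[str], schema_types: Set[str]) -> str:
    """Return the type to validate against, or raise if it is undefined."""
    if declared is None:
        # No type information at all: the default is xsd:string.
        return "xsd:string"
    if declared in BUILT_IN_TYPES or declared in schema_types:
        return declared
    # A type was named but cannot be found: never fall back silently.
    raise UndefinedTypeError(f"no definition found for type '{declared}'")
```

A processor that has kept only the simple types 'email' and 'pass' would
resolve those two normally, return xsd:string for an untyped node, and raise
for both 'xsd:dec' and 'person'.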


POSSIBLE SOLUTIONS

So, possible solutions are:

  * ignore the problem, but add a note to the specification
    that warns authors that there may be inconsistencies.
    A terrible solution...;

  * just remove the second bullet altogether. Are there any
    XForms Basic implementers that are implementing just simple
    types? If not, then why not just say that XForms Basic is
    *really* basic;

  * accept that there *are* two classes of XForms Basic and so
    introduce another one...XForms Tiny!

There are probably others, too, but those are the ones that jumped out at
me. I actually favour the last one, but to achieve it I think we would have
to move away from the idea that a processor 'may' do this and 'may' do that.
I think we could define the three levels as something along the lines of:

  * XForms Tiny only supports simple types, and we make it clear
    that complex types are *not* processed. Therefore any reference
    to a complex type will cause an error of some sort. (This error
    is needed in XForms Full as well, for reasons I described
    above). In addition, XForms Tiny supports XML Events Basic;

  * XForms Basic supports *full* schema, but still supports XML
    Events Basic;

  * XForms Full supports full schema, and XML Events *Full*.

Any thoughts on that?

In addition, I think we should add to the conformance functions so that as
well as asking whether the processor is "Full" or "Basic" you can find out
what level of schema and XML Events support it has. (This is how profiles
are done in things like SVG--you can find out about the 'components'.)
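
As a rough sketch of what such a query might look like, here are the three
proposed levels written down as data in Python, with a lookup function. The
property names 'schema-support' and 'events-support', their values, and the
query_property function are all invented for illustration; only the idea of
asking for the conformance level comes from the existing functions.

```python
# Components of each proposed level; keys and values are illustrative only.
PROFILES = {
    "tiny":  {"schema-support": "simple-types", "events-support": "basic"},
    "basic": {"schema-support": "full",         "events-support": "basic"},
    "full":  {"schema-support": "full",         "events-support": "full"},
}


def query_property(level: str, name: str) -> str:
    """Answer a property query for a processor at the given level."""
    if name == "conformance-level":
        return level
    # Hypothetical component properties, not defined by any specification.
    return PROFILES[level][name]
```

A form author could then test for the component actually needed, e.g.
query_property("tiny", "schema-support"), rather than inferring it from the
overall level.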

Regards,

Mark


Mark Birbeck
CEO
x-port.net Ltd.

e: Mark.Birbeck@x-port.net
t: +44 (0) 20 7689 9232
b: http://internet-apps.blogspot.com/
w: http://www.formsPlayer.com/

Received on Wednesday, 3 May 2006 22:05:37 UTC