Subsetting XML Schema - Technical Details

Background Briefing on XForms-Schema Integration, part 2

(4 April 2001 Version)


Abstract

This document outlines some of the specific issues and possible directions for choosing a subset of XML Schema to be used as part of XForms Basic.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document has no official W3C status.

Public comment on this document is welcome on the XForms mailing list <www-forms@w3.org>. To subscribe, send an email to <www-forms-request@w3.org> with the word subscribe in the subject line (include the word unsubscribe if you want to unsubscribe). The archive for the list is accessible online.

Members of the XML Schema and XForms Working Groups may wish to use the w3c-forms-schema mailing list (Members only).


1 Introduction

Mission statement and Scope:

For the XForms Basic Processor, DEFINE THE MINIMUM REQUIRED SUPPORT for:

  1. built-in datatypes,
  2. constraints on the XML representation of the form data
    1. data
    2. structures
  3. composition of the schema for the form data.

Principles and Goals

Assumptions

2 Subset of XML Schema Datatypes

Based on the above assumptions and principles, a minimally conforming XForms processor MUST support the following XML-Schema datatypes:

* All built-in primitive types except "float" and "double", which require IEEE floating point (see Open Issues below), and "QName" and "NOTATION", which are XML types.

* All built-in derived types that are derived from the supported built-in types, except the XML types: Name, IDREFS, ID, etc.

* All constraining facets except "pattern" (see Open Issues below).

In addition, we need a NOTE saying that the maximum size (number of significant digits) of "decimal" is platform/device dependent.
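
As an illustration of what falls inside this subset, the following sketch derives a type from the built-in "integer" (itself derived from "decimal") using only range facets. The type name "quantityType" and the bounds are assumptions chosen for this example, not part of any proposal, and like the examples later in this document the fragment is not namespace-correct.

<!-- hypothetical type; name and bounds are illustrative only -->
<xsd:simpleType name="quantityType">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="1"/>
    <xsd:maxInclusive value="999"/>
  </xsd:restriction>
</xsd:simpleType>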

Open Issues and further work (datatypes)

* It is unclear whether "pattern" (regexp) is a critical feature or not.

For small devices regexp is expensive to support (especially if it is not used very often) but not impossible. Currently, XForms has adopted the WML "format" attribute which is a simple expression. We need to investigate further whether a subset of regexp can be defined, whether we need regexp at all, or whether the WML "format" can be used with XML Schema.
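
To make the comparison concrete, here is a five-digit field constrained both ways. This is only a sketch: the names "zipType" and "zip" are invented for the illustration, and the second fragment assumes the WML 1.x mask syntax, in which each N stands for one numeric character.

<!-- XML Schema: "pattern" facet (regular expression) -->
<xsd:simpleType name="zipType">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="[0-9]{5}"/>
  </xsd:restriction>
</xsd:simpleType>

<!-- WML: "format" input mask -->
<input name="zip" format="NNNNN"/>

The mask covers fixed-shape input such as this with far less machinery, but it has no counterpart for alternation or character ranges, which is part of what needs investigating.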

* We want to investigate support for "float" and "double" without support for IEEE floating point. There seem to be no issues with including those types, as long as it can be done in a way that does not drag in binary IEEE float support.

* Support for any one of the XML types is not perceived as difficult. The debate over including them as a whole hinges on weighing the incremental burden on small devices against the benefit of supporting them; that is, whether that many more types, even if simple to implement, are worth supporting.

3 Subset of XML Schema Structures

Structural Semantics needed by XForms:

One way or another we need the following functionality. Our joint task force will identify how much of this functionality can/should be based on Schema:

+ allow form data to be placed within both elements and attributes in instance data

+ facilitate reuse of types and elements defined in other Schemas

+ support definition of simple types by list, restriction, and union

+ support "limited" definition of complex types

? don't know if we need "abstract" types

- no support for mixed content model for complexType
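
For reference, the three simple type derivation methods mentioned above look roughly as follows in XML Schema. The type names and enumeration values are hypothetical, and the fragment is a sketch rather than a proposal for what XForms should accept.

<!-- restriction: narrow a built-in type -->
<xsd:simpleType name="sizeName">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="S"/>
    <xsd:enumeration value="M"/>
    <xsd:enumeration value="L"/>
  </xsd:restriction>
</xsd:simpleType>

<!-- list: whitespace-separated sequence of items, e.g. "S M L" -->
<xsd:simpleType name="sizeNameList">
  <xsd:list itemType="sizeName"/>
</xsd:simpleType>

<!-- union: a value may be valid against any one of the member types -->
<xsd:simpleType name="sizeNameOrNumber">
  <xsd:union memberTypes="sizeName xsd:integer"/>
</xsd:simpleType>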

Options to represent this syntax in XForms:

Four alternatives:

[full-schema] Pure subset of Schema-1 and Schema-2. The subset used in XForms would be one fully conforming Schema per form.

[elem-att] Smaller subset of Schema-1 and Schema-2. The subset used in XForms would be element and attribute declarations.

[types] Subset of Schema-2 for datatypes, with existing binding expressions plus instance data definition of structure. The subset used in XForms would be one or more fully conforming Schema datatype fragments per form.

[annotate] Annotation of <instance> with XForms-specific properties. The subset used in XForms would be several attribute values corresponding to included XML Schema functionality per form.

4 Examples:

IMPORTANT: These are NOT concrete processing or syntax proposals! They serve only to identify specific subsets of XML Schema to be used within XForms.

For visual clarity, the following examples are not namespace-correct. Any solution we agree upon would include namespace conformance.

That said, the xsd: prefix below represents existing XML Schema nodes, xsi: represents schema-for-instances, no prefix represents existing XForms nodes (or instance data), and the xform: prefix represents "invented" nodes.

1st Example: ATOMIC DATATYPE

<!-- submitted instance data -->
<a>
  <b>foo</b>
</a>

Let's put a constraint on the content of <b>:

<!-- Option [full-schema] -->
  <instance>
    ... as above ..
  </instance>
  <model>
    <xsd:schema ...>
      <xsd:element name="a">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="b">
              <xsd:simpleType>
                <xsd:restriction base="..">
                  <xsd:length .../>
                </xsd:restriction>
              </xsd:simpleType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
  </model>
<!-- Option [elem-att] -->
  <instance>
    ... as above ...
  </instance>
  <model>
    <xsd:element name="a">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="b" type=".." xform:length=".."/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </model>
<!-- Option [types] -->
  <instance>
    ... as above ..
  </instance>
  <bind id=".." ref="a/b">
    <xsd:restriction base="..">
      <xsd:length .../>
    </xsd:restriction>
  </bind>
<!-- Option [annotate] -->
  <instance>
    <a>
      <b xsi:type=".." xform:length="..">foo</b>
    </a>
  </instance>
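
For readers who want to see one option with the elided values filled in, the following sketch completes option [full-schema] for the instance above. The base type "xsd:string", the length value 3 (which the content "foo" satisfies), and the namespace declaration are assumptions made for this illustration only; the examples above deliberately leave them open.

<!-- Option [full-schema], with assumed values filled in -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="b">
          <xsd:simpleType>
            <!-- assumed: content of <b> is a string of exactly 3 characters -->
            <xsd:restriction base="xsd:string">
              <xsd:length value="3"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>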

2nd Example: LINE ITEM GROUPS

A more complex (but commonly requested) example.

<!-- instance data -->
  <instance>
    <order>
      <line_item>
        <part>123</part>
        <quantity>3</quantity>
        <price cur="USD">1.23</price>
      </line_item>
    </order>
  </instance>

Let's express that we want multiple <line_item> elements with typed children:

<!-- Option [full-schema] -->
  <instance>
    ... as above ..
  </instance>
  <model>
    <xsd:schema ...>
      <xsd:element name="order">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="line_item" maxOccurs="..">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="part" type="..."/>
                  <xsd:element name="quantity" type="..."/>
                  <xsd:element name="price">
                    <xsd:complexType>
                      <xsd:simpleContent>
                        <xsd:restriction base="...">
                          <xsd:attribute name="cur" .../>
                        </xsd:restriction>
                      </xsd:simpleContent>
                    </xsd:complexType>
                  </xsd:element>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
  </model>
<!-- Option [elem-att] -->
  <instance>
    ... as above ...
  </instance>
  <model root="order">
    <xsd:element name="line_item" maxOccurs="..">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="part" type=".."/>
          <xsd:element name="quantity" type=".."/>
          <xsd:element name="price" type="..">
            <xsd:complexType>
              <xsd:attribute name="cur" type=".."/>
            </xsd:complexType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </model>
<!-- Option [types] -->
  <instance>
    ... as above ..
  </instance>
  <bind id=".." ref="order/line_item" maxOccurs="..">
    <bind id=".." ref="part">
      <xsd:restriction base=".."/>
    </bind>
    <bind id=".." ref="quantity">
      <xsd:restriction base=".."/>
    </bind>
    <bind id=".." ref="price">
      <xsd:restriction base=".."/>
    </bind>
    <bind id=".." ref="price/@cur"/>
  </bind>
<!-- Option [annotate] -->
  <instance>
    <order>
      <line_item xform:maxOccurs="..">
        <part xsi:type="..">123</part>
        <quantity xsi:type="..">3</quantity>
        <price cur="USD" xsi:type="..">1.23</price>
      </line_item>
    </order>
  </instance>
  <!-- Note that attributes (here 'cur') cannot be mapped to form controls with [annotate]. -->
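
As a point of reference, a stand-alone schema that validates the order instance above might look like the sketch below. Every concrete type, the "unbounded" occurrence limit, and the namespace declaration are assumptions made for the illustration. The sketch adds the "cur" attribute through xsd:extension, since declaring a new attribute on a simple-type base such as xsd:decimal is an extension rather than a restriction.

<!-- Option [full-schema], with assumed values filled in -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="order">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="line_item" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="part" type="xsd:string"/>
              <xsd:element name="quantity" type="xsd:positiveInteger"/>
              <xsd:element name="price">
                <xsd:complexType>
                  <xsd:simpleContent>
                    <!-- decimal content plus an optional "cur" attribute -->
                    <xsd:extension base="xsd:decimal">
                      <xsd:attribute name="cur" type="xsd:string"/>
                    </xsd:extension>
                  </xsd:simpleContent>
                </xsd:complexType>
              </xsd:element>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>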

5 Benefits/Drawbacks

[full-schema] Advantages

[full-schema] Disadvantages

[elem-att] Advantages

[elem-att] Disadvantages

[types] Advantages

[types] Disadvantages

[annotate] Advantages

[annotate] Disadvantages