Re: ignorable white space with anyType restriction

Hi Tom,

> Suppose a type t is based on a restriction of anyType (see below).
> Although anyType has mixedContent true, this is not inherited, so
> type t does not allow mixed content.
>
> Now consider the following instance document with element e of type
> t. There is a newline and several spaces between <e> and </e>. Is
> this legal because it is ignorable white space or illegal because it
> is not ignorable white space ?

Interesting question :) First, we have to work out the content type of
type t. For a complex type with complex content (like this one), XML
Schema Part 1: Structures determines the content type as follows:

  1 If the <restriction> alternative is chosen, then the appropriate
    case among the following:
  1.1 If one of the following is true
  1.1.1 There is no <group>, <all>, <choice> or <sequence> among the
        [children];
  ...
      , then empty;
  ...

So the content type of type t is 'empty'. Next, the rule for validating
an element against a complex type says:

  For an element information item to be locally ·valid· with respect
  to a complex type definition all of the following must be true:
  ...
  2.1 If the {content type} is empty, then the element information
      item has no character or element information item [children].
  ...

                    http://www.w3.org/TR/xmlschema-1/#cvc-complex-type

So the element e is only valid if it doesn't have *any* character or
element information item children. Whitespace characters in the
element content count as character information item children, so the
element e is invalid -- the whitespace is not ignorable because the
element e has been declared as being empty.
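
To make that concrete, here is a minimal sketch of the case in
question. Tom's original declaration isn't quoted above, so the markup
is an assumption based on his description (a restriction of anyType
with no content model); only the names t and e come from the question.

<xs:complexType name="t">
  <xs:complexContent>
    <xs:restriction base="xs:anyType" />
  </xs:complexContent>
</xs:complexType>

<xs:element name="e" type="t" />

With that declaration, this instance is invalid, because the newline
and spaces are character information item children of e:

<e>
    </e>

whereas both <e/> and <e></e> are valid, since e then has no children
at all.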

If type t had been declared as:

<xs:complexType name="t">
  <xs:complexContent>
    <xs:restriction base="xs:anyType">
      <xs:sequence>
        <xs:element name="f" minOccurs="0" maxOccurs="0" />
      </xs:sequence>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

on the other hand, then the content type of type t would not be empty:
it would be element-only, with a content model that is a sequence of
zero f elements. Clause 2.3 of the same constraint allows an element
with element-only content to contain white space characters, so the
element e with whitespace-only content would be valid.
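
As a rough illustration, here are some hypothetical instance fragments,
assuming e is declared with that second version of type t. This one is
valid, since its only character children are white space:

<e>
    </e>

This one is not, because non-whitespace characters aren't allowed when
mixed content is not permitted:

<e>some text</e>

And this one is not valid either, because maxOccurs="0" means that no f
element may actually appear:

<e><f/></e>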

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

Received on Thursday, 4 July 2002 06:03:28 UTC