Whitespace normalization for union types

Hi,

I noticed that the following example is processed differently by:

Xerces-J 2.6.2 - accepts it
XSV 2.8        - accepts it
MSXML 4.0      - reports an error: 
  "The element: 'foo'  has an invalid value according to its data type."

Schema:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
	<xsd:simpleType name="fooType">
		<xsd:union memberTypes="xsd:string xsd:token"/>
	</xsd:simpleType>
	<xsd:element name="foo">
		<xsd:simpleType>
			<xsd:restriction base="fooType">
				<xsd:pattern value="[a-z]"/>
			</xsd:restriction>
		</xsd:simpleType>
	</xsd:element>
</xsd:schema>

Instance:
<foo> a   </foo>

Due to [1]...
"For all datatypes ·derived· by ·union·  whiteSpace does not apply
directly; however, the normalization behavior of ·union· types is
controlled by the value of whiteSpace on that one of the ·memberTypes·
against which the ·union· is successfully validated."

...I assume that the whiteSpace value of xs:string (preserve) is used
here; thus the value " a   " should not be accepted by the pattern
"[a-z]". Can someone confirm this?
MSXML 4.0 seems to reflect this; the other processors do not.
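
To make the two possible outcomes concrete, here is a minimal Python
sketch of the whiteSpace facet applied to the instance value. The
function names normalize and matches_pattern are made up for this
illustration; it also assumes that Python's re syntax agrees with the
XSD regex syntax for this simple pattern, and that the pattern has to
match the whole value.

```python
import re


def normalize(value: str, white_space: str) -> str:
    """Apply the XML Schema whiteSpace facet to a lexical value.

    Hypothetical helper for illustration only.
    """
    if white_space not in ("preserve", "replace", "collapse"):
        raise ValueError("unknown whiteSpace value: " + white_space)
    if white_space == "preserve":
        return value
    # "replace": every tab, line feed and carriage return becomes a space.
    value = re.sub(r"[\t\n\r]", " ", value)
    if white_space == "collapse":
        # "collapse": after "replace", runs of spaces shrink to one space
        # and leading/trailing spaces are removed.
        value = re.sub(r" +", " ", value).strip(" ")
    return value


def matches_pattern(value: str, pattern: str) -> bool:
    # Assumption: the pattern must match the entire value, so fullmatch
    # is used instead of search.
    return re.fullmatch(pattern, value) is not None


raw = " a   "
as_string = normalize(raw, "preserve")  # whiteSpace of xsd:string
as_token = normalize(raw, "collapse")   # whiteSpace of xsd:token

print(repr(as_string), matches_pattern(as_string, "[a-z]"))
print(repr(as_token), matches_pattern(as_token, "[a-z]"))
```

With whiteSpace = preserve the value keeps its surrounding spaces and
the pattern does not match; with whiteSpace = collapse the value
becomes "a" and the pattern matches.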

The fact that the whiteSpace value is known only after the value has
already been validated against the member types seems to contradict
[2] Datatype Valid, which mandates that the pattern facet be applied
first; but without the whiteSpace value, normalization is not possible,
so applying the pattern facet is not possible either.
Can someone clarify this?
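
The ordering question can be illustrated with two hypothetical
readings, sketched in Python. Neither is claimed to be what the
Recommendation mandates, nor how any of the processors above is
implemented; the names MEMBERS, member_accepts, validate_first_member
and validate_any_member are invented for this sketch, and the member
check is simplified to "always valid", which is an assumption that
holds only for values like the one in the example.

```python
import re

# Member types of fooType in memberTypes order, with their whiteSpace value.
MEMBERS = [("xsd:string", "preserve"), ("xsd:token", "collapse")]


def normalize(value: str, white_space: str) -> str:
    # Simplified whiteSpace handling: only "preserve" and "collapse"
    # are needed for the two member types above.
    if white_space == "preserve":
        return value
    value = re.sub(r"[\t\n\r]", " ", value)
    return re.sub(r" +", " ", value).strip(" ")


def member_accepts(type_name: str, value: str) -> bool:
    # Assumption: both xsd:string and xsd:token accept the normalized
    # value, which is the case for " a   ".
    return True


def validate_first_member(raw: str, pattern: str) -> bool:
    """Reading A: the first member that accepts the value fixes the
    whiteSpace; the union's pattern is then checked exactly once."""
    for type_name, white_space in MEMBERS:
        value = normalize(raw, white_space)
        if member_accepts(type_name, value):
            return re.fullmatch(pattern, value) is not None
    return False


def validate_any_member(raw: str, pattern: str) -> bool:
    """Reading B: a member counts only if its normalized value also
    satisfies the union's pattern; otherwise the next member is tried."""
    for type_name, white_space in MEMBERS:
        value = normalize(raw, white_space)
        if member_accepts(type_name, value) and re.fullmatch(pattern, value):
            return True
    return False


print(validate_first_member(" a   ", "[a-z]"))
print(validate_any_member(" a   ", "[a-z]"))
```

Reading A rejects the instance, which is the outcome I see with
MSXML 4.0; reading B accepts it, which is the outcome I see with
Xerces-J and XSV.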

[1] http://www.w3.org/TR/xmlschema-2/#rf-whiteSpace
[2] http://www.w3.org/TR/xmlschema-2/#defn-validation-rules

Regards,

Kasimier

Received on Wednesday, 1 June 2005 16:20:15 UTC