- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Thu, 02 Jun 2005 13:10:00 +0100
- To: Kasimier Buchcik <kbuchcik@4commerce.de>
- Cc: Xan Gregg <xan.gregg@jmp.com>, xmlschema-dev@w3.org, www-xml-schema-comments@w3.org
Consider:
<xsd:simpleType name="fooType">
  <xsd:union memberTypes="xsd:string xsd:token"/>
</xsd:simpleType>
<xsd:simpleType name="fooSubType">
  <xsd:restriction base="fooType">
    <xsd:pattern value="[a-z]"/>
  </xsd:restriction>
</xsd:simpleType>
<xsd:element name="foo" type="fooSubType"/>
Wrt this instance:
<foo> a </foo>
I think on balance Kasimier and Xan are _both_ right, and therefore
none of the processors are wrong.
Here's my reasoning:
[First, it has to be noted that the definition of Datatype Valid [1]
is broken -- it implies that if there's a *pattern* facet, the string
being checked need not be in the lexical space of the type!]
On the one hand, the process of validation of a restricted union
could be understood in two steps -- checking the union, and then
enforcing the restriction. This is because without checking the
union, we don't know what the string _is_, because the only way we get
a string to check is by using a type defn with a whitespace facet.
On this account (Kasimier's too, I guess) things go like this:
1) Check Datatype Valid for the pre-lexical form wrt each member of
   the union in turn:
     " a " against xs:string -- whiteSpace is preserve, so
                                lexical form is " a ", which
                                _is_ in the lexical space of
                                xs:string, and the corresponding
                                value is in the value space of
                                xs:string, so we win
2) Check the facets of the union:
     " a " against [a-z]     -- fails
So, invalid.
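A minimal Python sketch of this first reading, for illustration only. It assumes each member type can be modelled by its whiteSpace facet alone (both xs:string and xs:token accept any string once normalized), and that the pattern facet is matched against the whole lexical form. The names MEMBERS, normalize and validate_union_then_facets are hypothetical, not part of any schema processor.

```python
import re

# Assumed model: a member type is just a name plus its whiteSpace facet value.
MEMBERS = [("xs:string", "preserve"), ("xs:token", "collapse")]

def normalize(raw, white_space):
    """Apply the whiteSpace facet to a pre-lexical string."""
    if white_space == "preserve":
        return raw
    replaced = re.sub(r"[\t\n\r]", " ", raw)
    if white_space == "replace":
        return replaced
    # collapse: squeeze runs of spaces, then trim both ends
    return re.sub(r" +", " ", replaced).strip(" ")

def validate_union_then_facets(raw, pattern):
    # Step 1: the first member that accepts the string fixes the lexical form.
    # In this model xs:string accepts everything, so it always wins.
    member, white_space = MEMBERS[0]
    lexical = normalize(raw, white_space)
    # Step 2: only now check the facet declared on the restricted union.
    return member, re.fullmatch(pattern, lexical) is not None

print(validate_union_then_facets(" a ", "[a-z]"))
```

Under these assumptions the whitespace is never stripped before the pattern is applied, which is why `<foo> a </foo>` comes out invalid on this account.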
The alternative reading is that the facets on the union are
distributed into the member types of the union, in which case Xan's
analysis is correct and things go like this:
1) Check Datatype Valid for the pre-lexical form wrt each member of
   the union, plus the facets on the union itself, in turn:
   1a) " a " against xs:string -- whiteSpace is preserve, so
                                  lexical form is " a ", which
                                  _is_ in the lexical space of
                                  xs:string, and the corresponding
                                  value is in the value space of
                                  xs:string, so we check the facets
         check " a " against [a-z] -- fails
   1b) " a " against xs:token  -- whiteSpace is collapse, so
                                  lexical form is "a", which
                                  _is_ in the lexical space of
                                  xs:token, and the corresponding
                                  value is in the value space of
                                  xs:token, so we check the facets
         check "a" against [a-z] -- succeeds
So, valid.
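A minimal Python sketch of this alternative reading, for illustration only. It assumes each member type can be modelled by its whiteSpace facet alone, and that the pattern facet is matched against the whole lexical form produced by that member. The names MEMBERS, normalize and validate_facets_pushed_down are hypothetical.

```python
import re

# Assumed model: a member type is just a name plus its whiteSpace facet value.
MEMBERS = [("xs:string", "preserve"), ("xs:token", "collapse")]

def normalize(raw, white_space):
    """Apply the whiteSpace facet to a pre-lexical string."""
    if white_space == "preserve":
        return raw
    replaced = re.sub(r"[\t\n\r]", " ", raw)
    if white_space == "replace":
        return replaced
    # collapse: squeeze runs of spaces, then trim both ends
    return re.sub(r" +", " ", replaced).strip(" ")

def validate_facets_pushed_down(raw, pattern):
    """Return the first member type that accepts raw together with the
    union's pattern facet, or None if no member does."""
    for member, white_space in MEMBERS:
        lexical = normalize(raw, white_space)
        if re.fullmatch(pattern, lexical) is not None:
            return member
    return None

print(validate_facets_pushed_down(" a ", "[a-z]"))
```

Here the failure against xs:string does not end the check; the string gets a second chance with xs:token's collapsed lexical form, so `<foo> a </foo>` is accepted via xs:token.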
I don't believe it's actually at all clear which is correct.
This actually interacts with an existing issue, regarding the
semantics of a type allowed as the type of e.g. an attribute as part
of a complex type derived by restriction from a base type with a
restricted union for that attribute (whew!) -- example:
<xs:complexType name="base">
  <xs:attribute name="foo" type="fooSubType"/>
</xs:complexType>
<xs:complexType name="restr">
  <xs:complexContent>
    <xs:restriction base="base">
      <xs:attribute name="foo" type="xs:token"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
Currently this is a) allowed but b) means that the restricted type
allows _more_ than the base type, which is not supposed to happen.
We should probably solve both these problems together (and the latter
issue suggests we'll go in Xan's direction, that is, we'll push the
facets down onto all the member types. . .)
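To make the "restricted type allows more than the base type" point concrete, here is a small Python sketch. It assumes the more permissive reading of fooSubType (the pattern is tried against each member's lexical form) and models xs:token as accepting any string after whitespace collapse. The function names are hypothetical.

```python
import re

def normalize(raw, white_space):
    """Apply the whiteSpace facet to a pre-lexical string."""
    if white_space == "preserve":
        return raw
    replaced = re.sub(r"[\t\n\r]", " ", raw)
    if white_space == "replace":
        return replaced
    # collapse: squeeze runs of spaces, then trim both ends
    return re.sub(r" +", " ", replaced).strip(" ")

def foo_sub_type_accepts(raw):
    # fooSubType: union of xs:string and xs:token restricted by [a-z],
    # with the pattern pushed down onto each member (assumed reading).
    return any(
        re.fullmatch("[a-z]", normalize(raw, ws)) is not None
        for ws in ("preserve", "collapse")
    )

def token_accepts(raw):
    # Assumed model: every string collapses to something in xs:token's
    # lexical space, so plain xs:token rejects nothing.
    return True

for value in ("a", " a ", "ab"):
    print(repr(value), foo_sub_type_accepts(value), token_accepts(value))
```

A value such as "ab" is rejected by fooSubType but accepted by xs:token, so an attribute retyped to xs:token in the derived type admits values the base type does not. The stricter reading accepts a subset of what this one accepts, so the same holds there.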
ht
[1] http://www.w3.org/TR/xmlschema-2/#defn-validation-rules
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Thursday, 2 June 2005 12:10:11 UTC