- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Thu, 02 Jun 2005 13:10:00 +0100
- To: Kasimier Buchcik <kbuchcik@4commerce.de>
- Cc: Xan Gregg <xan.gregg@jmp.com>, xmlschema-dev@w3.org, www-xml-schema-comments@w3.org

Consider:

  <xsd:simpleType name="fooType">
    <xsd:union memberTypes="xsd:string xsd:token"/>
  </xsd:simpleType>

  <xsd:simpleType name="fooSubType">
    <xsd:restriction base="fooType">
      <xsd:pattern value="[a-z]"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="foo" type="fooSubType"/>

Wrt this instance:

  <foo> a </foo>

I think on balance Kasimier and Xan are _both_ right, and therefore
none of the processors are wrong. Here's my reasoning:

[First, it has to be noted that the definition of Datatype Valid [1]
is broken -- it implies that if there's a *pattern* facet, the string
being checked need not be in the lexical space of the type!]

On the one hand, the process of validating a restricted union could be
understood as two steps -- checking the union, and then enforcing the
restriction. This is because without checking the union, we don't
know what the string _is_, because the only way we get a string to
check is by using a type defn with a whitespace facet. On this
account (Kasimier's too, I guess) things go like this:

 1) Check Datatype Valid for the pre-lexical form wrt each member of
    the union in turn:

      " a " against xs:string -- whiteSpace is preserve, so lexical
      form is " a ", which _is_ in the lexical space of xs:string, and
      the corresponding value is in the value space of xs:string, so
      we win

 2) Check the facets of the union:

      " a " against [a-z] -- fails

So, invalid.
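
A minimal Python sketch of this two-step reading, for illustration only. The names MEMBERS, PATTERN, normalize, in_lexical_space and validate_two_step are hypothetical and not taken from any processor; the lexical-space check is simplified to always succeed, since every normalized string is acceptable to both xs:string and xs:token, and Python's re module stands in for the XSD regex dialect, which agrees with it for [a-z].

```python
import re

# Hypothetical model of fooSubType: member types in union order, each with
# its whiteSpace facet value, plus the pattern facet from the restriction.
MEMBERS = [("xs:string", "preserve"), ("xs:token", "collapse")]
PATTERN = "[a-z]"


def normalize(raw, white_space):
    """Apply a whiteSpace facet value ('preserve' or 'collapse') to raw text."""
    if white_space == "preserve":
        return raw
    # collapse: tab/LF/CR become spaces, runs of spaces shrink to one,
    # leading and trailing spaces are dropped
    spaced = re.sub(r"[\t\n\r]", " ", raw)
    return re.sub(r" +", " ", spaced).strip(" ")


def in_lexical_space(member, lexical):
    """Simplification: any normalized string is accepted by either member."""
    return True


def validate_two_step(raw):
    """Reading 1: choose a member type first, then apply the union's facets."""
    for member, white_space in MEMBERS:
        lexical = normalize(raw, white_space)
        if in_lexical_space(member, lexical):
            # XSD patterns match the whole string, hence fullmatch
            ok = re.fullmatch(PATTERN, lexical) is not None
            return member, lexical, ok
    return None, None, False


print(validate_two_step(" a "))  # ('xs:string', ' a ', False)
```

Because xs:string comes first in the union and accepts anything, xs:token is never consulted on this reading, so the pattern only ever sees the uncollapsed string.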

The alternative reading is that the facets on the union are
distributed into the member types of the union, in which case Xan's
analysis is correct and things go like this:

 1) Check Datatype Valid for the pre-lexical form wrt each member of
    the union, plus the facets on the union itself, in turn:

    1a) " a " against xs:string -- whiteSpace is preserve, so lexical
        form is " a ", which _is_ in the lexical space of xs:string,
        and the corresponding value is in the value space of
        xs:string, so we check the facets

          check " a " against [a-z] -- fails

    1b) " a " against xs:token -- whiteSpace is collapse, so lexical
        form is "a", which _is_ in the lexical space of xs:token, and
        the corresponding value is in the value space of xs:token, so
        we check the facets

          check "a" against [a-z] -- succeeds

I don't believe it's actually at all clear which is correct.

This actually interacts with an existing issue, regarding the
semantics of a type allowed as the type of e.g. an attribute as part
of a complex type derived by restriction from a base type with a
restricted union for that attribute (whew!) -- example:

  <xs:complexType name="base">
    <xs:attribute name="foo" type="fooSubType"/>
  </xs:complexType>

  <xs:complexType name="restr">
    <xs:attribute name="foo" type="xs:token"/>
  </xs:complexType>

Currently this is a) allowed but b) means that the restricted type
allows _more_ than the base type, which is not supposed to happen.

We should probably solve both these problems together (and the latter
issue suggests we'll go in Xan's direction, that is, we'll push the
facets down onto all the member types. . .)

ht

[1] http://www.w3.org/TR/xmlschema-2/#defn-validation-rules

--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
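
For comparison, a minimal Python sketch of the alternative reading from the message, in which the pattern facet is pushed down onto each member type. All names here (MEMBERS, PATTERN, normalize, validate_distributed) are hypothetical, the lexical-space check is omitted because both member types accept any normalized string, and Python's re module is assumed to be an adequate stand-in for the XSD regex dialect for a pattern as simple as [a-z].

```python
import re

# Hypothetical model of fooSubType: union members with their whiteSpace
# facet values, and the pattern facet from the restriction step.
MEMBERS = [("xs:string", "preserve"), ("xs:token", "collapse")]
PATTERN = "[a-z]"


def normalize(raw, white_space):
    """Apply a whiteSpace facet value ('preserve' or 'collapse') to raw text."""
    if white_space == "preserve":
        return raw
    # collapse: tab/LF/CR become spaces, runs of spaces shrink to one,
    # leading and trailing spaces are dropped
    spaced = re.sub(r"[\t\n\r]", " ", raw)
    return re.sub(r" +", " ", spaced).strip(" ")


def validate_distributed(raw):
    """Reading 2: each member type is tried together with the union's facets."""
    for member, white_space in MEMBERS:
        lexical = normalize(raw, white_space)
        # XSD patterns match the whole string, hence fullmatch
        if re.fullmatch(PATTERN, lexical) is not None:
            return member, lexical, True
    return None, None, False


print(validate_distributed(" a "))  # ('xs:token', 'a', True)
print(validate_distributed("AB"))   # (None, None, False)
```

The second call relates to the restriction issue: "AB" is a perfectly good xs:token, yet fooSubType rejects it, so an attribute retyped from fooSubType to xs:token accepts more than its base.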
Received on Thursday, 2 June 2005 12:10:12 UTC