- From: Costello, Roger L. <costello@mitre.org>
- Date: Tue, 19 Apr 2011 16:34:07 -0400
- To: "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
Thanks Michael. Very enlightening. I'd like to confirm my new understanding.

Here simpleType "A" is the base of simpleType "B":

    <xs:simpleType name="A">
        <xs:restriction base="xs:string">
            <xs:pattern value="[a-z]{10}" />
        </xs:restriction>
    </xs:simpleType>

    <xs:simpleType name="B">
        <xs:restriction base="A">
            <xs:pattern value="[a-z]{20}" />
        </xs:restriction>
    </xs:simpleType>

Here I declare an element, Test, to be of type "B":

    <xs:element name="Test" type="B" />

The value of "B" must consist of the letters a-z and the length must be exactly 10 characters AND exactly 20 characters. Clearly that is impossible, so B has no valid value.

Is that correct thus far?

Compare the above two simpleTypes against this simpleType:

    <xs:simpleType name="C">
        <xs:restriction base="xs:string">
            <xs:pattern value="[a-z]{10}" />
            <xs:pattern value="[a-z]{20}" />
        </xs:restriction>
    </xs:simpleType>

I declare an element, Test, to be of type "C":

    <xs:element name="Test" type="C" />

The value of "C" must consist of the letters a-z and the length must be exactly 10 characters OR exactly 20 characters. So either of these is valid:

    <Test>abcdefghijabcdefghij</Test>

    <Test>abcdefghij</Test>

Is that correct?

Let's return to the "A" and "B" example. I would like to merge them into a single simpleType. I have learned that simply merging the pattern facets of "A" into "B" to yield "C" is not correct. It would be so nice if I could simply write:

    [a-z]{10} and [a-z]{20}

Unfortunately, there is no "and" operator in regex. So, any ideas on how to "and" arbitrary regex expressions?

/Roger

-----Original Message-----
From: C. M. Sperberg-McQueen [mailto:cmsmcq@blackmesatech.com]
Sent: Tuesday, April 19, 2011 12:39 PM
To: Costello, Roger L.
Cc: C. M. Sperberg-McQueen; xmlschema-dev@w3.org
Subject: Re: Algorithm for merging the pattern facets in a base simpleType with a subtype?

On Apr 19, 2011, at 7:15 AM, Costello, Roger L. wrote:

> Hi Folks,
>
> Suppose that simpleType "A" is the base of simpleType "B":
> ...
>
> Suppose that "A" contains one or more pattern facets:
> ...
>
> What patterns apply to "B"?

In general, and informally, for any facet at all, the facet-based constraints on B are the union of those specified on the declaration of B and those B inherits from A.

In the XSD spec, the explanation is not quite so simple, because the spec attempts to ensure that the {facets} value 'makes sense'. So the spec is full of extra ad hoc rules which make the story more complicated; some of these unneeded complications affect the pattern facet.

In XSD 1.0, section 4.3.4.3 of the Datatypes spec has the following note:

    Note: It is a consequence of the schema representation constraint Multiple patterns (§4.3.4.3) and of the rules for ·restriction· that ·pattern· facets specified on the same step in a type derivation are ORed together, while ·pattern· facets specified on different steps of a type derivation are ANDed together. Thus, to impose two ·pattern· constraints simultaneously, schema authors may either write a single ·pattern· which expresses the intersection of the two ·pattern·s they wish to impose, or define each ·pattern· on a separate type derivation step.

The rules for restriction referred to in the note are laid out explicitly in the Structures spec in Schema Component Constraint: Simple Type Restriction (Facets) in section 3.14.6, which specifies that when the facets specified on B are merged into the set of facets inherited from A, multiple patterns are allowed. So the patterns inherited from A and those specified on B must all be satisfied.

In XSD 1.1, the pattern facet is redefined to have as its value a set of regular expressions, instead of a single regular expression, and the XML mapping specified in Datatypes 4.3.4.2 for the pattern facet's {value} property is modified to take any new patterns specified on restrictions and add them to the set inherited from the base type. The facet overlay process defined in Structures 3.16.6.4 is correspondingly simpler.
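The OR-within-a-step, AND-across-steps rule described in the note can be sketched in a few lines of Python. This is only an illustration, not a schema validator: the function name `is_valid` and the list-of-lists representation of a derivation chain are made up for this example, and Python's `re` syntax merely happens to coincide with XSD regex syntax for simple patterns such as `[a-z]{10}`. `re.fullmatch` is used because XSD patterns must match the whole value.

```python
import re

def is_valid(value, derivation_steps):
    """Check a value against the pattern facets of a derivation chain.

    derivation_steps is a list of steps, base type first; each step is
    the list of patterns written on that one restriction. Patterns on
    the same step are ORed, separate steps are ANDed. A step with no
    patterns adds no constraint.
    """
    return all(
        any(re.fullmatch(pattern, value) for pattern in step)
        for step in derivation_steps
        if step
    )

# The three types from this thread (hypothetical encoding).
A = [["[a-z]{10}"]]
B = A + [["[a-z]{20}"]]               # second step, ANDed with A's pattern
C = [["[a-z]{10}", "[a-z]{20}"]]      # same step, ORed

ten = "abcdefghij"
twenty = ten * 2

print(is_valid(ten, B), is_valid(twenty, B))   # False False
print(is_valid(ten, C), is_valid(twenty, C))   # True True
```

Under these assumptions no string satisfies B, while C accepts both the 10-letter and the 20-letter value.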
>
> I believe there are only two cases to consider:
>
> CASE 1: "B" does not have any pattern facets.
>
> Therefore, the patterns that apply to "B" are the pattern(s) contained in "A".

Yes, assuming that by "pattern(s) contained in 'A'" you mean "the pattern facet(s) in simple type definition A". If you mean just those lexically present in the source declaration of A, you have inadvertently left out any patterns inherited by A from its base type.

>
> CASE 2: "B" has one or more pattern facets.
>
> The patterns in "B" must be a restriction of the patterns in "A".

No, there is no constraint in XSD 1.0 or 1.1 that requires a pattern specified in a restriction to have any relation at all to the lexical space of the base type. (This contrasts with the rules for content models, which do impose such requirements.)

The rules that new patterns and inherited patterns are ANDed together and that the new pattern does not need to recapitulate constraints already expressed are exploited in the definitions of yearMonthDuration and dayTimeDuration in XSD 1.1, as explained in sections 3.4.26.1 and 3.4.27.1 of XSD 1.1 Datatypes.

> Therefore, the patterns that apply to "B" are just the patterns contained in "B". Effectively the patterns in "A" may be ignored. Do you agree?

No, sorry, there is nothing in the spec to justify that conclusion.

--
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************
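The yearMonthDuration remark above can be made concrete with a rough Python sketch of how a short pattern on a restriction step, ANDed with the base type's lexical space, is enough to carve out the subtype. Several things here are assumptions for illustration only: `DURATION` is a deliberately simplified stand-in for the xs:duration lexical space (it wrongly accepts a bare "P", for instance), `[^DT]*` is taken to be the step pattern used for yearMonthDuration and should be checked against the cited spec sections, and the function name is invented.

```python
import re

# Simplified stand-in for the xs:duration lexical space (assumption,
# NOT the regex from the spec; e.g. it accepts a bare "P").
DURATION = (
    r"-?P([0-9]+Y)?([0-9]+M)?([0-9]+D)?"
    r"(T([0-9]+H)?([0-9]+M)?([0-9]+(\.[0-9]+)?S)?)?"
)

# Pattern added on the restriction step: it says nothing about the
# overall shape of a duration, only that no 'D' or 'T' may appear.
YEAR_MONTH_STEP = r"[^DT]*"

def is_year_month_duration(value):
    """Value must satisfy the base constraint AND the step pattern."""
    return (
        re.fullmatch(DURATION, value) is not None
        and re.fullmatch(YEAR_MONTH_STEP, value) is not None
    )

for candidate in ["P1Y2M", "P1Y2D", "PT5M", "hello"]:
    print(candidate, is_year_month_duration(candidate))
```

Note that "hello" satisfies `[^DT]*` on its own; it is rejected only because the inherited constraint still applies, which is the point of not ignoring the base type.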
Received on Tuesday, 19 April 2011 20:34:35 UTC