
Re: multiple pattern facet conjunction

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: Sat, 30 Dec 2006 10:55:22 -0700
Message-Id: <6DECCD18-A8D8-43AE-A6E5-1EFFFC058486@acm.org>
Cc: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, public-schemata-users@w3.org
To: Syd_Bauman@Brown.edu

On 30 Dec 2006, at 04:46 , Syd Bauman wrote:

> The text of seems problematic.
>    If multiple <pattern> element information items appear as
>    [children] of a <simpleType>, the [value]s should be combined as
>    if they appeared in a single regular expression as separate
>    branches.
> First, I am under the (perhaps erroneous) impression that a <pattern>
> element can not be the child of a <simpleType> element.

I think that's true; Schema 1.0 had a typo ('simpleType' for
'restriction' -- not 'children' for 'descendant', though, since simple
types can nest).  That may be one reason that the paragraph in question
has been deleted from the current draft of XML Schema 1.1 and
the rule has been reworded.

> Second, the idea seems unhelpful. If I wanted two regular expressions
> R1 and R2 to appear in a single regular expression as separate
> branches, I could have just written "R1|R2", no?

Yes.  But not if you wished to annotate the two branches
separately, either for a human reader or for a machine.
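
As an illustration only, here is a minimal sketch of two pattern
branches annotated separately. The type name 'partNumber', both
regular expressions, and the documentation text are invented for this
example; none of them comes from the spec or from the original
question.

```xml
<xs:simpleType name="partNumber">
  <xs:restriction base="xs:token">
    <!-- Branch 1, with its own human-readable annotation -->
    <xs:pattern value="[A-Z]{2}-[0-9]{4}">
      <xs:annotation>
        <xs:documentation>Current format: two letters, a hyphen, four digits.</xs:documentation>
      </xs:annotation>
    </xs:pattern>
    <!-- Branch 2, annotated independently of branch 1 -->
    <xs:pattern value="[0-9]{6}">
      <xs:annotation>
        <xs:documentation>Legacy format: six digits.</xs:documentation>
      </xs:annotation>
    </xs:pattern>
  </xs:restriction>
</xs:simpleType>
```

For validation this should accept the same strings as the single
pattern "[A-Z]{2}-[0-9]{4}|[0-9]{6}", but the single-pattern form
leaves no place to attach a separate annotation to each branch.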

> So my gut instinct
> is that this rule isn't useful, but I may be missing something.

It doesn't enlarge the expressive power of the language, as
regards validation, no.

> The note attached to says
>    ... pattern facets specified on the same step in a type derivation
>    are ORed together, while pattern facets specified on different
>    steps of a type derivation are ANDed together.
> but I have yet to really figure out what a "step" is.

A step is one derivation in a derivation chain.

When one defines type T1 as a restriction of some primitive
type, and T2 as a restriction of T1, and T3 as a restriction of
T2, one has a derivation chain with three steps.  If patterns
P1 and P2 are specified as part of the definition of T1, and
P3 and P4 as part of the definition of T2 and T3 respectively,
then the lexical space of T3 contains only character
sequences which match (P1|P2) and P3 and P4.
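
The chain just described could be written roughly as follows. This is
a sketch under stated assumptions: xs:string stands in for "some
primitive type", P1 through P4 are placeholders rather than real
regular expressions, and the fragment is assumed to sit inside an
xs:schema with no target namespace, so that the unprefixed names T1
and T2 resolve.

```xml
<!-- Step 1: P1 and P2 are on the same step, so they are ORed -->
<xs:simpleType name="T1">
  <xs:restriction base="xs:string">
    <xs:pattern value="P1"/>
    <xs:pattern value="P2"/>
  </xs:restriction>
</xs:simpleType>

<!-- Step 2: P3 is on a later step, so it is ANDed with (P1|P2) -->
<xs:simpleType name="T2">
  <xs:restriction base="T1">
    <xs:pattern value="P3"/>
  </xs:restriction>
</xs:simpleType>

<!-- Step 3: P4 is ANDed with everything inherited from T2 -->
<xs:simpleType name="T3">
  <xs:restriction base="T2">
    <xs:pattern value="P4"/>
  </xs:restriction>
</xs:simpleType>
```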

>   <xs:element name="duck">
>     <xs:simpleType>
>       <xs:restriction>
>         <xs:simpleType>
>           <xs:restriction base="xs:token">
>             <xs:pattern value="R1"/>
>             <xs:pattern value="R2"/>
>           </xs:restriction>
>         </xs:simpleType>
>       </xs:restriction>
>     </xs:simpleType>
>   </xs:element>
> My instinct is that this could be simplified to
>   <xs:element name="duck">
>     <xs:simpleType>
>       <xs:restriction base="xs:token">
>         <xs:pattern value="R1"/>
>         <xs:pattern value="R2"/>
>       </xs:restriction>
>     </xs:simpleType>
>   </xs:element>
> without any change to the set of documents that would be considered
> valid.

Yes.  In the second formulation, 'duck' is a restriction of token; in
the first formulation, 'duck' is a vacuous restriction of an
anonymous type which is a restriction of token.

I hope this helps.

--C. M. Sperberg-McQueen
Received on Saturday, 30 December 2006 17:55:37 UTC
