- From: Michael Kay <mike@saxonica.com>
- Date: Thu, 18 Oct 2012 14:15:19 +0100
- To: xmlschema-dev@w3.org
Well, you have defined two value spaces: strings consisting entirely of printable characters, and strings consisting entirely of whitespace characters. The union of those two value spaces is (strings that consist entirely of printable characters or entirely of whitespace characters). A string that mixes printable characters and whitespace characters is not in either value space, therefore it should not be in their union.

If you want to compose pattern-based types from reusable components you could do this using entities to build up regular expressions. This is the way that Michael Sperberg-McQueen defined the types that match different flavours of URI in

http://www.w3.org/2011/04/XMLSchema/TypeLibrary-URI-RFC3986.xsd

and

http://www.w3.org/2011/04/XMLSchema/TypeLibrary-IRI-RFC3987.xsd

To see the way these complex regular expressions are constructed, view these documents at the raw XML level using (for example) curl.

Michael Kay
Saxonica

On 18/10/2012 13:44, Costello, Roger L. wrote:
> Hi Folks,
>
> Proposition:
>     A union of member types does not produce
>     a union, it produces a sequence of member
>     types.
>
> Proof:
>     <xs:simpleType name="white-space-characters">
>         <xs:annotation>
>             <xs:documentation>
>                 The space (SP, ASCII value 32) and horizontal tab (HTAB,
>                 ASCII value 9) characters are known as the white space
>                 characters, WSP.
>             </xs:documentation>
>         </xs:annotation>
>         <xs:restriction base="xs:string">
>             <xs:pattern value="[	 ]*" />
>         </xs:restriction>
>     </xs:simpleType>
>
>     <xs:simpleType name="printable-characters">
>         <xs:annotation>
>             <xs:documentation>
>                 The printable US-ASCII characters are the characters that
>                 have values between 33 and 126, inclusive.
>             </xs:documentation>
>         </xs:annotation>
>         <xs:restriction base="xs:string">
>             <xs:pattern value="[!-~]*" />
>         </xs:restriction>
>     </xs:simpleType>
>
>     <xs:simpleType name="header-field-body-characters">
>         <xs:annotation>
>             <xs:documentation>
>                 A field body may be composed of printable US-ASCII characters
>                 as well as the WSP.
>             </xs:documentation>
>         </xs:annotation>
>         <xs:union memberTypes="printable-characters white-space-characters" />
>     </xs:simpleType>
>
>     <xs:element name="header-field-body" type="header-field-body-characters" />
>
> *Valid* instance document:
>
>     <header-field-body>HelloWorld</header-field-body>
>
> *Valid* instance document:
>
>     <header-field-body> </header-field-body>
>
> *Invalid* instance document:
>
>     <header-field-body>Hello World</header-field-body>
>
> Therefore the union of printable-characters and white-space-characters does not yield a union of their value spaces; rather, it merely provides a sequence of two types.
>
> Ugh.
>
> So, how do I truly union printable-characters and white-space-characters?
>
> Of course, I could simply copy the regex pattern from printable-characters and white-space-characters and paste:
>
>     <xs:simpleType name="header-field-body-characters">
>         <xs:annotation>
>             <xs:documentation>
>                 A field body may be composed of printable US-ASCII characters
>                 as well as the WSP.
>             </xs:documentation>
>         </xs:annotation>
>         <xs:restriction base="xs:string">
>             <xs:pattern value="[	 !-~]*" />
>         </xs:restriction>
>     </xs:simpleType>
>
> But that is awful as it is totally disconnected from printable-characters and white-space-characters.
>
> Any suggestions?
>
> /Roger
>
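
For illustration, the entity-based composition suggested in the reply might look roughly like the sketch below. The entity names `wsp` and `printable` are hypothetical, and the tab is written with the regex escape `\t` rather than a literal tab, since a literal tab inside an attribute value is normalized to a space by the XML parser. It also assumes the schema processor is configured to read the internal DTD subset; some disable DOCTYPE declarations for security reasons.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xs:schema [
  <!-- Hypothetical entity names. Each holds the *contents* of a character
       class (no surrounding brackets) so that fragments can be concatenated. -->
  <!ENTITY wsp       "\t ">   <!-- HTAB (regex escape) and SP -->
  <!ENTITY printable "!-~">   <!-- US-ASCII 33 to 126 -->
]>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="white-space-characters">
    <xs:restriction base="xs:string">
      <xs:pattern value="[&wsp;]*"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="printable-characters">
    <xs:restriction base="xs:string">
      <xs:pattern value="[&printable;]*"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- One character class built from both fragments, so a single string
       may mix printable and white space characters. -->
  <xs:simpleType name="header-field-body-characters">
    <xs:restriction base="xs:string">
      <xs:pattern value="[&wsp;&printable;]*"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="header-field-body" type="header-field-body-characters"/>

</xs:schema>
```

After entity expansion the third pattern is `[\t !-~]*`, the same character class as the hand-pasted version in the quoted message, but each fragment is now defined in one place and reused. Unlike the xs:union, this type accepts "Hello World", because the union is taken over the allowed characters rather than over the two value spaces.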
Received on Thursday, 18 October 2012 13:15:45 UTC