- From: Syd Bauman <Syd_Bauman@Brown.edu>
- Date: Sat, 30 Dec 2006 06:46:26 -0500
- To: public-schemata-users@w3.org
Previous-subject: "Re: [oXygen-user] wrong conjunction for multiple pattern facets?"

The following is a follow-on of a discussion that has been occurring on the oxygen-user mailing list (http://oxygenxml.com/mailman/listinfo/oxygen-user/).

It seems pretty clear that in RelaxNG, multiple occurrences of a <param name="pattern"> inside a single <data> element (whose type= must be a W3C datatype that allows the pattern facet) must all be met, i.e., they are ANDed together. The following is from section 2 of "Guidelines for using W3C XML Schema Datatypes with RELAX NG"[1]:

    If the 'pattern' parameter is specified more than once for a
    single 'data' element, then a string matches the 'data' element
    only if it matches all of the patterns.

I think this means that if I have

    <rng:element name="duck"
        datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
      <rng:data type="token">
        <rng:param name="pattern">R1</rng:param>
        <rng:param name="pattern">R2</rng:param>
      </rng:data>
    </rng:element>

then the content of <duck> must match both R1 and R2 in order to be valid. This seems to make a lot of sense. After all, if I had wanted a string to be a valid <duck> if it matched R1 *or* R2, I could have written

    <rng:data type="token">
      <rng:param name="pattern">(R1)|(R2)</rng:param>
    </rng:data>

But in W3C XML Schema things seem a lot less clear, although this may be because I am close to the furthest thing there is from an expert. I was referred to section 4.3.4.3 of the spec[2]. I had never heard of, let alone read, 4.3.4.3 before today. But upon reading it, I have to admit I don't quite understand what it means, and whether or not it has any significance with respect to RelaxNG validation. (I suspect not.)

The text of 4.3.4.3 seems problematic.

    If multiple <pattern> element information items appear as
    [children] of a <simpleType>, the [value]s should be combined as
    if they appeared in a single regular expression as separate
    branches.

First, I am under the (perhaps erroneous) impression that a <pattern> element cannot be the child of a <simpleType> element. Or perhaps the infoset definition of "children" includes descendants? (I don't think it does -- I had thought "appearing immediately within the current element" meant child, not descendant.)

Second, the idea seems unhelpful. If I wanted two regular expressions R1 and R2 to appear in a single regular expression as separate branches, I could have just written "R1|R2", no? So my gut instinct is that this rule isn't useful, but I may be missing something. (E.g., perhaps this is a general idea which, although not very useful with regular expressions, is expected to be useful with some future structures not yet devised?)

The note attached to 4.3.4.3 says

    ... pattern facets specified on the same step in a type
    derivation are ORed together, while pattern facets specified on
    different steps of a type derivation are ANDed together.

but I have yet to really figure out what a "step" is.

However, playing around a bit with the output of `trang`[3] is potentially very instructive. The following is the above RelaxNG schema fragment transformed into W3C Schema; in the one test I performed (using xmllint) it validated as I wanted: the contents of <duck> must match both R1 and R2.

    <xs:element name="duck">
      <xs:simpleType>
        <xs:restriction>
          <xs:simpleType>
            <xs:restriction base="xs:token">
              <xs:pattern value="R1"/>
            </xs:restriction>
          </xs:simpleType>
          <xs:pattern value="R2"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

A minor change, as follows, caused strings matching either R1 or R2 to be considered valid.

    <xs:element name="duck">
      <xs:simpleType>
        <xs:restriction>
          <xs:simpleType>
            <xs:restriction base="xs:token">
              <xs:pattern value="R1"/>
              <xs:pattern value="R2"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

My instinct is that this could be simplified to

    <xs:element name="duck">
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:pattern value="R1"/>
          <xs:pattern value="R2"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

without any change to the set of documents that would be considered valid.

Have I got any of this right?

Note
----
[1] Which I found at http://relaxng.org/xsd-20010907.html; it is linked to from the main RelaxNG home page.
[2] http://www.w3.org/TR/xmlschema-2/#src-multiple-patterns
[3] Version 20030619.
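
The rule quoted from the note to 4.3.4.3 (pattern facets on the same derivation step are ORed, those on different steps are ANDed) can be modelled roughly as follows. This is only an illustrative sketch in Python, not a validator: the function name matches_steps and the concrete expressions standing in for R1 and R2 are hypothetical, Python's re syntax is not the same as XML Schema regular expression syntax, and re.fullmatch is used on the assumption that a pattern facet must match the whole value.

```python
import re

def matches_steps(value, steps):
    """Return True if value satisfies every derivation step.

    steps is a list of derivation steps; each step is a list of
    regular expressions. Patterns within one step are ORed,
    and the steps themselves are ANDed.
    """
    return all(
        any(re.fullmatch(pattern, value) for pattern in step)
        for step in steps
    )

# Hypothetical stand-ins for R1 and R2 (not from the original post).
R1 = r"[a-m]+"
R2 = r"[h-z]+"

# One pattern per restriction, as in the nested trang output: R1 AND R2.
two_steps = [[R1], [R2]]

# Both patterns inside a single restriction: R1 OR R2.
one_step = [[R1, R2]]

for candidate in ("hi", "abc", "xyz", "123"):
    print(candidate,
          matches_steps(candidate, two_steps),
          matches_steps(candidate, one_step))
```

Under this model, the RelaxNG fragment with two pattern params corresponds to two_steps, while writing both <xs:pattern> elements inside one <xs:restriction> corresponds to one_step.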
Received on Saturday, 30 December 2006 11:46:39 UTC