- From: <noah_mendelsohn@us.ibm.com>
- Date: Thu, 5 Feb 2004 14:52:32 -0500
- To: Hess Yvan <yvan.hess@imtf.ch>
- Cc: "'Bob Schloss'" <rschloss@us.ibm.com>, "'xmlschema-dev@w3.org'" <xmlschema-dev@w3.org>
Hess Yvan asks:

>> The usage of choice occurrence combined with element
>> seems to be quite complex. Where can I find a good
>> documentation about its usage ?

Well, the authoritative description is in the XML Schema recommendation at [1]. This is a very technical explanation, but it's the final word. There are also some good books on schema.

That said, the official rules are not that complicated once you know how to read them. The general rule for repeated elements is the obvious one:

    <element name="e" minOccurs="3" maxOccurs="5"/>

means 3, 4 or 5 elements named "e". Now, what about something like:

    <sequence minOccurs="1" maxOccurs="2">
      <element name="a" />
      <element name="b" minOccurs="0" maxOccurs="1"/>
    </sequence>

This matches (I'm not spelling out all the <...>):

    {a}, {a,b}, {a,b,a}, {a,b,a,b}, {a,a}, {a,a,b}

(I think; I may have missed one).

The question is, why does this work this way? The answer is in the recommendation at [1], where it says:

    3 If the {term} is a model group, then all of the following must be true:
    3.1 There is a ·partition· of the sequence into n sub-sequences such
        that n is greater than or equal to {min occurs}.
    3.2 If {max occurs} is a number, n must be less than or equal to
        {max occurs}.
    3.3 Each sub-sequence in the ·partition· is ·valid· with respect to
        that model group as defined in Element Sequence Valid (§3.8.4).

That probably won't make much sense, but what it means is:

a) Take your instance, for example {a, a, b}.

b) Since minOccurs=1 and maxOccurs=2, try to divide it into either one or two subsequences. If there's one subsequence, it must match the original sequence, but leaving off the outer repeat:

    <sequence>
      <element name="a" />
      <element name="b" minOccurs="0" maxOccurs="1"/>
    </sequence>

   If there are two, then each must match the same sequence:

    <sequence>
      <element name="a" />
      <element name="b" minOccurs="0" maxOccurs="1"/>
    </sequence>

Let's try it with {a,a,b}.
Since minOccurs="1", we can try a trivial partition into a single sequence {[a,a,b]}. Does that match as one sequence? No. maxOccurs="2", so we can also try partitions into two sequences. How about breaking it into {[a,a],[b]}? That doesn't work because [a,a] doesn't match

    <sequence>
      <element name="a" />
      <element name="b" minOccurs="0" maxOccurs="1"/>
    </sequence>

(and for that matter [b] doesn't either). How about {[a], [a,b]}? [a] matches, because the b is optional. [a,b] matches too, so the overall content is valid. If you try that with {a,a,a,b,a,b,b} you'll find there's no such partition, hence it's invalid overall.

I illustrated this with a sequence, but the same holds for a choice, except that instead of looking for the partitions to match little sequences, they must match either one or the other of the inner pieces. Thus:

    <choice minOccurs="1" maxOccurs="2">
      <element name="a" />
      <element name="b" minOccurs="0" maxOccurs="1"/>
    </choice>

will accept:

    {a}, {b}, {a, a}, {a, b}, {b, a}, {b, b}

(and, because the b is optional and a sub-sequence is allowed to be empty, empty content as well).

Take {b, a}. Consider the partition {[b], [a]}. The [b] matches:

    <choice>
      <element name="a" />
      <element name="b" minOccurs="0" maxOccurs="1"/>
    </choice>

and so does the [a]. You can generalize this reasoning to all the cases.

In general, the trick is to find a partition each piece of which works against the choice or sequence >without the repeat count<. If you can do that, it's valid. If not, not. It sounds a bit complicated, but once you look at it you'll realize that it's simple and sensible. Well, in my opinion anyway.

I hope this helps.

Noah

[1] http://www.w3.org/TR/xmlschema-1/#section-Particle-Validation-Rules

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
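
P.S. For anyone who would rather check the partition rule mechanically, here is a minimal sketch in Python. It is only an illustration of clauses 3.1 to 3.3 quoted above, not how any real schema processor works. The names Element, Group, particle_valid, group_valid and accepted are made up for this example, instances are modeled as tuples of element names, and maxOccurs="unbounded", wildcards and "all" groups are not handled.

```python
from dataclasses import dataclass
from itertools import product
from typing import Tuple, Union


@dataclass(frozen=True)
class Element:  # hypothetical stand-in for <element name=... />
    name: str
    min_occurs: int = 1
    max_occurs: int = 1


@dataclass(frozen=True)
class Group:  # hypothetical stand-in for <sequence> or <choice>
    kind: str  # "sequence" or "choice"
    particles: Tuple["Particle", ...]
    min_occurs: int = 1
    max_occurs: int = 1


Particle = Union[Element, Group]


def particle_valid(items, p):
    """Check items against a particle, repeat count included."""
    if isinstance(p, Element):
        return (p.min_occurs <= len(items) <= p.max_occurs
                and all(x == p.name for x in items))
    return _partition_ok(items, p, 0)


def _partition_ok(items, g, used):
    """Clauses 3.1-3.3: split items into n pieces, min <= n <= max."""
    if not items:
        # Either enough pieces already, or pad up to min with empty pieces.
        return used >= g.min_occurs or group_valid((), g)
    if used >= g.max_occurs:
        return False
    return any(
        group_valid(items[:i], g) and _partition_ok(items[i:], g, used + 1)
        for i in range(1, len(items) + 1)
    )


def group_valid(items, g):
    """Check one piece against the group WITHOUT its repeat count."""
    if g.kind == "choice":
        return any(particle_valid(items, p) for p in g.particles)
    return _sequence_valid(items, g.particles)


def _sequence_valid(items, particles):
    """Hand out consecutive slices of items to the particles, in order."""
    if not particles:
        return not items
    head, rest = particles[0], particles[1:]
    return any(
        particle_valid(items[:i], head) and _sequence_valid(items[i:], rest)
        for i in range(len(items) + 1)
    )


def accepted(p, max_len):
    """List every instance over {a, b} up to max_len that p accepts."""
    return [
        "".join(s)
        for n in range(max_len + 1)
        for s in product("ab", repeat=n)
        if particle_valid(s, p)
    ]


inner = (Element("a"), Element("b", 0, 1))
seq = Group("sequence", inner, min_occurs=1, max_occurs=2)
choice = Group("choice", inner, min_occurs=1, max_occurs=2)

print(accepted(seq, 5))
print(accepted(choice, 3))
print(particle_valid(tuple("aaababb"), seq))
```

Run as is, the first print should list the same six instances given for the sequence example. The second should list the six given for the choice plus the empty instance, since the optional b lets a sub-sequence be empty. The last call checks {a,a,a,b,a,b,b}, which has no valid partition.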
Received on Thursday, 5 February 2004 14:52:57 UTC