- From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
- Date: Fri, 07 Feb 2003 14:42:16 -0700
- To: <allan.jones@hyfinity.com>
- Cc: "xml schema" <xmlschema-dev@w3.org>
[Pardon the very late reply; I hope this may still be of interest to someone.] At 2003-01-10 09:13, Allan Jones wrote: > Hi Jeni, > thanks for your suggestion, but the problem with that technique is > that you can declare sub-elements with the same name, which would > effectively mean that you'd declare something like: > <test> > <element> > <element> > <terminal_element/> > </element> > </element> > </test> > as an infinite loop when the schema would merely specify it as a > nested element. Well, no. Jeni suggested noting element declarations, not element generic identifiers; if the element instances denoted by /test/element and test/element/element are instances of the same element declaration, then the schema does indeed show a cycle in the schema; if the instance is valid, then the content model of 'element' can clearly be either an 'element' element or a 'terminal_element' element, so the cycle is not pernicious. > Using the complex type definitions (as they're usually the reason > that an infinite loop occurs) leads to similar problems, because > it's possible that the same complex type could be used in different > parts of the document so the fact that you've 'hit' it twice doesn't > mean that it's an infinite loop. If an instance of a type can contain another instance of the same type, then arbitrarily deep nesting is possible. If the instance MUST contain a second instance of the same type, then infinitely deep nesting is required and the content model is unsatisfiable. (Content models of this kind have been called Elephant Child content models for this reason; the Elephant Child, you will recall, suffered from insatiability.) I think by 'infinite loop' you mean not just a set of declarations which allow arbitrarily deep nesting, but a set which are insatiable because they require infinitely deep nesting. As Jeni pointed out, to find Elephant Children you need to keep track of whether each descendant is required or not. This is not actually very hard for sequences (though I agree it's fiddly); it's somewhat harder for choices. I'm not quite sure how best to go about detecting vicious cycles in schema documents using XSLT, but your original question just said 'programmatically', so I'll sketch a strategy one could use. The key characteristic of an Elephant Child type T is that it is defined in such a way that the content of every instance of T must contain another instance of T. I'd find such types by attempting to prove that a type is not an Elephant Child. If I can't prove it, then it is an Elephant Child. The proof strategy involves the predicate ec_free(X), which applies to types, sequences, choices, all-groups, and element particles. Applied to types: ec_free(T) is: - if T is a simple type, true. - if T is a complex type, true if the topmost group of T is ec_free. Applied to groups, ec_free(G): - if G has minOccurs = 0, then true - else if G is a sequence (a, b, ... z), then true iff a, b, ... z are all ec_free. - else if G is a choice (a | b | ... | z), then true iff at least one of a, b, ... z is ec_free. - else if G is an all-group (a & b & ... & z) then true iff a, b, ... z are all ec_free. Applied to element particles, ec_free(E): - if E has minOccurs = 0, then true - else if E has complex type T, and we are in the process of trying to prove whether T is ec_free, then false. - else if E has complex type T, and we are not already trying to prove whether T is ec_free, then true iff ec_free(T) Does this help? -C. M. Sperberg-McQueen
Received on Friday, 7 February 2003 17:58:48 UTC