- From: Morris Matsa <mmatsa@us.ibm.com>
- Date: Tue, 19 Nov 2002 18:12:00 -0500
- To: ht@cogsci.ed.ac.uk (Henry S. Thompson)
- Cc: "Bob Schloss" <rschloss@us.ibm.com>, "'xmlschema-dev@w3.org'" <xmlschema-dev@w3.org>
Henry, it seems that you might have a different understanding of lax wildcards than the one we read from the spec, given your assumption at the end of your mail. We've always been unsure about this, so I'd like to use this occasion to ask.

First, I'd like to be specific about the example, so here's a schema:

  <xs:element name="parent">
    <xs:complexType>
      <xs:sequence>
        <xs:any processContents="lax"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="x" type="xs:positiveInteger"/>
  <!-- There is no declaration for element "var" -->

A valid instance looks like this:

  <parent>
    <var>
      <x>123</x>
    </var>
  </parent>

The question comes up with this instance:

  <parent>
    <var>
      <x>-123</x>
    </var>
  </parent>

Is the "parent" element valid? Your note seems to indicate that it is not valid (x is not a positiveInteger), and XSV agrees with you (surprise). However, we read the spec as saying that it is valid. Here's why we read it that way:

There are three validity values in the PSVI: one for the "parent" element, one for the "var" element, and one for the "x" element.

First let's make an assumption which we'll actually question a bit at the end: the element "x" is invalid. Given that the type is positiveInteger and the value is negative, it seems pretty clear that, if assessed, its validity will be "invalid".

Next let's make another assumption: the "var" element, since it matches a lax wildcard, is laxly assessed. Again, we'll question this assumption at the end. [The key here is that the context-determined declaration is empty because of the wildcard validation rule. Complicated details aside, it might seem natural that this is the outcome: lax wildcards are assessed laxly.]

Given our assumptions, the next goal is to figure out the validity value in the PSVI for the element "var". The rule for filling in this PSVI value is at [1]. As we assumed that the "var" element was laxly assessed, clearly it was not strictly assessed (and this we won't question later), so clause 1 does not apply.
Thus, clause 2 sets the value to "notKnown". (Note that XSV differs in its -r PSVI dump, and lists this as "invalid", so perhaps we've already made a mistake in our analysis.)

Next, let's move on to the "parent" element. The only thing that could keep its validation rule from succeeding is the wildcard (there are no ID/IDREF/Identity Constraints/Attributes/etc.), so we can look to the Wildcard Validation rule [2]. This wildcard accepts any namespace, thus the rule cannot fail. In fact, given the "any" nature of the wildcard, the wildcard rule merely sets the value of the "context-determined declaration", in this case to absent, but has no way to fail.

Next, we consider the assessment of the "parent" element as a whole [3]. 1.1.1.3.1 and 1.1.1.3.2 are true since {"","parent"} is declared in the schema. This makes 1.1.1.3 true, and thus 1.1.1 true. 1.1.2 and 1.1.3 are true because the Wildcard rule did not fail, as discussed just above from [2]. Thus 1.1 is true and clause 1 is true as well, which according to the definition just below it in the spec [4] means that the element has been strictly assessed.

Finally, let's evaluate the validity value of the "parent" element [1]. Since it was strictly assessed, we use clause 1. 1.1.1.1 is true, and we assume that 1.1.1 is meant as an OR of its sub-clauses, and thus 1.1.1 is true. 1.1.2 is true because, as we derived, the element "var" has validity "notKnown" and not "invalid". 1.1.3 is true because its only child, the "var" element, has a "context-determined declaration" which is absent (set in the "parent" element's wildcard validation rule, as mentioned) and therefore is not "mustFind". Thus 1.1 is true, and the "parent" element's validity is "valid".

Once again, XSV disagrees with us, listing the validity of (parent, var, x) as (invalid, invalid, invalid) respectively, where we seem to get (valid, notKnown, invalid) respectively.
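The chain of reasoning above can be written down as a small Python sketch. It models only the reading of [1] given in this message, not the spec itself and not XSV; the Element class, its field names, and the validity function are invented here for illustration, and "strictly assessed" and "locally valid" are supplied as inputs rather than computed from a schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for an element information item."""
    name: str
    strictly_assessed: bool          # outcome of [3], taken as given
    locally_valid: bool              # outcome of local validation, taken as given
    # None models an absent context-determined declaration.
    context_determined_declaration: Optional[str] = None
    children: List["Element"] = field(default_factory=list)


def validity(item: Element) -> str:
    """Validity per the message's reading of [1] (attributes ignored)."""
    if not item.strictly_assessed:
        return "notKnown"            # clause 2
    results = [(child, validity(child)) for child in item.children]
    # 1.1.2: no child whose validity is "invalid"
    no_invalid_child = all(v != "invalid" for _, v in results)
    # 1.1.3: no child that is "notKnown" while marked "mustFind"
    no_unmet_must_find = not any(
        v == "notKnown" and c.context_determined_declaration == "mustFind"
        for c, v in results
    )
    if item.locally_valid and no_invalid_child and no_unmet_must_find:
        return "valid"               # clause 1.1
    return "invalid"                 # clause 1.2


# The second instance from this message: <x>-123</x> inside <var>.
x = Element("x", strictly_assessed=True, locally_valid=False)
var = Element("var", strictly_assessed=False, locally_valid=True, children=[x])
parent = Element("parent", strictly_assessed=True, locally_valid=True,
                 children=[var])

result = tuple(validity(e) for e in (parent, var, x))
print(result)
```

Under these assumptions the sketch yields valid for "parent", notKnown for "var", and invalid for "x". The "notKnown" on "var" is what keeps the invalidity of "x" from reaching "parent", since 1.1.2 looks only at the immediate children.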
Postscript: I mentioned above that we would 'at the end' question our two assumptions. While it seems clear that the "var" element is not strictly assessed, as explained above, is it laxly assessed? The relevant quote from the spec seems to be the line just under [4], where the spec says:

  If the item cannot be "strictly assessed", because neither clause 1.1 nor clause 1.2 above are satisfied, [Definition:] an element information item's schema validity may be "laxly assessed" if its "context-determined declaration" is not "skip" by "validating" with respect to the "ur-type definition" as per "Element Locally Valid (Type) (3.3.4)"

The key word in this sentence for us is the "may" in "may be laxly assessed": it does not say "must". Thus, it seems that a schema-aware parser could choose not to laxly assess the "var" element at all. By our analysis above, this would not affect the validity of the "var" element, which would still be "notKnown": [1] gives the same result as long as the element was not strictly assessed, so laxly assessed and not assessed are the same as far as validity is concerned. It would also not affect the validity of the "parent" element, which is "valid" for the same reasons.

However, if the "var" element is not even laxly assessed, then the "x" element would not be recursively assessed either, and thus would not end up with validity "invalid". If we're reading this right, a processor may or may not identify "x" as invalid; either way this does not affect the validity of the "parent" element, so it's really a separate clarification question.

I think that's it for now. We're ready for you to point out the constraint that we're missing that avoids all these problems. Depending on your answer we might ask a certain follow-up question.
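The point of the postscript can be illustrated with a second, deliberately simplified Python sketch. It assumes the reading given in this message; the function name assess, its parameter, and the "not assessed" marker are hypothetical labels and not PSVI terms.

```python
def assess(laxly_assess_var: bool) -> dict:
    """Outcome for (parent, var, x) under this message's reading.

    laxly_assess_var models the processor's choice implied by the
    word "may" in the definition of lax assessment.
    """
    outcome = {}

    # "var" is never strictly assessed, so clause 2 of [1] applies
    # whether or not the processor laxly assesses it.
    outcome["var"] = "notKnown"

    if laxly_assess_var:
        # The processor recurses into "var", finds the declaration of
        # "x", and checks the value against xs:positiveInteger.
        value = int("-123")
        outcome["x"] = "valid" if value > 0 else "invalid"
    else:
        # The processor never reaches "x" (hypothetical marker).
        outcome["x"] = "not assessed"

    # "parent" is strictly assessed and locally valid; its only child
    # is "var", which is neither "invalid" nor marked "mustFind".
    outcome["parent"] = "invalid" if outcome["var"] == "invalid" else "valid"
    return outcome


with_lax = assess(laxly_assess_var=True)
without_lax = assess(laxly_assess_var=False)
print(with_lax)
print(without_lax)
```

In both runs "parent" comes out valid and "var" notKnown; only the entry for "x" differs, which is why the question of optional lax assessment is separate from the question about "parent".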
[1] http://www.w3.org/TR/xmlschema-1/#section-Element-Declaration-Information-Set-Contributions
[2] http://www.w3.org/TR/xmlschema-1/#cvc-wildcard
[3] http://www.w3.org/TR/xmlschema-1/#cvc-assess-elt
[4] http://www.w3.org/TR/xmlschema-1/#key-sva

ht@cogsci.ed.ac.uk (Henry S. Thompson)@w3.org on 11/19/2002 02:13:13 PM
Sent by: xmlschema-dev-request@w3.org
To: Bob Schloss/Watson/IBM@IBMUS
cc: "'xmlschema-dev@w3.org'" <xmlschema-dev@w3.org>
Subject: Re: variable element names

I think you've identified a bug in Noah's solution. The processContents attribute of wildcards is not inherited. So given

  <parent>
    <var>
      <x>123</x>
      <w><a/></w>
      <z/>
    </var>
  </parent>

and writing

  <xs:element name="parent">
    <xs:complexType>
      <xs:sequence>
        <xs:any processContents="strict"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

in its schema, we would have a requirement that a declaration for the _var_ element be available. There's no way to make that requirement inherited.

So what you actually want is

  <xs:element name="parent">
    <xs:complexType>
      <xs:sequence>
        <xs:any processContents="lax"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="x" type="..."/>
  <xs:element name="y" type="..."/>
  <xs:element name="z" type="..."/>

By using 'lax' you avoid the (undesirable for your example) requirement for a declaration for 'var', but because lax validation _is_ recursive, and declarations for x, y and z _are_ present, requires that they conform to those declarations.

ht
-- 
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
W3C Fellow 1999--2002, part-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Tuesday, 19 November 2002 18:21:31 UTC