- From: Sandy Gao <sandygao@ca.ibm.com>
- Date: Mon, 27 Aug 2007 09:46:40 -0400
- To: John Arwe <johnarwe@us.ibm.com>
- Cc: public-sml@w3.org
- Message-ID: <OF451A04DC.09781455-ON85257344.00484A95-85257344.004BAEDD@ca.ibm.com>
I agree with Kirk that this is (partially) a performance issue, because "lax" allows/requires (1.0/1.1) the processor to try to assess the entire subtree, whereas "skip" says "do nothing".

But there is a deeper issue here: what do we want the SML-IF schema to enforce? I think the answer should be to make sure the document satisfies the SML-IF *structure* (and any additional contracts/extensions between processors). That is, if a document being transmitted is invalid, it should *not* be a violation of the SML-IF schema. The IF is OK in this case. (Just like "The Moon is bigger than the Sun" is OK English-wise.)

What this means is that whether it should be lax or skip depends on what the wildcard is supposed to match:

- If it's for *extension* points (so that additional information can be attached to the SML-IF instance, to be interpreted by processors that understand it), then "lax" should be used, in case the processor has a schema that can provide components to validate the matching elements/attributes.
- If it's a placeholder for the document being transmitted, then "skip" should be used, so that we don't let the validity of an individual document affect the overall IF validity.

Based on this, it seems that only "DataType" needs a "skip" wildcard for its content (not its attributes), and all the others should be "lax".

BTW, why did "DataType" have, as its content:

    <xs:any namespace="##other" processContents="skip" minOccurs="0" maxOccurs="unbounded"/>

I would think it should be:

    <xs:any processContents="skip"/>
    <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>

That is, we expect the first element to be the document being transmitted, which can have any namespace (including that for SML-IF). This one is "skip" because we don't care about its validity. This element must appear once and only once. Then there can be any number of additional elements that can be used for extension purposes, hence "##other" and "lax".
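To make the proposal concrete, here is a minimal sketch of how the two wildcards might sit inside a complete type definition. Only the two xs:any particles and the name "DataType" come from the discussion above; the enclosing xs:schema, the target namespace "urn:example:smlif-sketch", and the xs:anyAttribute line are assumptions added for illustration and are not taken from the SML-IF schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: the target namespace is hypothetical. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:smlif-sketch"
           elementFormDefault="qualified">

  <xs:complexType name="DataType">
    <xs:sequence>
      <!-- The transmitted document: exactly one element (minOccurs and
           maxOccurs default to 1), from any namespace (namespace defaults
           to ##any). "skip" means its validity is never assessed. -->
      <xs:any processContents="skip"/>

      <!-- Extension elements: zero or more, from namespaces other than
           the target namespace. "lax" means they are validated only when
           the processor can locate a matching declaration. -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>

    <!-- Assumed attribute wildcard, shown as "lax" in line with the
         "(not attribute)" remark above. -->
    <xs:anyAttribute namespace="##other" processContents="lax"/>
  </xs:complexType>

</xs:schema>
```

Because the first wildcard is required exactly once, the first child element always matches it and every later child can only match the second wildcard, so the two particles should not compete for the same element.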
Thanks,
Sandy Gao
XML Technologies, IBM Canada
Editor, W3C XML Schema WG
Member, W3C SML WG
(1-905) 413-3255 T/L 969-3255

John Arwe <johnarwe@us.ibm.com>
Sent by: public-sml-request@w3.org
2007-08-24 12:01 PM
To: <public-sml@w3.org>
cc:
Subject: Re: [w3c sml] [4775] Change "skip" to "lax" processing

yeah, what he said (+1 from me)

I never understood why we would prevent a validator from using schema components it could locate (skip), as long as they are not required (lax).

Best Regards, John
Street address: 2455 South Road, Poughkeepsie, NY USA 12601
Voice: 1+845-435-9470 Fax: 1+845-432-9787

"Wilson, Kirk D" <Kirk.Wilson@ca.com>
Sent by: public-sml-request@w3.org
08/24/2007 10:49 AM
To: <public-sml@w3.org>
cc:
Subject: [w3c sml] [4775] Change "skip" to "lax" processing

All,

This email will serve to initiate the discussion of whether we should specify processContents="lax" rather than the current "skip" for wildcards in the SML XML Schemas.

There is only one occurrence of processContents="skip" in the SML specification: for the content of the smlerr:errorDataType. The original issue was raised with respect to SML-IF, in which processContents="skip" is used for all extension elements (both xs:any and xs:anyAttribute) in the type definitions of this specification.

Since I wasn't involved in the original authorship of the spec, I'm not sure what the rationale was for the original use of "skip". I assume it was for efficiency of the SML-IF consumer, the assumption being that the SML-IF consumer would need to concern itself only with sml elements according to the semantics specified in SML-IF. In my notes I have found the following definition of an SML-IF consumer: "processes SML-IF documents in whole or in part by the semantics of this specification" (emphasis added). Since, by definition, extension elements lie beyond the semantics of the spec, there appears to be no reason for the processor's attempting to validate the extension elements.

But I would consider this a poor argument. "Skip" seems too finalistic and may not meet the requirements of SML-IF consumer creators and SML-IF document authors who need to build in special information, e.g., into the ModelType, and can provide the schema for validation (assessment). I suspect that something like this rationale underlies what appears to be the industry "best practice" of using "lax" processing. The cost of using lax processing is undoubtedly minimal.

I will recommend changing the spec to "lax", in line with industry best practice.

Kirk Wilson, Ph.D.
CA Inc.
Research Staff Member, CA Labs
Intellectual Property and Standards
Council of Technical Excellence
W3C Advisory Committee Representative
Tele: + 1 603 823-7146
Fax: + 1 603 823-7148
<mailto:kirk.wilson@ca.com>
Received on Monday, 27 August 2007 13:46:59 UTC