RE: [w3c sml] [4775] Change "skip" to "lax" processing

I support Sandy's fully articulated position, namely, that only DataType
should have processContents="skip" for its content, and all other
extension points (content and attributes) should have "lax" processing.

 

Pratul, I don't think the parallel between not worrying about the
validity (or invalidity) of the documents and not worrying about the
validity of the extension points is a good one.  Presumably extension
points are included for a reason, and that reason probably has something
to do with the intended processing of the SML-IF document.  Sandy's
point that processors should enforce the SML-IF structure does not imply
what you seem to be saying, that that is all an SML-IF processor should
do; note Sandy's parenthetical comment.  IMO, those processors that are
capable of understanding the extension points should be able to validate
them.  Given the ubiquity of processContents="lax" in all other industry
standards, I don't think it is much of a concern that all consumers that
encounter extension points are forced to try to validate them.  (As I
understand it, if the schema can't be located, then "lax" becomes in
effect a "skip", i.e., the extension point is not declared to be
invalid.)
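
To make the parenthetical concrete, here is a rough sketch of an
instance carrying one extension element.  The namespaces (urn:example:...)
and the ext:audit element are made up for illustration and are not taken
from the SML-IF drafts; the sketch also assumes the extension point is a
wildcard that admits elements from other namespaces.

```xml
<!-- Hypothetical instance; all names and namespaces are placeholders. -->
<model xmlns="urn:example:sml-if"
       xmlns:ext="urn:example:extension">
  <!-- Matched by an extension wildcard.
       processContents="skip": never assessed.
       processContents="lax":  validated only if the processor has a
                               declaration for ext:audit; otherwise it is
                               left unassessed and is not reported invalid. -->
  <ext:audit checkedBy="tool-a">passed</ext:audit>
</model>
```

So a consumer with no schema for urn:example:extension sees the same
outcome under either setting; the two differ only for consumers that do
have one.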

 

Just trying to keep the discussion going.  We seem to have
crystallized on two positions:

1.	Sandy's position as articulated below (and in the first sentence
above), which follows apparent industry "best practices", but with
sensitivity to the special needs of SML-IF
2.	The "pure performance option": maximize performance by skipping
validation on every extension point. 

 

Kirk Wilson, Ph.D.
Research Staff Member

CA Labs

603 823-7146

 

________________________________

From: public-sml-request@w3.org [mailto:public-sml-request@w3.org] On
Behalf Of Pratul Dublish
Sent: Wednesday, September 05, 2007 5:21 PM
To: Sandy Gao; John Arwe
Cc: public-sml@w3.org
Subject: RE: [w3c sml] [4775] Change "skip" to "lax" processing

 

I agree with Sandy that SML IF schema should enforce the SML IF
structure and not worry about the validity (or invalidity) of the
documents contained in an SML IF document.  IMO, the same logic should
be applied to the extension points since the extension points are
provided for extensibility but are irrelevant to the structure of the
SML IF document.  My understanding of processContents="lax" is that
processors will attempt to find the schema and, if successful,  perform
validation. Therefore, lax validation on extension points will require
all consumers (or more precisely the XML Schema processor used by
consumers) to attempt to locate the schema for extension points and
validate them. In fact, a producer who uses extension points can force
all consumers to validate them by including the extension point schemas
in the IF document.  So, we should retain skip processing for the
extension points in SML IF schema.

 

From: public-sml-request@w3.org [mailto:public-sml-request@w3.org] On
Behalf Of Sandy Gao
Sent: Monday, August 27, 2007 6:47 AM
To: John Arwe
Cc: public-sml@w3.org
Subject: Re: [w3c sml] [4775] Change "skip" to "lax" processing

 


I agree with Kirk that this is (partially) a performance issue, because
"lax" allows/requires (1.0/1.1) the processor to try to assess the
entire subtree, whereas "skip" says "do nothing". 

But there is a deeper issue here: what do we want the SML-IF schema to
enforce? I think the answer should be to make sure the document satisfies
the SML-IF *structure* (and any additional contracts/extensions between
processors). That is, if a document being transmitted is invalid, it
should *not* be a violation of the SML-IF schema. The IF is OK in this
case. (Just like "The Moon is bigger than the Sun" is OK English-wise.) 

What this means is that whether it should be lax or skip depends on what
the wildcard is supposed to match: 

- If it's for *extension* points (so that additional information can be
attached to the SML-IF instance, to be interpreted by processors who
understand it), then "lax" should be used, in case the processor has a
schema that can provide components to validate the matching
elements/attributes. 

- If it's a place-holder for the document being transmitted, then "skip"
should be used, so that we don't let the validity of an individual
document affect the overall IF validity. 

Based on this, it seems that only "DataType" needs a "skip" wildcard for
its content (not attribute), and all the others should be "lax". 

BTW, why did "DataType" have, as its content: 

      <xs:any namespace="##other" processContents="skip" minOccurs="0"
maxOccurs="unbounded"/> 

I would think it should be

      <xs:any processContents="skip"/> 
      <xs:any namespace="##other" processContents="lax" minOccurs="0"
maxOccurs="unbounded"/> 

That is, we expect the first element to be the document being
transmitted, which can have any namespace (including that for SML-IF).
This one is "skip" because we don't care about its validity. This
element must appear once and only once. Then there are any numbers of
additional elements that can be used for extension purposes, hence
"##other" and "lax". 

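Put together as a complete type, the proposal above might look roughly
like the sketch below.  The enclosing xs:schema, the placeholder target
namespace and the xs:sequence wrapper are assumptions added so that the
fragment stands alone; only the two wildcards come from the proposal
itself.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:sml-if"
           elementFormDefault="qualified">
  <!-- Sketch only: the target namespace is a placeholder. -->
  <xs:complexType name="DataType">
    <xs:sequence>
      <!-- The transmitted document: exactly one element (minOccurs and
           maxOccurs default to 1), any namespace (namespace defaults to
           ##any), never assessed. -->
      <xs:any processContents="skip"/>
      <!-- Extension elements: zero or more, from namespaces other than
           the target namespace, validated only when a declaration is
           available. -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

Because the first wildcard is mandatory and occurs exactly once, the
first child always matches it and every later child can only match the
second wildcard, so the two particles should not compete for the same
element.
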
Thanks,
Sandy Gao
XML Technologies, IBM Canada
Editor, W3C XML Schema WG <http://www.w3.org/XML/Schema/> 
Member, W3C SML WG <http://www.w3.org/XML/SML/> 
(1-905) 413-3255 T/L 969-3255



John Arwe <johnarwe@us.ibm.com> 
Sent by: public-sml-request@w3.org
Sent: 2007-08-24 12:01 PM
To: <public-sml@w3.org>
Subject: Re: [w3c sml] [4775] Change "skip" to "lax" processing

 

 

 





yeah, what he said (+1 from me) 

I never understood why we would prevent a validator from using schema
components it could locate (skip), as long as they are not required
(lax). 

Best Regards, John

Street address: 2455 South Road, Poughkeepsie, NY USA 12601
Voice: 1+845-435-9470      Fax: 1+845-432-9787 

"Wilson, Kirk D" <Kirk.Wilson@ca.com> 
Sent by: public-sml-request@w3.org
Sent: 08/24/2007 10:49 AM
To: <public-sml@w3.org>
Subject: [w3c sml] [4775] Change "skip" to "lax" processing

 

 

 





All, 
 
This email will serve to initiate the discussion of whether we should
specify processContents="lax" rather than the current "skip" for
wildcards in the SML XML Schemas. 
 
There is only one occurrence of processContents="skip" in the SML
specification: for the content of the smlerr:errorDataType. 
 
The original issue was raised with respect to SML-IF, in which
processContents="skip" is used for all extension elements (both xs:any
and xs:anyAttribute) in the type definitions of this specification. 
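 
For readers less familiar with the syntax, the change under discussion
would look roughly like this on a single type.  ExampleType, the target
namespace and the ##other namespace constraint are assumptions for
illustration, not copied from the SML-IF schema.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:sml-if"
           elementFormDefault="qualified">
  <!-- Hypothetical type with one element and one attribute extension
       point. The current schemas would carry processContents="skip" on
       both wildcards; the proposal is to use "lax" instead. -->
  <xs:complexType name="ExampleType">
    <xs:sequence>
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute namespace="##other" processContents="lax"/>
  </xs:complexType>
</xs:schema>
```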
 
Since I wasn't involved in the original authorship of the spec, I'm not
sure what the rationale was for the original use of "skip".  I assume it
was for efficiency of the SML-IF consumer, the assumption being that the
SML-IF consumer would need to concern itself only with sml elements
according to the semantics specified in SML-IF.  In my notes I have found the
following definition of an SML-IF consumer: "processes SML-IF documents
in whole or in part by the semantics of this specification" (emphasis
added).  Since, by definition, extension elements lie beyond the
semantics of the spec, there appears to be no reason for the processor's
attempting to validate the extension elements.  But I would consider
this a poor argument. 
 
"Skip" seems too finalistic and may not meet the requirements of SML-IF
consumer creators and SML-IF document authors who need to build in
special information, e.g., into the ModelType, and can provide the schema
for validation (assessment).  I suspect that something like this
rationale underlies what appears to be the industry "best practice" of
using "lax" processing.   The cost of using lax processing is
undoubtedly absolutely minimal. 
 
I will recommend changing the spec to "lax", according to what is
industry best practice.   
 
Kirk Wilson, Ph.D.
CA Inc.
Research Staff Member, CA Labs
Intellectual Property and Standards
Council of Technical Excellence
W3C Advisory Committee Representative 
Tele: + 1 603 823-7146
Fax:   + 1 603 823-7148
<mailto:kirk.wilson@ca.com> 
  

Received on Thursday, 6 September 2007 13:19:36 UTC