Re: XML Schema WG responses to RDF Core WG on xmlsch-02 from Brian McBride on 2003-10-07 (www-rdf-comments@w3.org from October to December 2003)

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: Tue, 07 Oct 2003 16:44:17 +0100
To: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>
Cc: www-rdf-comments@w3.org, W3C XML Schema IG <w3c-xml-schema-ig@w3.org>
Message-ID: <3F82DF51.2000605@hplb.hpl.hp.com>

Michael and colleagues,

Thank you again for your efforts and responses to be found in:

http://lists.w3.org/Archives/Public/www-rdf-comments/2003OctDec/0011.html

Since it appears that the XML Schema group have so far been unable to 
agree a response to RDFCore's disposition of xmlsch-02, it seems to me 
appropriate to record this in our issue disposition document as a "no 
response" with a link to this message.

I'm currently inclined to call this out as a question on which we might 
solicit specific feedback at last call, but since we have not had an 
objection, I don't feel the need to represent it as an objection.

However, if anyone feels strongly enough to send in an objection in the 
next very few days, I will then try to ensure that objection is 
included in the list of objections when we publish the last call WDs, 
which we hope to do very soon now.

I hope you will find this to be a reasonable and fair way of dealing 
with this issue, but please don't hesitate to raise any concerns you may 
have.

Brian



C. M. Sperberg-McQueen wrote:

[...]


> 
>       1.2. Whitespace handling (schema-related) [xmlsch-02
>       <http://www.w3.org/2001/sw/RDFCore/20030123-issues/>]
> 
> Some members of the XML Schema WG have expressed concern that XML 
> Schema's rules for whitespace handling may interfere with expected 
> behavior in other contexts. This may be the appropriate place to bring 
> this question up.
> 
> In brief, XML Schema's simple types each define a whitespace facet, 
> which governs the kind of whitespace pre-processing done by an XML 
> Schema processor before the lexical form is checked for type validity. 
> Since the point of whitespace normalization is to simplify subsequent 
> processing, the lexical spaces of XML Schema's simple types are (like 
> those in many programming languages) defined without reference to the 
> preceding whitespace normalization. Integers, for example, are 
> represented by sequences of decimal digits; sequences containing blanks 
> are not legal lexical forms for integers. Indeed, strictly speaking it 
> is only after the whitespace pre-processing is done that the XML Schema 
> processor can be said to be working with a lexical form at all.
> 
> For example, the integer type has a value of collapse for the whitespace 
> facet, which means leading and trailing whitespace is stripped, and 
> internal whitespace sequences are reduced to a single blank (x20) 
> character. In an XML document in which the element /exterms:age/ is 
> defined as having type /xs:integer/, the following instances of 
> /exterms:age/ will all be type-valid:
> 
> <exterms:age>27</exterms:age>
> <exterms:age>
>   27
> </exterms:age>
> <exterms:age>   27  </exterms:age>
> <exterms:age>   2<!--* ha, ha, fooled your full-text indexer!
> *-->7  </exterms:age>
> 
> The input information set, in each case, contains a character 
> information item for “2” followed by a character information item for 
> “7”, with character information items for whitespace characters, and a 
> comment information item, present in some of the examples. In all cases, 
> the lexical form proper is the character sequence “27” (i.e. the 
> sequence of characters after white space handling, and ignoring 
> comments, processing instructions, entity boundaries, and other 
> distractions). This is a legal lexical form for an integer, so all the 
> examples are type valid.
> 
> Some members of the XML Schema WG have worried that it may not be 
> obvious that the whitespace processing is not part of the process of 
> checking lexical forms for type validity, but part of the process of 
> extracting the lexical forms from the XML information set presented to 
> the processor. If an RDF document contains
> 
> <exterms:age>   27  </exterms:age>
> 
> and a processor hands the contents of the element to a generic 
> type-checker for XML Schema's simple types, saying in effect “this 
> purports to be the lexical form of an integer; is that OK?”, that type 
> checker will be required (if it conforms to the XML Schema spec's 
> definition of the simple types) to say “no, the character sequence 
> ‘   27  ’ is not a legal lexical form for an integer.”
> 
> It's not clear whether RDF, being type-system neutral, can directly 
> address this concern (e.g. by specifying that an RDF processor should do 
> the appropriate whitespace pre-processing, or by warning users that they 
> should not include vagrant whitespace in typed literals), or whether it 
> suffices for developers of RDF software with built-in support for XML 
> Schema's simple types to deal with it, e.g. by performing it themselves 
> before handing the resulting lexical form to a type checker.
> 
> As noted, some members of our WG feel that you need to be alerted to 
> this as a possible source of confusion and unexpected results. Other 
> members of the WG feel that it verges on disrespect to assume that you 
> need instruction on this point. We compromised by agreeing to point out 
> the issue to you, and to leave you to draw your own conclusions.
> 
> *Response from RDF:*
> 
>     The RDF Core WG resolved: xmlsch-02 addressed by msg-0097 where
>     msg-0097 is
>     http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0097.html
>     and says
>     PROPOSE RDF Core accepts the comment xmlsch-02 and agrees to add the
>     following test case:
> 
> <rdf:Description rdf:about="http://www.example.org/a">
>    <eg:prop rdf:datatype="&xsd;int">3</eg:prop>
> </rdf:Description>
> 
>     Does not entail
> 
> <rdf:Description rdf:about="http://www.example.org/a">
>    <eg:prop rdf:datatype="&xsd;int"> 3 </eg:prop>
> </rdf:Description>
> 
>     Moreover the following comment to be added to concepts:
> 
>         NOTE: In [XML Schema (part 1)], white space normalization occurs
>         during validation according to the value of the whiteSpace
>         facet. The lexical-to-value mapping used in RDF datatyping
>         occurs after this, so that the whiteSpace facet has no effect in
>         RDF datatyping. 
> 
>     In fact more test cases were desired, and the test cases created are
>     currently awaiting final WG approval and can be found in:
>     http://www.w3.org/2000/10/rdf-tests/rdfcore/xmlsch-02/
>     The Manifest file describes four tests showing that:
> 
>         * A well-formed typed literal is not related to an ill-formed
>           literal, even if they differ only by whitespace.
>         * A simple test for well-formedness of a typed literal.
>         * An integer with whitespace is ill-formed.
> 
>     The actual text corresponding to the agreed note is found at the end
>     of section 5:
>     http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-concepts-20030117/#section-Datatypes
>     A certain amount of editorial discretion was taken to consolidate
>     notes concerning your comments.
>     The full note from the editors draft is:
> 
>         Note: When the datatype is defined using XML Schema:
>         ...
>         + In [XML-SCHEMA1], white space normalization occurs during
>         validation according to the value of the whiteSpace facet. The
>         lexical-to-value mapping used in RDF datatyping occurs after
>         this, so that the whiteSpace facet has no effect in RDF datatyping.
> 
> *Response from XML Schema*
> 
>     Thank you for your reply.
>     The XML Schema Working Group is in agreement on one point of our
>     reply and divided in our opinion on a second point.
>     First, we are agreed that the position you sketch out is not a
>     source of logical inconsistency which will render your specification
>     meaningless or logically problematic. It is entirely possible for
>     you to handle whitespace in this way.
>     On the second point, our views are divided.
>     A minority of the Working Group believes that you have made a
>     reasonable design choice, given that RDF will only ever be produced
>     by and consumed by software, and that humans and issues of human
>     legibility are not and should not be matters of concern in your design.
>     A larger portion of the Working Group vigorously disagrees and
>     believes that for RDF processors to treat your two test cases
>     differently is to build into RDF a potential for astonishing users
>     and leading to unexpected results which will haunt you and your
>     users for years to come. In this view, it is not as a matter of
>     compatibility with XML Schema, but as a matter of common-sense
>     concern for your users that you should simply say that the
>     whitespace processing specified for the type in question should be
>     performed by any RDF processor.
>     *Overall*: we do not have consensus either to express satisfaction
>     with your resolution of this issue or to raise a formal dissent. In
>     the opinion of our chair, this means there is no formal dissent, but
>     he recommends that this point be listed during the review of formal
>     dissents as an issue on which there was not perfect consensus.
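
The distinction at stake in the quoted exchange can be illustrated with a
minimal Python sketch. It is an illustration only, not code from either
Working Group. The function names (collapse, is_integer_lexical_form,
check_after_collapse, check_without_preprocessing) are hypothetical, and
the integer pattern is a simplified assumption about the xs:integer lexical
space: an optional sign followed by decimal digits.

```python
import re

# Assumed simplification of the xs:integer lexical space:
# an optional sign followed by one or more ASCII decimal digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")

# The whitespace characters that the collapse setting acts on:
# space, tab, carriage return and line feed.
_WS_RUN = re.compile(r"[ \t\r\n]+")


def collapse(text: str) -> str:
    """Apply 'collapse' whitespace handling as described in the quoted text:
    reduce each whitespace run to one blank, then strip both ends."""
    return _WS_RUN.sub(" ", text).strip(" ")


def is_integer_lexical_form(candidate: str) -> bool:
    """Generic type check: is the string, exactly as given, a legal lexical form?"""
    return _INTEGER.fullmatch(candidate) is not None


def check_after_collapse(raw: str) -> bool:
    """Schema-processor style: pre-process whitespace, then check the form."""
    return is_integer_lexical_form(collapse(raw))


def check_without_preprocessing(raw: str) -> bool:
    """Style adopted in the RDF Core resolution: no whitespace pre-processing."""
    return is_integer_lexical_form(raw)


if __name__ == "__main__":
    for raw in ["27", "\n  27\n", "   27  "]:
        print(repr(raw), check_after_collapse(raw), check_without_preprocessing(raw))
```

Under these assumptions, "   27  " passes the first check and fails the
second, which corresponds to the test case in which " 3 " is treated as an
ill-formed typed literal.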
Received on Tuesday, 7 October 2003 12:39:57 UTC