
Re: whitespace facet constrains value space of integer???

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: 15 Sep 2003 09:30:11 -0600
To: Dan Connolly <connolly@w3.org>
Cc: www-xml-schema-comments@w3.org
Message-Id: <1063639810.4470.135.camel@localhost>

On Mon, 2003-09-15 at 07:34, Dan Connolly wrote:
> "integer has the following ·constraining facets·:
> ...
> whiteSpace
> ..."
>   -- http://www.w3.org/TR/xmlschema-2/#integer
> "[Definition:]  A constraining facet is an optional property that can be
> applied to a datatype to constrain its ·value space·."
> -- http://www.w3.org/TR/xmlschema-2/#dt-constraining-facet

Oops.  That definition is at best misleading.  Some
constraining facets constrain the value space directly
(e.g. maxInclusive) and some constrain it indirectly,
as a consequence of the constraint they place directly 
on the lexical space (e.g. pattern).  
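
A minimal sketch of the two kinds of constraint, for illustration only.
The helper names are hypothetical, and Python's re module stands in for
the XML Schema regex dialect (the two agree for a simple pattern like
the one used here). maxInclusive is checked against the value; pattern
is checked against the lexical form and so removes values only as a
side effect.

```python
import re


def satisfies_max_inclusive(value: int, bound: int) -> bool:
    # Direct constraint: compares the value itself against the bound.
    return value <= bound


def satisfies_pattern(lexical: str, pattern: str) -> bool:
    # Constraint on the lexical space: the whole literal must match.
    return re.fullmatch(pattern, lexical) is not None


candidates = ["7", "+7", "42", "-3"]

# maxInclusive=10 looks only at values, so "7" and "+7" both pass.
by_value = [s for s in candidates if satisfies_max_inclusive(int(s), 10)]

# pattern="[0-9]" looks only at literals; "+7" fails even though its
# value (7) is still reachable through the literal "7".
by_pattern = [s for s in candidates if satisfies_pattern(s, "[0-9]")]

# The value space shrinks only because no literal for 42 or -3 survives.
surviving_values = {int(s) for s in by_pattern}
```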

The whitespace facet is slightly different from other facets 
in that strictly speaking it describes what whitespace
normalization is done during the process of getting from
whatever data is presented at the higher level (in the
usual case: the attribute value in the XML document or
infoset) to the lexical form.  It constrains the value
space and lexical space only indirectly (if at all).

I think we may want to class this as a problem requiring
a clarification.

> So the whitespace facet constrains the *value space*
> of integer somehow?

No.  The whitespace facet won't have any effect on the
value space of integer.  It can have an effect on the
value space of types derived from string (e.g. by guaranteeing
that the lexical space contains no strings which contain
any whitespace characters other than blank).  But when
viewed as a constraint on the value space of integer, it's
a vacuous constraint.
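
As a rough illustration of the string case, assuming a type whose
whiteSpace facet is 'replace' (the function name is hypothetical): after
normalization no tab, line feed, or carriage return can remain, so no
string containing one can be the resulting lexical form.

```python
import re


def replace_ws(raw: str) -> str:
    # whiteSpace=replace: each tab, line feed, and carriage return
    # becomes a single space; nothing is trimmed or squeezed.
    return re.sub(r"[\t\n\r]", " ", raw)


normalized = replace_ws("a\tb\nc")
```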

> Please clarify. The RDF Core WG is having a heck
> of a time figuring out whether " 3 " is in the
> lexical space of the integer datatype.
> (This comment is not sent on their behalf, however.)

Ah, well that's simple.  The lexical space of integer
is described thus:

   integer has a lexical representation consisting of 
   a finite-length sequence of decimal digits (#x30-#x39) 
   with an optional leading sign. 

No blanks are mentioned, and no blanks appear in any
lexical form.  The whitespace facet for integer has the value
'collapse' (inherited from decimal), so that if I give 
"  3 " as the value of an attribute declared as having type
xsd:integer, it will be valid.  The XML document contains
"  3 ", the infoset has "  3 " (unless the document has a
DTD which causes normalization to "3"), the lexical form
(after applying the whitespace=collapse rule) is "3", 
and the value, of course, is succ(succ(succ(0))).

As we pointed out in one or the other of our comments to the
RDF group, whitespace normalization is NOT part of type checking
and should be provided for, if appropriate, by the higher-level
system which uses the XSD types.  In our case, the Structures
spec includes the rule that says "do whitespace processing
before type validation"; if an RDF processor should do 
whitespace normalization, the RDF specs need to say so.
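
The steps above can be sketched as follows. This is an illustrative
sketch, not the validation algorithm of any particular processor: the
function names and the 'normalize' switch are assumptions, with the
switch standing in for whether the higher-level system does whitespace
processing before type validation.

```python
import re

# Lexical space of integer as quoted above: an optional leading sign
# followed by decimal digits (#x30-#x39).
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def collapse(raw: str) -> str:
    # whiteSpace=collapse: turn tab/LF/CR into spaces, squeeze runs of
    # spaces to one, then strip leading and trailing spaces.
    replaced = re.sub(r"[\t\n\r]", " ", raw)
    return re.sub(r" +", " ", replaced).strip(" ")


def in_integer_lexical_space(literal: str) -> bool:
    # Type checking proper: no blanks are allowed in a lexical form.
    return _INTEGER_LEXICAL.fullmatch(literal) is not None


def integer_value(raw: str, normalize: bool) -> int:
    # 'normalize' models the higher-level rule "do whitespace
    # processing before type validation".
    lexical = collapse(raw) if normalize else raw
    if not in_integer_lexical_space(lexical):
        raise ValueError(f"{lexical!r} is not a lexical form of integer")
    return int(lexical)
```

With normalize=True the input "  3 " yields the lexical form "3" and
the value 3; with normalize=False the same input is rejected, because
"  3 " itself is not in the lexical space.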

The specific comment we made is at

If, as you say, the RDF Core WG is having trouble understanding
whether "  3 " is a legal lexical form for integer, I'd be grateful
if you could point them at that comment.

Received on Monday, 15 September 2003 11:30:35 UTC
