Re: Lexical space for unsignedXXX types

On Fri, 2005-06-10 at 07:48, Sandy Gao wrote:
> (Applies to both 1.0 rec and 1.1 drafts.)
> 
> Sections 3.4.21~24.1 indicate that the lexical spaces for the unsigned
> types are "a finite-length sequence of decimal digits (#x30-#x39)",
> which means that signs are not allowed. That is, neither "-0" nor
> "+123" is valid.
> 
> But the Schema for Schemas says that the unsigned types are derived
> from their base types by simply specifying a lower/upper bound, which
> has no impact on the sign in the lexical space.
> 
> Which interpretation is correct?

This came up at the XSL/XMLQuery meetings last week,
and I created a simple test document and schema
(attached), which was run against all the schema
processors I could conveniently find.  The upshot
with respect to leading + in the unsigned* types is:

  xsv allows "+0123" as unsignedInt
  Xerces J allows it
  Xerces C allows it
  MSV allows it
  The Oracle validator allows it

  Saxon 8 flags it as an error

Michael Rys ran it against the SQL Server implementation
of XML Schema, but from his report I'm not certain
whether SQL Server allows it or disallows it.

I think there are two things we can do:  (1) treat
the absence of a pattern facet as an error in the
schema for schemas (since it deviates from our goal
of eliminating magic from all built-in derivations
as far as possible), or (2) treat the prose description
as erroneous in failing to mention any possible sign.
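
To make the difference between the two readings concrete, here
is a rough Python sketch.  It is only an illustration, not taken
from any of the processors above: the function name and the
allow_sign switch are made up for this message, and whitespace
handling is ignored.  Reading (1) takes the prose at its word
(digits only); reading (2) keeps the lexical space of integer,
with its optional sign, and lets the bounds do the rest.

```python
import re

# Upper bound of unsignedInt (2^32 - 1); the lower bound is 0.
UNSIGNED_INT_MAX = 4294967295

# Reading (1): the prose description, decimal digits only, no sign.
DIGITS_ONLY = re.compile(r"[0-9]+")

# Reading (2): lexical space inherited from integer (optional sign);
# only the value bounds restrict the type.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def valid_unsigned_int(lexical, allow_sign):
    """Hypothetical check of a literal against unsignedInt.

    allow_sign=False models reading (1), allow_sign=True reading (2).
    """
    pattern = INTEGER_LEXICAL if allow_sign else DIGITS_ONLY
    if pattern.fullmatch(lexical) is None:
        return False
    # int() accepts a leading sign and leading zeros.
    return 0 <= int(lexical) <= UNSIGNED_INT_MAX
```

Under reading (2), "+0123" and "-0" both pass and "-1" fails on
the lower bound; under reading (1), "+0123" and "-0" fail on the
lexical check alone, before the value is even looked at.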

If anyone has evidence (preferably documentary, but
recollections of intent may be the best we can do)
bearing on what was intended, I'd be interested to
see it.

If tests on other processors show the same pattern
as above (i.e. if the large majority of processors
allow leading signs), that probably counts as an
argument in favor of (2).  Either way, we should
prepare an erratum for XML Schema 1.0 and ensure
that 1.1 aligns with the corrected text.

-CMSMcQ

Received on Tuesday, 26 July 2005 13:48:10 UTC