[Bug 1406] New: Serialization of float and double via decimal problematic

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1406

           Summary: Serialization of float and double via decimal
                    problematic
           Product: XPath / XQuery / XSLT
           Version: Last Call drafts
          Platform: PC
        OS/Version: Windows 2000
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Functions and Operators
        AssignedTo: ashok.malhotra@oracle.com
        ReportedBy: holstege@mathling.com
         QAContact: public-qt-comments@w3.org


Serialization of float and double values in the range 0.000001 <= x < 1000000
is defined by first casting the value to decimal and then following the
serialization rules for decimal. [Ref: FO 17.1.2 Casting to xs:string and
xdt:untypedAtomic]

The problem with this is that decimal is only guaranteed to support 18 digits,
so serialization of some float and double values can result in numbers that
cannot be supported by most implementations, resulting in the error
"err:FOCA0001, input value too large for decimal".
[Ref: FO 17.1.3.3 Casting to xs:decimal]
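
To make the digit-count problem concrete, here is a minimal Python sketch. It
assumes that the cast to decimal must preserve the value exactly and that an
implementation supports only the minimum of 18 digits. The helper names
(exact_digits, fits_minimal_decimal) are hypothetical and are not taken from
any specification.

```python
from decimal import Decimal

# Assumption: an implementation that supports only the minimum digit count.
MAX_DECIMAL_DIGITS = 18


def exact_digits(x: float) -> int:
    """Count the significant digits in the exact decimal expansion of a double."""
    # Decimal(float) converts the binary value exactly, with no rounding.
    return len(Decimal(x).as_tuple().digits)


def fits_minimal_decimal(x: float) -> bool:
    """True if x can be cast exactly to a decimal of at most 18 digits."""
    return exact_digits(x) <= MAX_DECIMAL_DIGITS


# All three values lie in the range .000001 <= x < 1000000.
for value in (0.5, 0.1, 123456.789):
    print(value, exact_digits(value), fits_minimal_decimal(value))
```

A value such as 0.5 is a small power of two and fits easily, while an
everyday value such as 0.1 does not, because its nearest double has a long
exact decimal expansion.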

It seems like a serious misfeature to cause a query to fail in serializing a
value which one has carefully chosen to have a sufficient number of bits.
Worse, since validation may be defined in terms of serialization
[Ref: QLang 3.13 Validate Expressions and DM 4 Infoset Mapping], validation can
cause a query to fail on data that would actually validate.

Possible solutions:
(a) Forget the pretty printing. If you want pretty numbers, provide
    a number formatting function, or use strings.  If it is necessary
    for XPath compatibility, only do it in XPath 1.0 compatibility mode.
    Thus the serialization of a float or double value would always be
    in scientific notation.
(b) Disallow serialization as the basis for validation. This does not solve
    the general problem with serializations of float and double values, but 
    at least guarantees that an instance valid against a schema can be validated 
    against that same schema successfully, which strikes us as a fairly
    fundamental invariant.
(c) Define a value-preserving non-cosmetic variant of serialization and specify
    that serialization of a data model for the purposes of constructing an
    infoset for validation must use this value-preserving variant.
(d) Require more than 18 digits of support for decimal, so that coercion to
    decimal from double is guaranteed not to lose information.
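
As an illustration of what options (a) and (c) might look like for double, the
Python sketch below always serializes in scientific notation with 17
significant digits, which is enough to round-trip any finite IEEE 754 double.
The function name serialize_double is hypothetical, and the exact output
format is an assumption rather than proposed spec text. Special values (INF,
NaN) are left out.

```python
import math


def serialize_double(x: float) -> str:
    """Serialize a finite double in scientific notation, preserving its value."""
    if not math.isfinite(x):
        raise ValueError("this sketch handles finite values only")
    # 1 digit before the point + 16 after = 17 significant digits.
    return format(x, ".16E")


for value in (0.1, 123456.789, 1e-6, 999999.9999999999):
    text = serialize_double(value)
    # Parsing the serialized form yields the original value again.
    assert float(text) == value
    print(text)
```

No intermediate decimal value is needed here, so the 18-digit limit never
comes into play. A float would need 9 significant digits rather than 17.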

(On behalf of Schema WG)

Received on Friday, 13 May 2005 17:23:28 UTC