LC-7. cotton-on-decimal: Arbitrary-precision decimal too much?

The W3C XML Schema Working Group has spent the last several months
working through the comments received from the public on the last-call
draft of the XML Schema specification.  We thank you for the comments
you made on our specification during the last-call comment period, and
want you to know that all comments received during that period have
been recorded in our last-call issues list
(http://www.w3.org/2000/05/12-xmlschema-lcissues).

Among other issues, the Query Working Group raised the following issue:

 > Is the requirement that XML Schema implementations must support
 > arbitrary-precision decimals an excessive burden on implementors of
 > a query language? Should XML Schema instead specify that the maximum
 > precision for decimal numbers should be an "implementation-defined
 > number not less than X", with the value of X to be determined?

The Schema Working Group proposes the following resolution:

<resolution>
1. The XML Schema spec should set down the minimum number of digits that 
must be supported by a conforming XML processor for the numeric datatypes 
integer, decimal, float and double.
2. XML processors are free to support more than the minimum number of 
digits. If they do so, they should advertise this fact as part of their 
specifications.
3. The minimum number of digits required for (1) should be derived from the 
number of digits supported by some standard programming languages such as C 
and Java. These are discussed below. In earlier notes I had proposed that 
the precisions be based on the number of digits supported by 32-bit 
processors, but I realized that languages often use multiple words to store 
numeric values.

Also, as most processors will translate values encoded in XML documents 
into values in some programming language, it seems more sensible to base 
precisions on those supported by common programming languages.
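
As an illustration of points (1)-(3), here is a minimal sketch, in Java, of 
how a processor might translate a lexical integer value from an XML document 
into a native programming-language value, falling back to arbitrary 
precision when the value has more digits than the fixed-width type can hold. 
The class and method names are my own and purely illustrative; nothing here 
is taken from the schema spec.

import java.math.BigDecimal;

// Hypothetical illustration only: translate a lexical integer value from an
// XML document into a native Java value, as a processor mapping XML values
// onto programming-language values might do.
public class IntegerMapping {

    // Returns a Long when the value fits (long covers 18 decimal digits in
    // full); otherwise falls back to an arbitrary-precision BigDecimal.
    static Number mapIntegerValue(String lexical) {
        try {
            return Long.parseLong(lexical);
        } catch (NumberFormatException tooManyDigits) {
            return new BigDecimal(lexical);
        }
    }

    public static void main(String[] args) {
        System.out.println(mapIntegerValue("123456789012345678"));   // 18 digits: fits in long
        System.out.println(mapIntegerValue("12345678901234567890")); // 20 digits: BigDecimal
    }
}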

SUGGESTED PRECISIONS

The Java Language Specification (Gosling, Joy, Steele) says
·       For int, from -2147483648 to 2147483647, i.e. 9 significant digits
·       For long, from -9223372036854775808 to 9223372036854775807, i.e. 18 
significant digits
·       Decimal values can have the same range, i.e. the same number of 
digits.
·       For float, the values are of the form +/- m*2**e where m is a 
positive integer less than 2**24, i.e. 16777216, and e is an integer between 
-149 and 104. This yields 7 significant digits for the mantissa and 2 digits 
for the exponent. For double, m is a positive integer less than 2**53, i.e. 
9007199254740992, and e is an integer between -1074 and 971. This yields 15 
significant digits for the mantissa and 3 digits for the exponent.
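
These figures are easy to check; the short Java program below is my own 
illustration (not taken from the JLS) that prints the integer ranges and 
shows that a 7-digit integer survives a round trip through float while 
16777217 (= 2**24 + 1) does not, and likewise 15 versus 16 digits for double.

public class PrecisionFigures {

    // True if the value survives a round trip through float / double.
    static boolean exactInFloat(long n)  { return (long) (float) n == n; }
    static boolean exactInDouble(long n) { return (long) (double) n == n; }

    public static void main(String[] args) {
        System.out.println("int:  " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
        System.out.println("long: " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);

        // All 7-digit integers lie below 2**24 = 16777216, so float holds
        // them exactly; 16777217 is the first integer it cannot.
        System.out.println(exactInFloat(9999999L));             // true
        System.out.println(exactInFloat(16777217L));            // false

        // All 15-digit integers lie below 2**53 = 9007199254740992;
        // 2**53 + 1 is the first integer double cannot hold exactly.
        System.out.println(exactInDouble(999999999999999L));    // true
        System.out.println(exactInDouble(9007199254740993L));   // false
    }
}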

C allows compilers to choose how many digits to use for int and long. The 
limits.h header defines minimum required ranges for long that are consistent 
with the values for Java int above.
For floating-point numbers, float.h defines precisions that are 
significantly lower: for float, 6 digits of precision in the mantissa and a 
maximum of +/- 37 for the exponent; for double, 10 digits of precision and 
still +/- 37 as a maximum for the exponent.
I do not understand the lower precision for floating-point numbers in C. 
Perhaps this is because float.h also allows you to specify the radix for 
float and double, or merely that I used Kernighan and Ritchie and newer 
compilers allow more digits.
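
As an aside, digit counts of this kind are often derived from the width of 
the binary mantissa: a p-bit binary mantissa guarantees 
floor((p-1)*log10(2)) decimal digits through a decimal-to-binary-to-decimal 
round trip, which gives 6 for 24 bits and 15 for 53 bits. The small Java 
computation below is my own illustration of that formula, not anything taken 
from float.h or the Java specification.

// Illustration only: guaranteed round-trip decimal digits for a binary
// mantissa of p bits, computed as floor((p - 1) * log10(2)).
public class MantissaDigits {

    static int guaranteedDecimalDigits(int mantissaBits) {
        return (int) Math.floor((mantissaBits - 1) * Math.log10(2.0));
    }

    public static void main(String[] args) {
        System.out.println(guaranteedDecimalDigits(24)); // 6  (float's 24-bit mantissa)
        System.out.println(guaranteedDecimalDigits(53)); // 15 (double's 53-bit mantissa)
    }
}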

RECOMMENDATION

If we set a single minimum standard then, based on the Java figures above, I 
would recommend 18 digits for integers and decimals. For float and double I 
would recommend 15 digits for the mantissa and 3 digits for the exponent.
If these figures are felt to be too generous, we could go with a 2-tier 
system.
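
To make the 18-digit figure concrete, here is a minimal sketch, assuming a 
processor that maps decimal values onto Java's BigDecimal, of how a value 
could be checked against the proposed minimum. The class name and constant 
are illustrative only, not part of any proposed spec text.

import java.math.BigDecimal;

// Illustration of the recommendation only: check whether a decimal lexical
// value stays within the number of significant digits a processor supports
// (18 being the proposed minimum every processor would have to support).
public class DecimalLimitCheck {

    static final int SUPPORTED_DIGITS = 18; // proposed minimum

    static boolean withinSupportedDigits(String lexical) {
        // precision() is the number of significant decimal digits.
        return new BigDecimal(lexical).precision() <= SUPPORTED_DIGITS;
    }

    public static void main(String[] args) {
        System.out.println(withinSupportedDigits("123456789012345678"));    // true:  18 digits
        System.out.println(withinSupportedDigits("1234567890123456789"));   // false: 19 digits
        System.out.println(withinSupportedDigits("-0.000123456789012345")); // true:  15 significant digits
    }
}
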
</resolution>

It would be helpful to us to know whether you are satisfied with the
decision taken by the WG on this issue, or wish your dissent from the
WG's decision to be recorded for consideration by the Director of
the W3C.

Thanks!

Jonathan
