- From: <zongaro@ca.ibm.com>
- Date: Wed, 23 Jan 2002 16:58:04 -0500
- To: www-xml-schema-comments@w3.org
- Cc: cmsmcq@acm.org, davep@acm.org, ashokma@microsoft.com, Paul.V.Biron@kp.org
- Message-ID: <OF534C4F5B.66D85E07-ON85256B4A.00642AFD@torolab.ibm.com>
C.M. Sperberg-McQueen wrote:
[[
The largest finite float, if I understand the notes correctly, is
m * 2**e
where ** means exponentiation,
m is the largest number representable in the mantissa, and
e is the largest number representable as an exponent
Since we have
m = 2 ** 24 - 1
e = 127
it follows that
m * 2**e = (2**24) * (2**127) - 2**127
= 2**151 - 2**127
]]
A small correction here - the maximum value of e in that formula is
actually 104 (the minimum value is -149), so I believe the largest finite
float value is 2^128 - 2^104, which is approximately 3.4028x10^38. (I'll
refer to it as M below.)
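
As a quick check of that arithmetic, here is a minimal Python sketch. It
assumes the IEEE 754 single-precision layout (a 24-bit integer m, with e
at most 104); the names M_MAX, E_MAX, M_EXACT and FLT_MAX_BITS are mine and
do not come from the Recommendation.

```python
import struct

# Largest finite float written as m * 2**e with integer m
# (assumption: IEEE 754 binary32, 24-bit significand).
M_MAX = 2**24 - 1        # largest 24-bit significand
E_MAX = 104              # largest exponent when m is an integer
M_EXACT = M_MAX * 2**E_MAX

# Bit pattern of the largest finite binary32 value:
# sign 0, biased exponent 0xFE, fraction bits all ones.
FLT_MAX_BITS = 0x7F7FFFFF
flt_max = struct.unpack(">f", struct.pack(">I", FLT_MAX_BITS))[0]

print(M_EXACT == 2**128 - 2**104)   # True
print(int(flt_max) == M_EXACT)      # True
print(f"{flt_max:.4e}")             # 3.4028e+38
```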
<<Decimal representation snipped>>
[[
Some things are, unfortunately, not so clear to me:
(a) what the next largest float would be if we had one
(b) where the watershed point is between infinity and 2.85...E45
(c) whether the negative numbers are exactly the same as these
plus a minus sign, or divergent in some way
I don't believe there has been any confusion over the watershed
between zero and the smallest representable float.
Dave, if you can confirm that I have correctly interpreted your
notes, I'd be grateful. Ditto if anyone can shed light on questions
(a), (b), (c) above.
]]
Regarding question (a), I'm not sure if there is a sensible answer to
that. To have a next larger finite float value, you'd have to have either
more bits in the mantissa or more bits in the exponent - which you choose
determines what would be the next larger float value.
By "the watershed point" in question (b), I assume you mean the point at
which values in the lexical space map to infinity rather than to the
largest finite float value. I'm not sure whether there was any discussion
of this after May 2001 - or perhaps you're picking up the thread - but at
that point the question was unresolved. Section 3.2.4 of the Datatypes
recommendation [1] states (in part) the following.
[[
A literal in the ·lexical space· representing a decimal number d maps to the normalized value in the ·value space· of float that is closest to d in the sense defined by [Clinger, WD (1990)]; if d is exactly halfway between two such values then the even value is chosen.
]]
A literal reading of that text would have any lexical value
representing a decimal number greater than M map to M, because that's the
closest normalized value in the value space. So, for instance,
1.0E+100000 would map to M. I believe that behaviour would be contrary to
the expectations of most users.
Clinger [2] leaves the behaviour for overflow up to "the policies
that have been established for handling overflow and underflow within the
particular floating point number system in question." I think the most
reasonable behaviour would be that literals that represent decimal numbers
in the range [M,M+2^103) would map to the value M, and that literals that
represent decimal numbers greater than or equal to M+2^103 would map to
positive infinity.
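
To make the proposed cut-off concrete, here is a small sketch of that rule
using exact rational arithmetic. It is only an illustration of the policy
suggested above, not anything the Recommendation specifies; the helper
name map_large_literal and the constant THRESHOLD are hypothetical.

```python
from decimal import Decimal
from fractions import Fraction

M = 2**128 - 2**104        # largest finite float
THRESHOLD = M + 2**103     # halfway between M and 2**128

def map_large_literal(literal: str) -> float:
    """Apply the proposed overflow rule to a decimal literal with d >= M."""
    d = Fraction(Decimal(literal))   # exact value of the literal
    if d < M:
        raise ValueError("below M; ordinary rounding applies")
    # [M, M + 2**103) maps to M; anything at or above the threshold
    # maps to positive infinity.
    return float("inf") if d >= THRESHOLD else float(M)

print(map_large_literal("3.4028235E38") == float(M))        # True
print(map_large_literal("3.4028236E38") == float("inf"))    # True
print(map_large_literal("1.0E+100000") == float("inf"))     # True
```

Under this rule the literal 3.4028235E38, which is slightly greater than M,
still maps to M, while 3.4028236E38 is already past the halfway point.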
I just noticed another problem with the specification of how lexical
values map to float values in the value space. The text cited above
assumes that, of two consecutive normalized values in the value space, one
will be even and the other will be odd. Consider the following decimal
values: 16777216, 16777217 and 16777218. The first is 2^24, the second is 2^24+1,
the third is 2^24+2. However, only 2^24 (m=2^23, e=1) and 2^24+2
(m=2^23+1, e=1) are values in the value space of float. The value
16777217 is exactly halfway between the other two values, but both those
values are even. To which value in the value space does 16777217 map?
Indeed, all finite values in the value space that are greater than 2^24
are even!
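
For comparison, the following sketch shows what a typical implementation
does with these integers. It assumes a platform whose double-to-float
conversion rounds to nearest with ties going to the even significand (the
IEEE 754 default); the helper name to_float32 is mine.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (binary64) to binary32 and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

for d in (16777216, 16777217, 16777218, 16777219):
    nearest = int(to_float32(float(d)))
    print(d, "->", nearest, "(exact)" if nearest == d else "(rounded)")

# Under the stated assumption this prints:
# 16777216 -> 16777216 (exact)
# 16777217 -> 16777216 (rounded)
# 16777218 -> 16777218 (exact)
# 16777219 -> 16777220 (rounded)
```

Both ties are resolved by the parity of the significand, not by the parity
of the value itself: 16777217 goes down to m=2^23, 16777219 goes up to
m=2^23+2.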
Here are some other consecutive values in the value space to consider:
4194303, 4194303.25, 4194303.5, 4194303.75. Which of each consecutive
pair could be considered to be even?
It could be that it was intended that such a decimal value map to the
float value whose value of "m" is even (in the formula m*2^e). If that's
the case, the definition of m and e will need to be made more precise,
because many values can be expressed in more than one way: for example,
the value 4 could be expressed as having m=4 and e=0 (4x2^0), or m=2 and
e=1 (2x2^1).
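
One way the definition might be pinned down is to require a normalized m,
that is 2^23 <= m < 2^24, which makes the pair (m, e) unique for every
normalized value. The sketch below is an assumption of mine about how that
could look, not text from the Recommendation; normalized_m_e is a
hypothetical helper and it ignores the limits on e.

```python
from fractions import Fraction

def normalized_m_e(value):
    """Return (m, e) with value == m * 2**e and 2**23 <= m < 2**24."""
    v = Fraction(value)
    if v <= 0:
        raise ValueError("only positive values are handled in this sketch")
    e = 0
    while v >= 2**24:      # scale down until m fits in 24 bits
        v /= 2
        e += 1
    while v < 2**23:       # scale up until the top bit of m is set
        v *= 2
        e -= 1
    if v.denominator != 1:
        raise ValueError("not representable with a 24-bit significand")
    return int(v), e

print(normalized_m_e(4))             # (8388608, -21)
for x in (4194303, 4194303.25, 4194303.5, 4194303.75):
    print(x, normalized_m_e(x))
# 4194303 (16777212, -2)
# 4194303.25 (16777213, -2)
# 4194303.5 (16777214, -2)
# 4194303.75 (16777215, -2)
```

With m normalized this way, consecutive values in the value space always
alternate between even and odd m, so "the even value" becomes well defined
even where the values themselves are all even or are not integers.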
Thanks,
Henry
[1] http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#float
[2] ftp://ftp.ccs.neu.edu/pub/people/will/howtoread.ps
------------------------------------------------------------------
Henry Zongaro XML Parsers development
IBM SWS Toronto Lab Tie Line 969-6044; Phone (905) 413-6044
mailto:zongaro@ca.ibm.com
Received on Wednesday, 23 January 2002 16:58:48 UTC