Re: largest finite float from zongaro@ca.ibm.com on 2002-01-23 (www-xml-schema-comments@w3.org from January to March 2002)

From: <zongaro@ca.ibm.com>
Date: Wed, 23 Jan 2002 16:58:04 -0500
To: www-xml-schema-comments@w3.org
Cc: cmsmcq@acm.org, davep@acm.org, ashokma@microsoft.com, Paul.V.Biron@kp.org
Message-ID: <OF534C4F5B.66D85E07-ON85256B4A.00642AFD@torolab.ibm.com>

C.M. Sperberg-McQueen wrote:
[[
The largest finite float, if I understand the notes correctly, is

    m * 2**e

where ** means exponentiation,
        m is the largest number representable in the mantissa, and
        e is the largest number representable as an exponent

Since we have

   m = 2 ** 24 - 1
   e = 127

it follows that

   m * 2**e = (2**24) * (2**127) - 2**127
            = 2**151 - 2**127

]]

A small correction here - the maximum value of e in that formula is 
actually 104 (the minimum value is -149), so I believe the largest finite 
float value is 2^128 - 2^104, which is approximately 3.4028x10^38.  (I'll 
refer to it as M below.)
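The arithmetic above can be checked mechanically.  The following is a
minimal sketch, assuming Python 3 on a platform that uses IEEE 754
single precision for the struct "f" format; the names M_BITS, E_MIN,
E_MAX, largest and smallest are only illustrative and do not come from
the Recommendation.

```python
import math
import struct

# Single-precision parameters in the m * 2**e form used in this thread
# (m is an integer significand, so e ranges over -149..104).
M_BITS = 24
E_MIN, E_MAX = -149, 104

m_max = 2**M_BITS - 1
M = m_max * 2**E_MAX            # exact: Python integers are unbounded

# Cross-check against the bit patterns of the largest finite and the
# smallest positive single-precision values.
largest = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]
smallest = struct.unpack(">f", bytes.fromhex("00000001"))[0]

print(M == 2**128 - 2**104)                 # True
print(int(largest) == M)                    # True
print(smallest == math.ldexp(1.0, E_MIN))   # True
print(f"{largest:.5e}")                     # 3.40282e+38
```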

<<Decimal representation snipped>>

[[
Some things are, unfortunately, not so clear to me:

   (a) what the next largest float would be if we had one
   (b) where the watershed point is between infinity and 2.85...E45
   (c) whether the negative numbers are exactly the same as these
       plus a minus sign, or divergent in some way

I don't believe there has been any confusion over the watershed
between zero and the smallest representable float.

Dave, if you can confirm that I have correctly interpreted your
notes, I'd be grateful.  Ditto if anyone can shed light on questions
(a), (b), (c) above.
]]

     Regarding question (a), I'm not sure if there is a sensible answer to 
that.  To have a next larger finite float value, you'd have to have either 
more bits in the mantissa or more bits in the exponent - which you choose 
determines what would be the next larger float value.

     By "the watershed point", I assume you're asking at what point 
values in the lexical space map to infinity rather than to the largest 
finite float value.  I'm not sure whether there has been any discussion 
of this since May 2001 (or perhaps you're picking up that thread), but 
at that point the question was unresolved.  Section 3.2.4 of the 
Datatypes recommendation [1] states (in part) the following.

[[
A literal in the ·lexical space· representing a decimal number d maps to the normalized value in the ·value space· of float that is closest to d in the sense defined by [Clinger, WD (1990)]; if d is exactly halfway between two such values then the even value is chosen. 

]]

     A literal reading of that text would have any lexical value 
representing a decimal number greater than M map to M, because that's the 
closest normalized value in the value space.  So, for instance, 
1.0E+100000 would map to M.  I believe that behaviour would be contrary to 
the expectations of most users.

     Clinger [2] leaves the behaviour for overflow up to "the policies 
that have been established for handling overflow and underflow within the 
particular floating point number system in question."  I think the most 
reasonable behaviour would be that literals that represent decimal numbers 
in the range [M,M+2^103) would map to the value M, and that literals that 
represent decimal numbers greater than or equal to M+2^103 would map to 
positive infinity.
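The proposed overflow rule can be sketched as follows.  This is only
an illustration of the suggestion above, not text from the
Recommendation or from Clinger; map_large_literal and WATERSHED are
hypothetical names, and the helper looks only at non-negative literals
near the top of the range.

```python
from fractions import Fraction

M = 2**128 - 2**104        # largest finite float
WATERSHED = M + 2**103     # halfway between M and 2**128

def map_large_literal(literal: str) -> str:
    """Classify a non-negative decimal literal under the proposed rule.

    Hypothetical helper: it reports which of three regions the literal
    falls in, rather than performing a full conversion to float.
    """
    d = Fraction(literal)  # exact value of the decimal literal
    if d < M:
        return "finite, below M"
    if d < WATERSHED:
        return "M"
    return "INF"

for lit in ("3.4E38", "3.4028235E38", "3.4028236E38", "1.0E+100000"):
    print(lit, "->", map_large_literal(lit))
```

Exact rational arithmetic is used so that the comparison against the
watershed is not itself subject to rounding.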


    I just noticed another problem with the specification of how lexical 
values map to float values in the value space.  The text cited above 
assumes that, of two consecutive normalized values in the value space, one 
will be even and the other will be odd.  Consider the following decimal 
values:  16777216, 16777217 and 16777218.  The first is 2^24, the second 
is 2^24+1 and the third is 2^24+2.  However, only 2^24 (m=2^23, e=1) and 
2^24+2 (m=2^23+1, e=1) are values in the value space of float.  The value 
16777217 is exactly halfway between the other two values, but both those 
values are even.  To which value in the value space does 16777217 map? 
Indeed, all finite values in the value space that are greater than 2^24 
are even!
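A short check of the 2^24 example, as a sketch assuming Python 3 and a
platform where the struct "f" format is IEEE 754 single precision.
The helper to_float32 is a hypothetical name introduced here for the
round trip through single precision.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

lo, mid, hi = 2**24, 2**24 + 1, 2**24 + 2

# The endpoints survive the round trip; the midpoint does not.
print(to_float32(float(lo)) == lo)    # True
print(to_float32(float(hi)) == hi)    # True
print(to_float32(float(mid)) == mid)  # False

# Both neighbours are even integers, so "the even value" cannot break
# the tie if "even" refers to the value itself.
print(lo % 2, hi % 2)                 # 0 0

# With e = 1 their significands are lo/2 and hi/2, which do differ in parity.
print((lo // 2) % 2, (hi // 2) % 2)   # 0 1
```

On hardware that rounds to nearest with ties to even, the midpoint
would be expected to come back as 16777216, the neighbour whose m is
even, but that is a property of the platform rather than something
the cited text specifies.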

    Here are some other consecutive values to consider:  4194303.75, 
4194303.5, 4194303.25, 4194303.  Which of each consecutive pair could be 
considered to be even?

    It could be that it was intended that such a decimal value map to the 
float value whose value of "m" is even (in the formula m*2^e).  If that's 
the case, the definition of m and e will need to be made more precise, 
because many values can be expressed in more than one way:  for example, 
the value 4 could be expressed as having m=4 and e=0 (4x2^0), or m=2 and 
e=1 (2x2^1).
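One way to make m and e unambiguous is to require 2^23 <= m < 2^24.
The sketch below assumes that convention, which is an assumption made
for illustration and is not stated in the Recommendation; normalize is
a hypothetical helper that handles only positive values exactly
representable as a normalized float.

```python
from fractions import Fraction

def normalize(value: str) -> tuple[int, int]:
    """Return (m, e) with value == m * 2**e and 2**23 <= m < 2**24.

    Hypothetical helper: raises ValueError if the value is not positive
    or is not exactly representable in that form with -149 <= e <= 104.
    """
    v = Fraction(value)
    if v <= 0:
        raise ValueError("positive values only")
    m, e = v, 0
    while m >= 2**24:              # scale down into range
        m, e = m / 2, e + 1
    while m < 2**23:               # scale up into range
        m, e = m * 2, e - 1
    if m.denominator != 1 or not -149 <= e <= 104:
        raise ValueError(f"{value} is not a normalized float value")
    return int(m), e

print(normalize("4"))              # (8388608, -21)
for lit in ("4194303.75", "4194303.5", "4194303.25", "4194303"):
    m, e = normalize(lit)
    print(lit, (m, e), "even m" if m % 2 == 0 else "odd m")
```

Under this convention each value has exactly one (m, e) pair, and the
parity of m alternates between consecutive values, so "choose the one
with even m" always selects exactly one of two neighbours.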

Thanks,

Henry

[1] http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#float
[2] ftp://ftp.ccs.neu.edu/pub/people/will/howtoread.ps
------------------------------------------------------------------
Henry Zongaro      XML Parsers development
IBM SWS Toronto Lab   Tie Line 969-6044;  Phone (905) 413-6044
mailto:zongaro@ca.ibm.com
Received on Wednesday, 23 January 2002 16:58:48 UTC