RE: largest finite float from Ashok Malhotra on 2002-01-23 (www-xml-schema-comments@w3.org from January to March 2002)

From: Ashok Malhotra <ashokma@microsoft.com>
Date: Wed, 23 Jan 2002 15:53:50 -0800
To: <zongaro@ca.ibm.com>, <www-xml-schema-comments@w3.org>
Cc: <cmsmcq@acm.org>, <davep@acm.org>, <Paul.V.Biron@kp.org>
Message-ID: <E5B814702B65CB4DA51644580E4853FB0148865B@red-msg-12.redmond.corp.microsoft.com>
Henry:

Thank you for helping out in this arcane area.  I'd like to make two
points:

 

Some months ago I wrote a program to print out the max/min float/double
values in the Java classes (of course I cannot find this program now, but
it's easy enough to write).  I got the same values as Microsoft's
internal language runtime produces.  These were:

Float max       3.4028235E+38
Float min      -3.4028235E+38
Double max      1.79769313486231570E+308
Double min     -1.79769313486231570E+308

These may be truncated, but note that they are symmetrical.
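
A rough Python analogue of the kind of program described above may be
useful for checking these figures.  It is only a sketch: the original
was written against the Java classes and is not reproduced here, and the
constant names are made up for illustration.  It assumes the platform
uses IEEE 754 single and double formats.

```python
import struct
import sys

# Largest finite IEEE 754 single: sign 0, exponent field 0xFE, mantissa all 1s.
FLOAT_MAX = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]
# Setting only the sign bit gives the most negative finite single.
FLOAT_MOST_NEGATIVE = struct.unpack(">f", bytes.fromhex("ff7fffff"))[0]

# A Python float is an IEEE 754 double on mainstream platforms (assumption).
DOUBLE_MAX = sys.float_info.max
DOUBLE_MOST_NEGATIVE = -DOUBLE_MAX

# Print with 8 and 17 significant digits respectively.
print(f"Float max   {FLOAT_MAX:.7E}")
print(f"Float min   {FLOAT_MOST_NEGATIVE:.7E}")
print(f"Double max  {DOUBLE_MAX:.16E}")
print(f"Double min  {DOUBLE_MOST_NEGATIVE:.16E}")
```

The symmetry follows from the format itself: the sign is a separate bit,
so negating a finite value never changes its magnitude.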

 

I'm troubled that these are different from Michael's maximum float
value:

"approximately 2.85449517E+45, or exactly
2854495215270736301647340207211686556881387520.  With commas, that's
2,854,495,215,270,736,301,647,340,207,211,686,556,881,387,520."

 

 

If we are talking about normalized numbers, i.e. with a fixed number of
bits in the mantissa and exponent, then would not the largest float have
all 1s in the mantissa, and the second largest float have all 1s except
for the least significant bit, which would be a 0?
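
The bit-pattern reading suggested here can be checked with a short
Python sketch.  The helper name float32_from_bits is hypothetical, and
the sketch assumes the IEEE 754 single layout (1 sign bit, 8 exponent
bits, 23 stored mantissa bits with an implicit leading 1).

```python
import struct

def float32_from_bits(bits: int) -> float:
    """Interpret a 32-bit integer as an IEEE 754 single-precision value."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]

# Exponent field 0xFE is the largest one that still denotes a finite value.
LARGEST_BITS = 0x7F7FFFFF          # all 23 stored mantissa bits set
SECOND_LARGEST_BITS = 0x7F7FFFFE   # least significant mantissa bit cleared
INFINITY_BITS = 0x7F800000         # exponent field all 1s, mantissa 0

largest = float32_from_bits(LARGEST_BITS)
second_largest = float32_from_bits(SECOND_LARGEST_BITS)
gap = largest - second_largest     # spacing between the top two finite values

print(largest, second_largest, gap)
print(float32_from_bits(INFINITY_BITS))
```

Under these assumptions the gap between the two largest finite values
comes out as 2^104.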

 

All the best, Ashok 
=========================================================== 
Ashok Malhotra              <mailto:ashokma@microsoft.com>
Microsoft Corporation 



-----Original Message-----
From: zongaro@ca.ibm.com [mailto:zongaro@ca.ibm.com] 
Sent: Wednesday, January 23, 2002 1:58 PM
To: www-xml-schema-comments@w3.org
Cc: cmsmcq@acm.org; davep@acm.org; Ashok Malhotra; Paul.V.Biron@kp.org
Subject: Re: largest finite float

 


C.M. Sperberg-McQueen wrote: 
[[
The largest finite float, if I understand the notes correctly, is

   m * 2**e

where ** means exponentiation,
       m is the largest number representable in the mantissa, and
       e is the largest number representable as an exponent

Since we have

  m = 2 ** 24 - 1
  e = 127

it follows that

  m * 2**e = (2**24) * (2**127) - 2**127
           = 2**151 - 2**127

]] 

A small correction here - the maximum value of e in that formula is
actually 104 (the minimum value is -149), so I believe the largest
finite float value is 2^128 - 2^104, which is approximately
3.4028x10^38.  (I'll refer to it as M below.) 
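
The two candidate values can be compared with exact integer arithmetic.
The following Python sketch uses illustrative variable names; it only
restates the formula m * 2**e with the two exponents under discussion.

```python
# Largest m representable in a 24-bit mantissa.
MANTISSA_MAX = 2**24 - 1

# Value obtained in the quoted text, which takes e = 127.
quoted_value = MANTISSA_MAX * 2**127
# Value obtained with the corrected maximum exponent, e = 104.
corrected_value = MANTISSA_MAX * 2**104

# Python integers are arbitrary precision, so these identities are exact.
assert quoted_value == 2**151 - 2**127
assert corrected_value == 2**128 - 2**104

print(quoted_value)
print(corrected_value)
print(f"{float(corrected_value):.4e}")
```

The two results differ by a factor of exactly 2^23, which is what one
would expect if the exponent bias of the stored mantissa was not
accounted for.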

<<Decimal representation snipped>> 

[[ 
Some things are, unfortunately, not so clear to me:

  (a) what the next largest float would be if we had one
  (b) where the watershed point is between infinity and 2.85...E45
  (c) whether the negative numbers are exactly the same as these
      plus a minus sign, or divergent in some way

I don't believe there has been any confusion over the watershed
between zero and the smallest representable float.

Dave, if you can confirm that I have correctly interpreted your
notes, I'd be grateful.  Ditto if anyone can shed light on questions
(a), (b), (c) above.
]]

     Regarding question (a), I'm not sure if there is a sensible answer
to that.  To have a next larger finite float value, you'd have to have
either more bits in the mantissa or more bits in the exponent - which
you choose determines what would be the next larger float value. 

     By "the watershed point", I assume you're asking at what point do
values in the lexical space map to infinity rather than mapping to the
largest finite float value.  I'm not sure whether there was any
discussion of this after May, 2001 - or perhaps you're picking up the
thread - but at that point the question was unresolved.  Section 3.2.4
of the Datatypes recommendation [1] states (in part) the following. 

[[ 
A literal in the *lexical space*
<http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-lexical-space>
representing a decimal number d maps to the normalized value in the
*value space*
<http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-value-space>  of
float that is closest to d in the sense defined by [Clinger, WD (1990)]
<http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#clinger1990> ; if d
is exactly halfway between two such values then the even value is
chosen. 
]]

     A literal reading of that text would have any lexical value
representing a decimal number greater than M map to M, because that's
the closest normalized value in the value space.  So, for instance,
1.0E+100000 would map to M.  I believe that behaviour would be contrary
to the expectations of most users. 

     Clinger [2] leaves the behaviour for overflow up to "the policies
that have been established for handling overflow and underflow within
the particular floating point number system in question."  I think the
most reasonable behaviour would be that literals that represent decimal
numbers in the range [M,M+2^103) would map to the value M, and that
literals that represent decimal numbers greater than or equal to M+2^103
would map to positive infinity. 
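
The proposed rule can be written down as a small Python sketch.  The
function map_large_literal is hypothetical and covers only literals at
or above M; it is an illustration of the suggestion above, not behaviour
defined by the Datatypes recommendation.

```python
from fractions import Fraction
import math

M = 2**128 - 2**104        # largest finite float value
THRESHOLD = M + 2**103     # proposed boundary between M and infinity

def map_large_literal(literal: str) -> float:
    """Apply the proposed overflow rule to a decimal literal >= M."""
    d = Fraction(literal)  # exact rational value of the literal
    if d < M:
        raise ValueError("below M: ordinary rounding applies (not sketched here)")
    # [M, M + 2**103) maps to M; anything at or above the boundary overflows.
    return float(M) if d < THRESHOLD else math.inf

print(map_large_literal("3.4028235E+38"))
print(map_large_literal("1.0E+100000"))
```

Under this rule the eight-digit literal 3.4028235E+38, which is slightly
larger than M, still maps to M, while 1.0E+100000 maps to positive
infinity.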


    I just noticed another problem with the specification of how lexical
values map to float values in the value space.  The text cited above
assumes that, of two consecutive normalized values in the value space,
one will be even and the other will be odd.  Consider the following
decimal values:  16777216, 16777217 and 16777218.  The first is 2^24,
the second is 2^24+1, the third is 2^24+2.  However, only 2^24 (m=2^23,
e=1) and 2^24+2 (m=2^23+1, e=1) are values in the value space of float.
The value 16777217 is exactly halfway between the other two values, but
both those values are even.  To which value in the value space does
16777217 map?  Indeed, all finite values in the value space that are
greater than 2^24 are even! 
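
For comparison, the following Python sketch shows what a typical IEEE
754 implementation does with these values.  It assumes a CPython build
whose double-to-single conversion uses the default round-to-nearest,
ties-to-even mode, where "even" refers to the significand rather than to
the value.  The helper names are hypothetical.

```python
import struct

def to_float32(x: float) -> float:
    """Round a double to the nearest single, then widen it back to a double."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def mantissa_for_e1(value: int) -> int:
    """Return m such that value == m * 2**1 (the m, e=1 form used above)."""
    if value % 2:
        raise ValueError("odd values above 2**24 are not in the value space")
    return value // 2

for candidate in (16777216, 16777217, 16777218, 16777219):
    rounded = int(to_float32(float(candidate)))
    m = mantissa_for_e1(rounded)
    parity = "even" if m % 2 == 0 else "odd"
    print(candidate, "->", rounded, "m =", m, f"({parity})")
```

Under that assumption both ties resolve to the neighbour whose m is
even, even though every value involved is itself an even integer.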

    Here are some other consecutive values to consider:  4194303.75,
4194303.5, 4194303.25, 4194303.  Which of each consecutive pair could be
considered to be even? 
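
One way to look at these four values is to write each as m*2^e with m
scaled into the range [2^23, 2^24).  The Python sketch below does that;
normalized_m_e is a hypothetical helper, and the choice of that range
for m is an assumption, not something the recommendation states.

```python
from fractions import Fraction

def normalized_m_e(value: Fraction) -> tuple:
    """Return (m, e) with value == m * 2**e and 2**23 <= m < 2**24.

    Assumes value is positive and exactly representable as a normal float.
    """
    m, e = Fraction(value), 0
    while m < 2**23:       # scale up until m fills 24 bits
        m *= 2
        e -= 1
    while m >= 2**24:      # scale down if m needs more than 24 bits
        m /= 2
        e += 1
    if m.denominator != 1:
        raise ValueError("not exactly representable with a 24-bit m")
    return int(m), e

for text in ("4194303.75", "4194303.5", "4194303.25", "4194303"):
    m, e = normalized_m_e(Fraction(text))
    print(text, "m =", m, "e =", e, "even m" if m % 2 == 0 else "odd m")
```

With that convention all four values share e = -2 and the parity of m
alternates from one value to the next.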

    It could be that it was intended that such a decimal value map to
the float value whose value of "m" is even (in the formula m*2^e).  If
that's the case, the definition of m and e will need to be made more
precise, because many values can be expressed in more than one way:  for
example, the value 4 could be expressed as having m=4 and e=0 (4x2^0),
or m=2 and e=1 (2x2^1).
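
To see how many ways a single value can be written, the following Python
sketch enumerates every (m, e) pair for the value 4.  It assumes the
bounds mentioned earlier in this message (m below 2^24, e between -149
and 104); the function name representations is made up for illustration.

```python
from fractions import Fraction

M_LIMIT = 2**24              # m must be a positive integer below this
E_MIN, E_MAX = -149, 104     # exponent range quoted earlier in this message

def representations(value: int) -> list:
    """List every (m, e) with m * 2**e == value within the bounds above."""
    found = []
    for e in range(E_MIN, E_MAX + 1):
        m = Fraction(value) / Fraction(2) ** e   # exact, may be fractional
        if m.denominator == 1 and 0 < m < M_LIMIT:
            found.append((int(m), e))
    return found

pairs = representations(4)
print(len(pairs))
print(pairs[0], pairs[-1])
```

Requiring m to lie in [2^23, 2^24) would single out exactly one of these
pairs, which is one possible way to make the definition precise.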

Thanks,

Henry 
[1] http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#float 
[2] ftp://ftp.ccs.neu.edu/pub/people/will/howtoread.ps
------------------------------------------------------------------
Henry Zongaro      XML Parsers development
IBM SWS Toronto Lab   Tie Line 969-6044;  Phone (905) 413-6044
mailto:zongaro@ca.ibm.com
Received on Wednesday, 23 January 2002 18:54:23 UTC