Re: largest finite float

Hi Dave,

     A small quibble:  the greatest finite float value really is 2**128 - 
2**104.  Although IEEE 754 stores only 23 bits of the significand, one 
bit for the sign, and eight for the exponent, all normalized values have 
an implied MSB with a value of one.  That means the significand (m) runs 
from 2**23 to 2**24-1.  Hence, the greatest finite value really is 
(2**24-1)*2**104, or 2**128-2**104.
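(For the record, this is easy to check by decoding the IEEE 754 single-format bit pattern for the largest finite value -- a quick sketch in Python, using the standard pattern 0x7F7FFFFF, sign 0, exponent field all ones but one, fraction all ones:)

```python
import struct

# 0x7F7FFFFF is the single-format bit pattern for the largest finite value.
max32 = struct.unpack('>f', bytes.fromhex('7f7fffff'))[0]

# It equals (2**24 - 1) * 2**104, i.e. 2**128 - 2**104.
assert max32 == (2**24 - 1) * 2**104 == 2**128 - 2**104
print(max32)   # approximately 3.4028e+38
```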

     You also wrote that:
[[
>     I just noticed another problem with the specification of how 
>lexical values map to float values in the value space.  The text 
>cited above assumes that, of two consecutive normalized values in 
>the value space, one will be even and the other will be odd.

Yet another example of not-quite-precise writing, I'm afraid.  We copied
from Clinger who copied from (I assume) 754.  Something got lost in
translation.  (Ever play the gossip game?)

Note how each representable number has precisely one representation.
(I'll leave it to you to check the details about subnormalized numbers--
what I'm about to say is for positive normalized.)

One representable number and the next each have a particular m value;
usually with the same e, but at the boundary one m is all 1 bits; the
next up (larger e) is all 0 bits except the top-most bit.  In the first
case, the higher m is 1 more than the lower, so one number's m is odd
and the other's is even; in the second case, the lower is necessarily
odd and the next is even.  It's the even-ness of the integer m (the
"radicand", incorrectly called the "mantissa") that decides, not that
of the number represented.
]]

     I agree that IEEE 754 allows for only one representation of each 
finite value - it's true of both normalized and subnormalized values.  My 
concern was with 3.2.4 of Datatypes, which states that the value space 
consists of the values "m*2^e, where m is an integer whose absolute value 
is less than 2^24, and e is an integer between -149 and 104."  The 
existing description allows for more than one way of expressing most 
values, and hence the fact that m is odd or even cannot be used to 
determine the direction of rounding.  A description that limits m to the 
range [2**23, 2**24-1], as you've described below, is necessary.
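(To make the problem concrete -- a sketch in Python; the particular (m, e) pairs are just illustrative examples, all within the e range -149 to 104 that 3.2.4 allows:)

```python
# Under the Datatypes wording, m only has to satisfy |m| < 2**24, so the
# same value gets many (m, e) encodings -- with different parities of m:
pairs = [(1, 0), (2, -1), (4, -2), (2**23, -23)]
assert {m * 2.0**e for (m, e) in pairs} == {1.0}

# Restricting m to [2**23, 2**24 - 1] leaves exactly one encoding of 1.0,
# so the parity of m becomes well defined.
canonical = [(m, e) for (m, e) in pairs if 2**23 <= m <= 2**24 - 1]
assert canonical == [(2**23, -23)]
```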

     As long as we're on the topic of subnormalized numbers, it's not 
clear to me whether they were intended to be part of the value space.  The 
description of the value space I've quoted above admits the subnormalized 
values to the value space - they are those values for which e=-149 and 0 < 
m < 2**23.  However, literals in the lexical space map to the closest 
*normalized* value.  That would mean that there are values in the value 
space to which no value in the lexical space will map.  Any revision of 
the description of float needs to answer this question.
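(For concreteness, here is what the subnormal corner of that description looks like in the 754 bit patterns -- a sketch in Python:)

```python
import struct

# Bit pattern 1 (exponent field 0, fraction 1) is the smallest positive
# subnormal: m = 1, e = -149 in the Datatypes notation.
tiny = struct.unpack('>f', (1).to_bytes(4, 'big'))[0]
assert tiny == 1 * 2**-149

# The largest subnormal has m = 2**23 - 1, still with e = -149.
big_sub = struct.unpack('>f', (2**23 - 1).to_bytes(4, 'big'))[0]
assert big_sub == (2**23 - 1) * 2**-149
```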

Thanks,

Henry
------------------------------------------------------------------
Henry Zongaro      XML Parsers development
IBM SWS Toronto Lab   Tie Line 969-6044;  Phone (905) 413-6044
mailto:zongaro@ca.ibm.com

To:     Henry Zongaro/Toronto/IBM@IBMCA, www-xml-schema-comments@w3.org
cc:     cmsmcq@acm.org, ashokma@microsoft.com, Paul.V.Biron@kp.org 
Subject:        Re: largest finite float


At 4:58 PM -0500 1/23/02, zongaro@ca.ibm.com wrote:
>C.M. Sperberg-McQueen wrote:
>[[
>The largest finite float, if I understand the notes correctly, is
>
>    m * 2**e
>
>where ** means exponentiation,
>        m is the largest number representable in the mantissa, and
>        e is the largest number representable as an exponent

[TERMINOLOGY NOTE:  m in this representation is *not* properly called the
mantissa.  "Significand" is better.]

>Since we have
>
>   m = 2 ** 24 - 1
>   e = 127
>
>it follows that
>
>   m * 2**e = (2**24) * (2**127) - 2**127
>            = 2**151 - 2**127
>
>]]
>
>A small correction here - the maximum value of e in that formula is 
>actually 104 (the minimum value is -149), so I believe the largest 
>finite float value is 2^128 - 2^104, which is approximately 
>3.4028x10^38.  (I'll refer to it as M below.)

Almost.  Depending on how old that paper of mine was--there was a time
when I was confused about how 754 approached this matter.

     o    For float, one uses 8 bits for e and 24 for m.

     o    For "normalized" numbers, the top bit of m is on.

     o    For normalized *positive* numbers (since one bit is m's sign
          bit, there are 23 left), this means

          o    -127 <= e <= 127

               (All bits on, -128, is reserved for signalling NaNs and
               infinities.)

          o    2**22 <= m <= 2**23 - 1

     o    For e = 0, this would result in normalized positive numbers of
          the form m * 2**e running from 2**22 to 2**23 - 1.  Then
          varying e equally on either side of 0 would bias things
          heavily in favor of large numbers.

     o    What they want is, for exponent zero, to have numbers close to
          and just less than 1.  This requires that you bias the
          exponent by -23.

     o    This means the number represented by m and e is
          (m * 2**e * 2**-23).

Therefore the largest number representable is

                 (2**23 - 1) * 2**127 * 2**-23

which is

                 (2**23 - 1) * 2**104,  AKA  (2**127 - 2**104)

Net result is that Michael (and undoubtedly me, back then) didn't bias
the exponent--and Michael, me back then, and Henry all failed to account
for the sign bit.  I leave it to those with time and calculator packages
to work out the decimal representation.

Henry continued, again quoting Michael:
>[[
>Some things are, unfortunately, not so clear to me:
>
>   (a) what the next largest float would be if we had one
>   (b) where the watershed point is between infinity and 2.85...E45
>   (c) whether the negative numbers are exactly the same as these
>       plus a minus sign, or divergent in some way
>
>I don't believe there has been any confusion over the watershed
>between zero and the smallest representable float.
>
>Dave, if you can confirm that I have correctly interpreted your
>notes, I'd be grateful.  Ditto if anyone can shed light on questions
>(a), (b), (c) above.
>]]
>
>      Regarding question (a), I'm not sure if there is a sensible 
>answer to that.  To have a next larger finite float value, you'd 
>have to have either more bits in the mantissa or more bits in the 
>exponent - which you choose determines what would be the next larger 
>float value.

Because of the bias, if you add more bits to the m (sorry, but it ain't a
mantissa) you also increase the bias, no net gain.  To get larger numbers
you must increase the exponent's bits.  Therefore, the next larger number
would be

                 (2**22) * 2**(127 + 1) * 2**-23,  AKA  (2**127)

>[[
>A literal in the 
><http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-lexical-space>·lexical 
>space· representing a decimal number d maps to the normalized value 
>in the 
><http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-value-space>·value 
>space· of float that is closest to d in the sense defined by 
><http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#clinger1990>[Clinger, 
>WD (1990)]; if d is exactly halfway between two such values then the 
>even value is chosen.
>]]
>
>      A literal reading of that text would have any lexical value 
>representing a decimal number greater than M map to M, because 
>that's the closest normalized value in the value space.  So, for 
>instance, 1.0E+100000 would map to M.  I believe that behaviour 
>would be contrary to the expectations of most users.

Quite so.  If I understand 754 correctly, the round-off algorithm is first
described in an "arbitrary integer exponent" model (i.e., as large an
exponent as you need, for this case).  So, for a given number of bits for
the m, every scientific-decimal numeral maps to some number by the
algorithm.  If that number requires a larger integer exponent than is
available, then it is forcibly mapped to infinity.  So just as the cutoff
between the last two representable numbers is half-way between the two,
for rounding purposes, the cutoff above the last representable number is
half-way between that number and the "next" one that would be
representable with a larger exponent.  For float, that means half-way
between 2**127 - 2**104 and 2**127.
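(With the actual single-format parameters -- largest finite value 2**128 - 2**104 and would-be next value 2**128, per Henry's note at the top of the thread -- the cutoff sits at 2**128 - 2**103.  A sketch in Python, relying on CPython's `struct` module rounding to nearest when packing to single format and raising OverflowError when the result would be infinite; the probe value 2**100 below the cutoff is just an arbitrary choice:)

```python
import struct

MAX32 = 2**128 - 2**104        # largest finite single
THRESHOLD = 2**128 - 2**103    # half-way to the would-be next value, 2**128

# Just below the cutoff, rounding pulls the value back to MAX32.
below = float(THRESHOLD - 2**100)
packed = struct.unpack('>f', struct.pack('>f', below))[0]
assert packed == MAX32

# At the cutoff, ties-to-even picks the (even, unrepresentable) 2**128,
# i.e. the value overflows to infinity; struct reports that as an error.
try:
    struct.pack('>f', float(THRESHOLD))
    overflowed = False
except OverflowError:
    overflowed = True
assert overflowed
```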

>     I just noticed another problem with the specification of how 
>lexical values map to float values in the value space.  The text 
>cited above assumes that, of two consecutive normalized values in 
>the value space, one will be even and the other will be odd.

Yet another example of not-quite-precise writing, I'm afraid.  We copied
from Clinger who copied from (I assume) 754.  Something got lost in
translation.  (Ever play the gossip game?)

Note how each representable number has precisely one representation.
(I'll leave it to you to check the details about subnormalized numbers--
what I'm about to say is for positive normalized.)

One representable number and the next each have a particular m value;
usually with the same e, but at the boundary one m is all 1 bits; the
next up (larger e) is all 0 bits except the top-most bit.  In the first
case, the higher m is 1 more than the lower, so one number's m is odd
and the other's is even; in the second case, the lower is necessarily
odd and the next is even.  It's the even-ness of the integer m (the
"radicand", incorrectly called the "mantissa") that decides, not that
of the number represented.
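(The boundary case is easy to see in the bit patterns themselves -- a sketch in Python, looking at the two representable numbers on either side of 1.0, with the 24-bit integer significand written with its implied MSB made explicit, as Henry notes at the top of the thread:)

```python
import struct

def decode(bits):
    """Split a positive normalized single bit pattern into its value,
    exponent field, and 24-bit integer significand m (implied MSB set)."""
    value = struct.unpack('>f', bits.to_bytes(4, 'big'))[0]
    e_field = (bits >> 23) & 0xFF
    m = (bits & 0x7FFFFF) | (1 << 23)
    return value, e_field, m

# Just below 1.0: m is all 1 bits (odd).
v1, e1, m1 = decode(0x3F7FFFFF)
assert m1 == 2**24 - 1 and m1 % 2 == 1

# 1.0 itself: e goes up by one and m drops to the top bit alone (even).
v2, e2, m2 = decode(0x3F800000)
assert v2 == 1.0
assert m2 == 2**23 and m2 % 2 == 0 and e2 == e1 + 1
```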

Hope this all helps.   Michael, I'm too tired to check out negatives.
I'm pretty sure they're symmetric.  I can't imagine any reason they
wouldn't be.
-- 
Dave Peterson
SGMLWorks!

davep@acm.org

Received on Thursday, 24 January 2002 10:15:21 UTC