# Re: largest finite float

From: <zongaro@ca.ibm.com>
Date: Thu, 24 Jan 2002 10:15:01 -0500
To: Dave Peterson <davep@acm.org>

Message-ID: <OF7FAB40E4.FE7B8604-ON85256B4B.004E9C73@torolab.ibm.com>
```
Hi Dave,

A small quibble:  the greatest, finite float value really is 2**128 -
2**104.  Although IEEE 754 allows only 23 bits for the significand, one
bit for the sign, and eight for the exponent, all normalized values have
an implied MSB with a value of one.  That means the significand (m) runs
from 2**23 to 2**24-1.  Hence, the greatest, finite value really is
(2**24-1)*2**104 or 2**128-2**104.

You also wrote that:
[[
>     I just noticed another problem with the specification of how
>lexical values map to float values in the value space.  The text
>cited above assumes that, of two consecutive normalized values in
>the value space, one will be even and the other will be odd.

Yet another example of not-quite-precise writing, I'm afraid.  We copied
from Clinger who copied from (I assume) 754.  Something got lost in
translation.  (Ever play the gossip game?)

Note how each representable number has precisely one representation.
(I'll leave it to you to check the details about subnormalized numbers--
what I'm about to say is for positive normalized.)

One representable number and the next each have a particular m value;
usually with the same e, but at the boundary one m is all 1 bits; the
next up (larger e) is all 0 bits except the top-most bit.  In the first
case, the higher m is 1 more than the lower, so one number's m is odd
and the other's is even; in the second case, the lower is necessarily
odd and the next is even.  It's the even-ness of the integer m (the
"radicand", incorrectly called the "mantissa") that decides, not that
of the number represented.
]]

I agree that IEEE 754 allows for only one representation of each
finite value - it's true of both normalized and subnormalized values.  My
concern was with 3.2.4 of Datatypes, which states that the value space
consists of the values "m*2^e, where m is an integer whose absolute value
is less than 2^24, and e is an integer between -149 and 104."  The
existing description allows for more than one way of expressing most
values, and hence it would be impossible to use the fact that m is odd or
even to determine the direction for rounding with the existing description
of the value space.  A description that limited m to the range
[2**23,2**24-1], as you've described below, is necessary.

As long as we're on the topic of subnormalized numbers, it's not
clear to me whether they were intended to be part of the value space.  The
description of the value space I've quoted above admits the subnormalized
values to the value space - they are those values for which e=-149 and 0 <
m < 2**23.  However, literals in the lexical space map to the closest
*normalized* value.  That would mean that there are values in the value
space to which no value in the lexical space will map.  Any revision of
the description of float needs to answer this question.

Thanks,

Henry
------------------------------------------------------------------
Henry Zongaro      XML Parsers development
IBM SWS Toronto Lab   Tie Line 969-6044;  Phone (905) 413-6044
mailto:zongaro@ca.ibm.com

To:     Henry Zongaro/Toronto/IBM@IBMCA, www-xml-schema-comments@w3.org
cc:     cmsmcq@acm.org, ashokma@microsoft.com, Paul.V.Biron@kp.org
Subject:        Re: largest finite float

At 4:58 PM -0500 1/23/02, zongaro@ca.ibm.com wrote:
>C.M. Sperberg-McQueen wrote:
>[[
>The largest finite float, if I understand the notes correctly, is
>
>    m * 2**e
>
>where ** means exponentiation,
>        m is the largest number representable in the mantissa, and
>        e is the largest number representable as an exponent

[TERMINOLOGY NOTE:  m in this representation is *not* properly called the
mantissa.  "Significand" is better.]

>Since we have
>
>   m = 2 ** 24 - 1
>   e = 127
>
>it follows that
>
>   m * 2**e = (2**24) * (2**127) - 2**127
>            = 2**151 - 2**127
>
>]]
>
>A small correction here - the maximum value of e in that formula is
>actually 104 (the minimum value is -149), so I believe the largest
>finite float value is 2^128 - 2^104, which is approximately
>3.4028x10^38.  (I'll refer to it as M below.)

Almost.  Depending on how old that paper of mine was--there was a time
when I was confused about how 754 approached this matter.

o  For float, one uses 8 bits for e and 24 for m.

o  For "normalized" numbers, the top bit of m is on.

o  For normalized *positive* numbers (since one bit is m's sign bit,
   there are 23 left), this means

      o  -127 <= e <= 127

         (All bits on, -128, is reserved for signalling NaNs and
         infinities.)

      o  2**22 <= m <= 2**23 - 1

o  For e = 0, this would result in normalized positive numbers of the
   form m * 2**e running from 2**22 to 2**23 - 1.  Then varying e
   equally on either side of 0 would bias things heavily in favor of
   large numbers.

o  What they want is, for exponent zero, to have numbers close to and
   just less than 1.  This requires that you bias the exponent by -23.

o  This means the number represented by m and e is (m * 2**e * 2**-23)

Therefore the largest number representable is

(2**23 - 1) * 2**127 * 2**-23

which is

(2**23 - 1) * 2**104,  AKA  (2**127 - 2**104)

Net result is that Michael (and undoubtedly me, back then) didn't bias the
exponent--and Michael, me back then, and Henry all failed to account for
the sign bit.  I leave it to those with time and calculator packages to
work out the decimal representation.

Henry continued, again quoting Michael:
>[[
>Some things are, unfortunately, not so clear to me:
>
>   (a) what the next largest float would be if we had one
>   (b) where the watershed point is between infinity and 2.85...E45
>   (c) whether the negative numbers are exactly the same as these
>       plus a minus sign, or divergent in some way
>
>I don't believe there has been any confusion over the watershed
>between zero and the smallest representable float.
>
>Dave, if you can confirm that I have correctly interpreted your
>notes, I'd be grateful.  Ditto if anyone can shed light on questions
>(a), (b), (c) above.
>]]
>
>      Regarding question (a), I'm not sure if there is a sensible
>answer to that.  To have a next larger finite float value, you'd
>have to have either more bits in the mantissa or more bits in the
>exponent - which you choose determines what would be the next larger
>float value.

Because of the bias, if you add more bits to the m (sorry, but it ain't a
mantissa) you also increase the bias, no net gain.  To get larger numbers
you must increase the exponent's bits.  Therefore, the next larger number
would be

(2**22) * 2**(127 + 1) * 2**-23,  AKA  (2**127)

>[[
>A literal in the
><http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-lexical-space>·lexical
>space· representing a decimal number d maps to the normalized value
>in the
><http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-value-space>·value
>space· of float that is closest to d in the sense defined by
><http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#clinger1990>[Clinger,
>WD (1990)]; if d is exactly halfway between two such values then the
>even value is chosen.
>]]
>
>      A literal reading of that text would have any lexical value
>representing a decimal number greater than M map to M, because
>that's the closest normalized value in the value space.  So, for
>instance, 1.0E+100000 would map to M.  I believe that behaviour
>would be contrary to the expectations of most users.

Quite so.  If I understand 754 correctly, the round-off algorithm is first
described in an "arbitrary integer exponent" model (i.e., as large an
exponent as you need, for this case).  So, for a given number of bits for
the m, every scientific-decimal numeral maps to some number by the
algorithm.  If that number requires a larger integer exponent than is
available, then it is forcibly mapped to infinity.  So just as the cutoff
between the last two representable numbers is half-way between the two,
for rounding purposes, the cutoff above the last representable number is
half way between that number and the "next" one that would be
representable with a larger exponent.  For float, that means half way
between 2**127 - 2**104 and 2**127.

>     I just noticed another problem with the specification of how
>lexical values map to float values in the value space.  The text
>cited above assumes that, of two consecutive normalized values in
>the value space, one will be even and the other will be odd.

Yet another example of not-quite-precise writing, I'm afraid.  We copied
from Clinger who copied from (I assume) 754.  Something got lost in
translation.  (Ever play the gossip game?)

Note how each representable number has precisely one representation.
(I'll leave it to you to check the details about subnormalized numbers--
what I'm about to say is for positive normalized.)

One representable number and the next each have a particular m value;
usually with the same e, but at the boundary one m is all 1 bits; the
next up (larger e) is all 0 bits except the top-most bit.  In the first
case, the higher m is 1 more than the lower, so one number's m is odd
and the other's is even; in the second case, the lower is necessarily
odd and the next is even.  It's the even-ness of the integer m (the
"radicand", incorrectly called the "mantissa") that decides, not that
of the number represented.

Hope this all helps.   Michael, I'm too tired to check out negatives.
I'm pretty sure they're symmetric.  I can't imagine any reason they
wouldn't be.
--
Dave Peterson
SGMLWorks!

davep@acm.org
```
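The arithmetic debated in this thread can be checked mechanically. The following is a minimal Python sketch, not part of the original messages. It assumes the m * 2**e description with the implied leading bit, as in Henry's reply (2**23 <= m <= 2**24 - 1 for normalized values; e = -149 and 0 < m < 2**23 for subnormalized ones), and it assumes that Python's `struct` module packs the `f` format as IEEE 754 single precision with round-to-nearest. The names `float32_bits`, `decompose`, `round_to_float32`, `MAX_FLOAT`, `HALFWAY`, and `BELOW_HALFWAY` are hypothetical helpers introduced only for illustration.

```python
import struct


def float32_bits(x):
    """Bit pattern of x after rounding to IEEE 754 single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def decompose(x):
    """Return integers (m, e) with x == m * 2**e for a positive finite float32.

    Normalized values get the implied leading 1, so 2**23 <= m <= 2**24 - 1.
    Subnormalized values have e == -149 and 0 < m < 2**23.
    """
    bits = float32_bits(x)
    biased = (bits >> 23) & 0xFF   # the 8 stored exponent bits
    fraction = bits & 0x7FFFFF     # the 23 stored significand bits
    if biased == 0xFF:
        raise ValueError("infinity or NaN has no (m, e) form")
    if biased == 0:
        return fraction, -149      # subnormalized: no implied bit
    # 150 = exponent bias 127 + 23 fraction bits
    return fraction | (1 << 23), biased - 150


def round_to_float32(x):
    """Round a double to float32, mapping overflow to infinity."""
    try:
        return struct.unpack(">f", struct.pack(">f", x))[0]
    except OverflowError:
        return float("inf")


# Largest finite float, using the corrected value from Henry's reply.
MAX_FLOAT = (2**24 - 1) * 2**104

# Midpoint between MAX_FLOAT and the "next" value 2**128 that a larger
# exponent would allow, and the closest double just below that midpoint.
HALFWAY = 2**128 - 2**103
BELOW_HALFWAY = HALFWAY - 2**75

if __name__ == "__main__":
    print(MAX_FLOAT == 2**128 - 2**104)
    print(decompose(float(MAX_FLOAT)))              # (m, e) of the largest float
    print(decompose(1.0))                           # smallest m, at e = -23
    print(round_to_float32(float(BELOW_HALFWAY)))   # still finite
    print(round_to_float32(float(HALFWAY)))         # overflows
```

Under these assumptions, the midpoint itself overflows because the tie goes to the neighbour with the even m (2**23 at the next exponent) rather than to MAX_FLOAT, whose m of 2**24 - 1 is odd. That matches the point made above: the parity of m decides, not the parity of the number represented.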
Received on Thursday, 24 January 2002 10:15:21 UTC
