Re: largest finite float from Dave Peterson on 2002-01-24 (www-xml-schema-comments@w3.org from January to March 2002)

From: Dave Peterson <davep@acm.org>
Date: Wed, 23 Jan 2002 22:47:18 -0500
To: zongaro@ca.ibm.com, www-xml-schema-comments@w3.org
Cc: cmsmcq@acm.org, ashokma@microsoft.com, Paul.V.Biron@kp.org
Message-Id: <a05010402b87520608b1b@[209.6.249.111]>

At 4:58 PM -0500 1/23/02, zongaro@ca.ibm.com wrote:
>C.M. Sperberg-McQueen wrote:
>[[
>The largest finite float, if I understand the notes correctly, is
>
>    m * 2**e
>
>where ** means exponentiation,
>        m is the largest number representable in the mantissa, and
>        e is the largest number representable as an exponent

[TERMINOLOGY NOTE:  m in this representation is *not* properly called the
mantissa.  "Significand" is better.]

>Since we have
>
>   m = 2 ** 24 - 1
>   e = 127
>
>it follows that
>
>   m * 2**e = (2**24) * (2**127) - 2**127
>            = 2**151 - 2**127
>
>]]
>
>A small correction here - the maximum value of e in that formula is 
>actually 104 (the minimum value is -149), so I believe the largest 
>finite float value is 2^128 - 2^104, which is approximately 
>3.4028x10^38.  (I'll refer to it as M below.)

Almost.  Depending on how old that paper of mine was--there was a time when
I was confused about how 754 approached this matter.

     o	For float, one uses 8 bits for e and 24 for m.

     o	For "normalized" numbers, the top bit of m is on.

     o	For normalized *positive* numbers (since one bit is m's sign bit,
	there are 23 left), this means

	o    -127 <= e <= 127

	     (All bits on, -128, is reserved for signalling NaNs and
	     infinities.)

	o    2**22 <= m <= 2**23 - 1

     o	For e = 0, this would result in normalized positive numbers of the
	form m * 2**e running from 2**22 to 2**23 - 1 .  Then varying e
	equally on either side of 0 would bias things heavily in favor of
	large numbers.

     o	What they want is, for exponent zero, to have numbers close to and
	just less than 1.  This requires that you bias the exponent by -23.

     o	This means the number represented by m and e is (m * 2**e * 2**-23)

Therefore the largest number representable is

	(2**23 - 1) * 2**127 * 2**-23

which is

	(2**23 - 1) * 2**104,  AKA  (2**127 - 2**104)

Net result is that Michael (and undoubtedly me, back then) didn't bias the
exponent--and Michael, me back then, and Henry all failed to account for
the sign bit.  I leave it to those with time and calculator packages to
work out the decimal representation.
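
For anyone with a calculator package handy, here is a minimal sketch that
works out the candidates from this thread in exact integer arithmetic and
compares them with the value decoded from a bit pattern.  It assumes that
0x7F7FFFFF (sign 0, exponent field 254, fraction all ones) is the largest
finite pattern and that Python's struct module decodes "f" as IEEE 754
binary32; the variable names are illustrative only.

```python
import struct

# Candidates for the largest finite float proposed in this thread.
candidates = {
    "2**151 - 2**127": 2**151 - 2**127,
    "2**128 - 2**104": 2**128 - 2**104,
    "2**127 - 2**104": 2**127 - 2**104,
}

# Assumption: 0x7F7FFFFF is the largest finite binary32 bit pattern.
decoded = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]
decoded_exact = int(decoded)  # exact, because the value is an integer

for name, value in candidates.items():
    marker = "<- matches decoded value" if value == decoded_exact else ""
    print(f"{name:>16} = {value}  {marker}")
```

On a platform with IEEE 754 binary32 the marker should land on
2**128 - 2**104 (about 3.4028 x 10^38); 2**127 - 2**104 is about
1.7014 x 10^38, so the sign-bit accounting above deserves a second look.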

Henry continued, again quoting Michael:
>[[
>Some things are, unfortunately, not so clear to me:
>
>   (a) what the next largest float would be if we had one
>   (b) where the watershed point is between infinity and 2.85...E45
>   (c) whether the negative numbers are exactly the same as these
>       plus a minus sign, or divergent in some way
>
>I don't believe there has been any confusion over the watershed
>between zero and the smallest representable float.
>
>Dave, if you can confirm that I have correctly interpreted your
>notes, I'd be grateful.  Ditto if anyone can shed light on questions
>(a), (b), (c) above.
>]]
>
>      Regarding question (a), I'm not sure if there is a sensible 
>answer to that.  To have a next larger finite float value, you'd 
>have to have either more bits in the mantissa or more bits in the 
>exponent - which you choose determines what would be the next larger 
>float value.

Because of the bias, if you add more bits to the m (sorry, but it ain't a
mantissa) you also increase the bias, no net gain.  To get larger numbers
you must increase the exponent's bits.  Therefore, the next larger number
would be

	(2**22) * 2**(127 + 1) * 2**-23,  AKA  (2**127)

>[[
>A literal in the 
><http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-lexical-space>·lexical 
>space· representing a decimal number d maps to the normalized value 
>in the 
><http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-value-space>·value 
>space· of float that is closest to d in the sense defined by 
><http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#clinger1990>[Clinger, 
>WD (1990)]; if d is exactly halfway between two such values then the 
>even value is chosen.
>]]
>
>      A literal reading of that text would have any lexical value 
>representing a decimal number greater than M map to M, because 
>that's the closest normalized value in the value space.  So, for 
>instance, 1.0E+100000 would map to M.  I believe that behaviour 
>would be contrary to the expectations of most users.

Quite so.  If I understand 754 correctly, the round-off algorithm is first
described in an "arbitrary integer exponent" model (i.e., as large an
exponent as you need, for this case).  So, for a given number of bits for
the m, every scientific-decimal numeral maps to some number by the algorithm.
If that number requires a larger integer exponent than is available, then
it is forcibly mapped to infinity.  So just as the cutoff between the last
two representable numbers is half-way between the two, for rounding purposes,
the cutoff above the last representable number is half way between that
number and the "next" one that would be representable with a larger exponent.
For float, that means half way between 2**127 - 2**104 and 2**127.
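
The two-step rule (round with an unbounded exponent, then force anything
that needs too large an exponent to infinity) can be sketched on a toy
format.  The function name and the parameters sig_bits and max_exp are
hypothetical, the format is deliberately tiny so the numbers can be checked
by hand, and subnormals and the minimum exponent are ignored.

```python
from fractions import Fraction

def round_positive(d, sig_bits, max_exp):
    """Round positive d to m * 2**e with 2**(sig_bits-1) <= m < 2**sig_bits.

    The exponent is unbounded while rounding (ties go to the even m);
    a result that needs e > max_exp is then mapped to infinity.
    """
    d = Fraction(d)
    e = 0
    while d / Fraction(2) ** e >= 2 ** sig_bits:
        e += 1
    while d / Fraction(2) ** e < 2 ** (sig_bits - 1):
        e -= 1
    m = round(d / Fraction(2) ** e)   # round() on a Fraction is half-to-even
    if m == 2 ** sig_bits:            # rounding carried into the next exponent
        m, e = m // 2, e + 1
    if e > max_exp:
        return float("inf")
    return m * Fraction(2) ** e

# Toy format: 3-bit m (4..7), largest exponent 2, so the largest finite
# value is 7 * 2**2 = 28 and the "next" one would be 4 * 2**3 = 32.
for d in (29, Fraction("29.9"), 30, 31, 10**6):
    print(d, "->", round_positive(d, sig_bits=3, max_exp=2))
```

In this toy format everything below 30 (half way between 28 and 32) stays
at 28, and 30 itself goes to infinity because the tie is resolved toward
the even m, which belongs to the value that would need the larger exponent.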

>     I just noticed another problem with the specification of how 
>lexical values map to float values in the value space.  The text 
>cited above assumes that, of two consecutive normalized values in 
>the value space, one will be even and the other will be odd.

Yet another example of not-quite-precise writing, I'm afraid.  We copied
from Clinger who copied from (I assume) 754.  Something got lost in
translation.  (Ever play the gossip game?)

Note how each representable number has precisely one representation.
(I'll leave it to you to check the details about subnormalized numbers--
what I'm about to say is for positive normalized.)

One representable number and the next each have a particular m value;
usually with the same e, but at the boundary one m is all 1 bits; the
next up (larger e) is all 0 bits except the top-most bit.  In the first
case, the higher m is 1 more than the lower, so one number's m is odd
and the other's is even; in the second case, the lower is necessarily
odd and the next is even.  It's the even-ness of the integer m (the
"significand", incorrectly called the "mantissa") that decides, not that
of the number represented.
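
A small sketch of that point, using Python's own floats (IEEE 754 binary64
with a 53-bit m, not the 32-bit float discussed here) because they are easy
to inspect.  It assumes that converting an integer to a Python float rounds
to nearest with ties to even, as CPython does; integer_significand is a
hypothetical helper written for this example.

```python
import math

def integer_significand(x):
    """Return the 53-bit integer m of a positive Python float."""
    frac, _ = math.frexp(x)       # x == frac * 2**exp with 0.5 <= frac < 1
    return int(frac * 2**53)      # exact: frac has at most 53 significant bits

# Above 2**53 consecutive doubles are 2 apart, so every representable
# number here is an even integer, yet the ties still resolve cleanly.
a, b, c = float(2**53), float(2**53 + 2), float(2**53 + 4)
for x in (a, b, c):
    parity = "even" if integer_significand(x) % 2 == 0 else "odd"
    print(int(x), "has", parity, "m")

print(float(2**53 + 1) == a)   # tie between a and b goes to a (even m)
print(float(2**53 + 3) == c)   # tie between b and c goes to c (even m)
```

All three represented numbers are even, so the evenness of the number
cannot pick a winner; the evenness of m can.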

Hope this all helps.   Michael, I'm too tired to check out negatives.
I'm pretty sure they're symmetric.  I can't imagine any reason they
wouldn't be.
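
A quick check of the symmetry, sketched under the assumption that Python's
struct module decodes "f" as IEEE 754 binary32 and that the two patterns
below differ only in the top (sign) bit:

```python
import struct

def decode(hex_pattern):
    """Decode a big-endian 32-bit pattern, given as hex, into a float."""
    return struct.unpack(">f", bytes.fromhex(hex_pattern))[0]

# Same exponent and fraction bits; only the top bit differs.
largest_pos = decode("7f7fffff")
largest_neg = decode("ff7fffff")
smallest_pos = decode("00000001")
smallest_neg = decode("80000001")

print(largest_neg == -largest_pos)
print(smallest_neg == -smallest_pos)
```

If both comparisons print True, the negative values are the positive ones
with a minus sign, at least at the two extremes tested here.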
-- 
Dave Peterson
SGMLWorks!

davep@acm.org
Received on Wednesday, 23 January 2002 22:49:06 UTC