Re: Primitive Datatypes of XML Schema (boolean, float, double) from Dan Connolly on 2000-02-14 (www-xml-schema-comments@w3.org from January to March 2000)

From: Dan Connolly <connolly@w3.org>
Date: Mon, 14 Feb 2000 12:06:27 -0600
To: www-xml-schema-comments@w3.org, "Arnold, Curt" <Curt.Arnold@hyprotech.com>
CC: Mark Reinhold <mr@eng.sun.com>
Message-ID: <38A84423.75FC5387@w3.org>
I read this and some of your earlier comments on this topic; e.g.:

>> Here is the pocket version of my feelings toward the 12/17 Datatypes Draft
>> 
>> 1) "real" (a unlimited range and precision floating point) must come back.
-- Sat, 15 Jan 2000 11:32:21 -0600
http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000JanMar/0055.html

I'm not sure what to make of that. The decimal type provides
unlimited range and precision, so I'm not sure why you
say it 'must come back'. It never went anywhere.

I've never heard the term 'floating point'
used to describe a datatype with unlimited range/precision, so
I don't know what you mean for those words to contribute.

As to 'real'... the real numbers don't have a convenient lexical
representation (e.g. the decimal expansions of sqrt(2) and pi are
infinitely long and non-repeating). 'real' was (unfortunately!)
popularized as a name for floating-point types in FORTRAN, but those
always had fixed range/precision. What did you mean by 'real'?

In your message of Thu, 13 Jan 2000 12:55:53 -0700
http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000JanMar/0050.html
you write:

>> In my perfect would "real" would be primitive, "decimal" would be derived
>> from real (simply excludes the E+nnn fragment) and "integer" would derive
>> from "decimal".

Well... integer does derive from decimal. We don't have an
arbitrary-precision datatype with the E+nnn idiom in its lexical
representation; is that what you meant by "real"? If you can show,
by way of use cases, that it's important/essential to be able to use
the E+nnn idiom when writing decimals, I suppose we could consider it.
But writing lots of zeros doesn't seem to be a critical problem, as
far as I can see.
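
To make the "lots of zeros" point concrete, here is a minimal sketch in
Python (not part of the draft; the helper name expand_exponent is made up
for illustration). It assumes the sender has numerals in the E+nnn form and
wants to emit them in a plain decimal lexical form with no exponent:

```python
from decimal import Decimal

def expand_exponent(lexical: str) -> str:
    """Rewrite a numeral such as '1.5E+10' without the exponent part.

    Hypothetical helper: Decimal parses the E+nnn form exactly, and the
    'f' format code prints fixed-point notation with no exponent.
    """
    return format(Decimal(lexical), "f")

print(expand_exponent("1.5E+10"))  # 15000000000
print(expand_exponent("2.5E-3"))   # 0.0025
```

The conversion is exact, so nothing is lost by writing the zeros out; the
cost is only in the length of the numeral.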

Your message goes on to say...

>> If you really wanted to conform to the behavior of a specific floating point
>> system (which I would discourage),

it was considered essential by the WG. c.f. 

>>  you can do that with existing facets

> The elimination of real and the introduction of float and double types in the last draft do several very negative things:
> 
> 1) they make it very, very difficult to write applications that generate valid XML on platforms whose native floating points are not IEEE when there are minExclusive or maxExclusive constraints.
> Basically, you have to try to mimic the rounding characteristics of IEEE to make sure that a value that is less than or greater than the bound on your platform is still less than or greater than after
> rounding on IEEE.

We considered more generalized designs, including the HTTP-NG floating
point design:

======
excerpt from
http://www.w3.org/TR/1998/WD-HTTP-NG-architecture-19980710/

3.5.2. Floating-point Types 

Floating-point types are specified with eight parameters: 

     the size in bits of the significand, 
     the base of the exponent, 
     the maximum exponent value, 
     the minimum exponent value, 
     whether they support a distinguished value for `Not-A-Number', 
     whether they support a distinguished value for `Infinity', 
     whether denormalized values are allowed, and 
     whether the zero value is signed (whether they can have both +0 and -0). 
======

but we decided that the burden on implementors outweighed the benefits.
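
For a sense of what that parameterization asks of an implementor, here is
a rough Python sketch of the eight parameters as a record. The class and
field names are invented for illustration, and the values given for IEEE
single precision assume that the significand size counts the implicit
leading bit; the HTTP-NG draft may count it differently.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FloatTypeSpec:
    """Hypothetical record of the eight HTTP-NG floating-point parameters."""
    significand_bits: int    # size in bits of the significand
    exponent_base: int       # base of the exponent
    max_exponent: int        # maximum exponent value
    min_exponent: int        # minimum exponent value
    has_nan: bool            # distinguished Not-A-Number value supported
    has_infinity: bool       # distinguished Infinity value supported
    allows_denormals: bool   # denormalized values allowed
    signed_zero: bool        # both +0 and -0 representable

# Assumed description of IEEE single precision (24 bits including the
# implicit leading bit, normal exponents from -126 to 127).
IEEE_SINGLE = FloatTypeSpec(24, 2, 127, -126, True, True, True, True)

print(IEEE_SINGLE)
```

A validator that accepted arbitrary values of such a record would need its
own rounding and comparison code for every combination, which is the kind
of burden meant here.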

We decided the cost of implementing the IEEE semantics on platforms
where you can't count on the infrastructure to provide IEEE semantics
was acceptable. We expect it's more cost-effective for everybody
to converge on one floating point platform than to proliferate
a variety of them.
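
The minExclusive/maxExclusive concern quoted above can be illustrated with
a small Python sketch. It is only an illustration, assuming a bound of 1.0
on a float-typed value; the helper name to_ieee_single is made up, and it
rounds through a double first, which can differ from a direct
decimal-to-single conversion in rare cases (not in this one).

```python
import struct
from decimal import Decimal

def to_ieee_single(lexical: str) -> float:
    """Round a decimal numeral to an IEEE single-precision value.

    Hypothetical helper: parses to a double, then packs and unpacks as a
    32-bit float so the result carries single-precision rounding.
    """
    return struct.unpack("<f", struct.pack("<f", float(lexical)))[0]

bound = "1.0"         # assumed minExclusive bound
value = "1.00000001"  # strictly greater than the bound as a decimal

# Compared exactly, the value is above the bound.
print(Decimal(value) > Decimal(bound))                # True

# After rounding to IEEE single, the value collapses onto the bound.
print(to_ieee_single(value) > to_ieee_single(bound))  # False
```

A sender on a non-IEEE platform has to anticipate this rounding to be sure
the receiver sees the constraint as satisfied.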

> 
> 2) it makes it very difficult to write validating parsers in languages that do not impose IEEE numerics (i.e. C++) than will validate consistently on different platforms.  Basically, it means that you
> have to write your own atof() and floating point comparision routines (on top of long or quadword types) since you cannot depend on the native float and double to behave consistently with IEEE.
> 
> 3) it requires conversion from text to a float/double type for validation.  With the abstract real type type, you could do constraint checking lexically which should be substantially faster than
> conversion to a floating point and then comparision.

What do you mean by an 'abstract real type'? The way it was specified
in earlier drafts didn't make sense to the WG, upon examination.

>  Numeric conversion can very easily dwarf both parsing and DOM creation in time.  I've been meaning to develop and publish some benchmarks for
> this.

If you don't expect the receiver to convert to an IEEE floating point
representation, I suggest you use the decimal type (or some type
derived from it) rather than float/double.
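
As a rough sketch of that suggestion (Python, with a made-up function
name), a maxExclusive check on a decimal-typed value can be done with
exact decimal arithmetic and no conversion to binary floating point:

```python
from decimal import Decimal

def satisfies_max_exclusive(lexical: str, bound: str) -> bool:
    """Check value < bound exactly; hypothetical facet check for decimal."""
    # Decimal(str) is exact and Decimal comparisons are exact, so no
    # rounding happens before the comparison.
    return Decimal(lexical) < Decimal(bound)

numeral = "0.99999999999999999999"

print(satisfies_max_exclusive(numeral, "1"))  # True
print(float(numeral) < 1.0)                   # False: rounds to 1.0
```

The second line shows what the same numeral does when it is read as a
double instead.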


> 4) It doesn't support more precise numeric representations.
> 
> There have been several threads on floating point issues on the schema comments list, the last significant thread was http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000JanMar/0043.html
> 
> The following messages consider the deleted minAbsoluteValue facet which was a the first move toward binding real to a specific implementation.
> 
> http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999OctDec/0024.html
> http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999JulSep/0052.html


-- 
Dan Connolly, W3C
http://www.w3.org/People/Connolly/
Received on Monday, 14 February 2000 13:06:38 UTC