- From: Arnold, Curt <Curt.Arnold@hyprotech.com>
- Date: Mon, 14 Feb 2000 12:00:11 -0700
- To: "'Dan Connolly'" <connolly@w3.org>
- Cc: "'www-xml-schema-comments@w3.org'" <www-xml-schema-comments@w3.org>
Using decimal as a fallback position is possible; however, losing the E+/- idiom costs legibility when dealing with very large or very small numbers. It is easier for a human to comprehend, compare, or detect an error when a number is represented as 6.023E23 instead of 602300000000000000000000. If you are using long doubles, you could be stuck with almost 5000 characters to represent 19 digits of precision.

When I was using "real", I was using it in the sense of the earlier drafts, that is, basically decimal with the E+/- idiom.

I agree that trying to support a wide variety of floating point platforms is not desirable. Making all platforms understand all other numeric platforms is definitely more challenging than making all platforms understand just one numeric format. It is likely that any non-IEEE system would be exchanging information with an IEEE system, so it is easier to force the non-IEEE system to mimic IEEE semantics.

I don't have a problem with double and float being defined datatypes, and I agree the ranges and rounding behavior can't be precisely replicated lexically. I would just like them to be in addition to a "real" datatype. When the schema author does not wish to constrain the schema to IEEE float or double ranges and does not want to take the performance hit of precisely replicating their rounding behavior when evaluating min/max constraints, he shouldn't be forced to take that hit.

I need to build some benchmarks on the relative speed of lexical value comparison vs. conversion to double/float followed by comparison, but conversion to double/float is typically a very expensive operation and could severely impact the performance of applications dealing with a lot of numeric data.
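To make the idea of a lexical min/max check concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the accepted lexical form (sign, digits, optional fraction, optional E+/-nnn exponent) and the helper names `normalize` and `compare` are hypothetical and not taken from any draft. It orders two literals without converting them to float or double, and the last two lines show a pair of literals that are distinct lexically but collapse to the same IEEE double.

```python
import re

# Assumed lexical form (not taken from any draft): optional sign, digits,
# optional fraction, optional E+/-nnn exponent.
_LITERAL = re.compile(r'^([+-]?)(\d*)(?:\.(\d*))?(?:[eE]([+-]?\d+))?$')


def normalize(literal):
    """Return (sign, digits, exponent) with value = sign * 0.digits * 10**exponent.

    digits has no leading or trailing zeros; zero is returned as (0, '', 0).
    """
    m = _LITERAL.match(literal.strip())
    if m is None or not (m.group(2) or m.group(3)):
        raise ValueError('not a numeric literal: %r' % literal)
    sign_text, int_part, frac_part, exp_text = m.groups()
    frac_part = frac_part or ''
    exponent = int(exp_text) if exp_text else 0
    digits = (int_part + frac_part).lstrip('0')
    if not digits:
        return (0, '', 0)
    # Position of the decimal point relative to the first significant digit.
    exponent += len(digits) - len(frac_part)
    sign = -1 if sign_text == '-' else 1
    return (sign, digits.rstrip('0'), exponent)


def compare(a, b):
    """Return -1, 0 or 1 as literal a is less than, equal to or greater than b."""
    sign_a, digits_a, exp_a = normalize(a)
    sign_b, digits_b, exp_b = normalize(b)
    if sign_a != sign_b:
        return -1 if sign_a < sign_b else 1
    if sign_a == 0:
        return 0
    if exp_a != exp_b:
        magnitude = -1 if exp_a < exp_b else 1
    elif digits_a != digits_b:
        # Same exponent: both digit strings are fractions 0.d1d2..., so plain
        # string comparison orders them correctly.
        magnitude = -1 if digits_a < digits_b else 1
    else:
        magnitude = 0
    return sign_a * magnitude


print(compare('6.023E23', '602300000000000000000000'))  # 0
print(compare('1E-5000', '1E-4999'))                    # -1
print(compare('0.1', '0.10000000000000001'))            # -1
print(float('0.1') == float('0.10000000000000001'))     # True
```

The last pair is the kind of case where a minExclusive bound evaluated lexically and the same bound evaluated after rounding to double can disagree, so a check like this is only suitable for a type that is not bound to IEEE semantics.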
-----Original Message-----
From: Dan Connolly [mailto:connolly@w3.org]
Sent: Monday, February 14, 2000 12:10 PM
To: www-xml-schema-comments@w3.org; Arnold, Curt
Cc: Mark Reinhold
Subject: Re: Primitive Datatypes of XML Schema (boolean, float, double)

[oops... I accidentally sent an unfinished version of this message... please disregard my message of Mon, 14 Feb 2000 12:06:27 -0600]

I read this and some of your earlier comments on this topic; e.g.:

>> Here is the pocket version of my feelings toward the 12/17 Datatypes Draft
>>
>> 1) "real" (an unlimited range and precision floating point) must come back.

-- Sat, 15 Jan 2000 11:32:21 -0600
http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000JanMar/0055.html

I'm not sure what to make of that. The decimal type provides unlimited range and precision, so I'm not sure why you say it 'must come back'. It never went anywhere. I've never heard the term 'floating point' used to describe a datatype with unlimited range/precision, so I don't know what you mean for those words to contribute.

As to 'real'... the real numbers don't have a convenient lexical representation (e.g. sqrt(2) and pi are infinitely long non-repeating numerals). 'real' was (unfortunately!) popularized as a name for floating-point types in FORTRAN, but those were always fixed range/precision. What did you mean by 'real'?

In your message of Thu, 13 Jan 2000 12:55:53 -0700
http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000JanMar/0050.html
you write:

>> In my perfect world "real" would be primitive, "decimal" would be derived
>> from real (simply excludes the E+nnn fragment) and "integer" would derive
>> from "decimal".

Well... integer does derive from decimal. We don't have an arbitrary-precision datatype with the E+nnn idiom in its lexical representation; is that what you meant by "real"?

If you can show, by way of use cases, that it's important/essential to be able to use the E+nnn idiom when writing decimals, I suppose we could consider it. But writing lots of zeros doesn't seem to be a critical problem, as far as I can see.

Your message goes on to say...

>> If you really wanted to conform to the behavior of a specific floating point
>> system (which I would discourage),

it was considered essential by the WG. cf. scenarios "3. Supervisory control and data acquisition." and "6. Open and uniform transfer of data between applications, including databases" in our requirements document
http://www.w3.org/TR/1999/NOTE-xml-schema-req-19990215

>> you can do that with existing facets

no, you cannot specify IEEE float/double semantics in terms of arbitrary precision decimals (or rationals in any radix). Floating point comparisons (and other operations) are just not the same as rational comparisons. I'll try to dig up details on this...

> The elimination of real and the introduction of float and double types in the last draft do several very negative things:
>
> 1) they make it very, very difficult to write applications that generate valid XML on platforms whose native floating points are not IEEE when there are minExclusive or maxExclusive constraints. Basically, you have to try to mimic the rounding characteristics of IEEE to make sure that a value that is less than or greater than the bound on your platform is still less than or greater than after rounding on IEEE.

We considered more generalized designs, including the HTTP-NG floating point design:

====== excerpt from http://www.w3.org/TR/1998/WD-HTTP-NG-architecture-19980710/

3.5.2. Floating-point Types

Floating-point types are specified with eight parameters: the size in bits of the significand, the base of the exponent, the maximum exponent value, the minimum exponent value, whether they support a distinguished value for `Not-A-Number', whether they support a distinguished value for `Infinity', whether denormalized values are allowed, and whether the zero value is signed (whether they can have both +0 and -0).

======

but we decided that the burden on implementors outweighed the benefits. We decided the cost of implementing the IEEE semantics on platforms where you can't count on the infrastructure to provide IEEE semantics was acceptable. We expect it's more cost-effective for everybody to converge on one floating point platform than to proliferate a variety of them.

> 2) it makes it very difficult to write validating parsers in languages that do not impose IEEE numerics (i.e. C++) that will validate consistently on different platforms. Basically, it means that you have to write your own atof() and floating point comparison routines (on top of long or quadword types) since you cannot depend on the native float and double to behave consistently with IEEE.
>
> 3) it requires conversion from text to a float/double type for validation. With the abstract real type, you could do constraint checking lexically, which should be substantially faster than conversion to a floating point and then comparison.

What do you mean by an 'abstract real type'? The way it was specified in earlier drafts didn't make sense to the WG, upon examination.

> Numeric conversion can very easily dwarf both parsing and DOM creation in time. I've been meaning to develop and publish some benchmarks for this.

If you don't expect the receiver to convert to an IEEE floating point representation, I suggest you use the decimal type (or some type derived from it) rather than float/double.

> 4) It doesn't support more precise numeric representations.
>
> There have been several threads on floating point issues on the schema comments list; the last significant thread was
> http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000JanMar/0043.html
>
> The following messages consider the deleted minAbsoluteValue facet, which was the first move toward binding real to a specific implementation.
>
> http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999OctDec/0024.html
> http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999JulSep/0052.html

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Monday, 14 February 2000 14:02:55 UTC