- From: <petsa@us.ibm.com>
- Date: Thu, 13 Jan 2000 16:18:21 -0500
- To: "Arnold, Curt" <Curt.Arnold@hyprotech.com>
- cc: "'ejr@CS.Berkeley.EDU'" <ejr@CS.Berkeley.EDU>, "'www-xml-schema-comments@w3.org'" <www-xml-schema-comments@w3.org>
Curt:

We have changed the spec substantially in this area. The "real" datatype is gone. There are 2 new primitive datatypes corresponding to IEEE float and double. I think you will like this much better. The 12/17 public draft includes these changes.

All the best, Ashok

"Arnold, Curt" <Curt.Arnold@hyprotech.com>@w3.org on 01/13/2000 02:55:53 PM

Sent by: www-xml-schema-comments-request@w3.org
To: "'ejr@CS.Berkeley.EDU'" <ejr@CS.Berkeley.EDU>
cc: "'www-xml-schema-comments@w3.org'" <www-xml-schema-comments@w3.org>
Subject: Re: Floating point proposal from left field... [long]

Please see my comments in the XML Schema help file available at http://www.software.aeat.com/xml/resources.htm and previous notes on NaN in http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000JanMar/0026.html and on minAbsoluteValue in http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999OctDec/0024.html.

I have been lobbying hard for the removal of any facet that would require an interpreting system to try to mimic the floating point system of the sender. All the negative scenarios that I outline in the minAbsoluteValue note and in the help file would also occur with the bitsExponent and bitsMantissa facets. If the bitsMantissa type facets were supported, then they should be advisory and not used to try to mimic the floating point system of the sender. The Schema space should only be concerned with lexical validation; details about specific implementation types and behavior on overflow and underflow should only be introduced in a type-aware DOM.

In my perfect world, "real" would be primitive, "decimal" would be derived from real (it simply excludes the E+nnn fragment) and "integer" would derive from "decimal". Each of these would hint to a type-aware DOM that they should be bound to a specific datatype; however, I suggested a provision for an explicit hint, an "appType" attribute, to suggest what data type a type-aware DOM should use.
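To make the derivation idea concrete, here is a minimal sketch in which each derived type accepts a subset of its parent's lexical forms. The regular expressions and the names REAL, DECIMAL, INTEGER and lexical_types are hypothetical illustrations and are not taken from any draft of the spec.

```python
import re

# Hypothetical lexical patterns (assumptions for illustration only).
# "real" allows an optional exponent, "decimal" drops the exponent,
# and "integer" additionally drops the fractional part.
REAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")
DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")
INTEGER = re.compile(r"[+-]?\d+")


def lexical_types(text):
    """Return the names of the sketched types whose lexical space contains text."""
    names = []
    for name, pattern in (("real", REAL), ("decimal", DECIMAL), ("integer", INTEGER)):
        if pattern.fullmatch(text):
            names.append(name)
    return names


print(lexical_types("1.5E+3"))  # ['real']
print(lexical_types("1.5"))     # ['real', 'decimal']
print(lexical_types("42"))      # ['real', 'decimal', 'integer']
```

Under these assumed patterns, every integer lexical is also a decimal lexical and every decimal lexical is also a real lexical, which is the subset relationship the derivation relies on.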
In the help file and on the Xerces-dev list, I've been lobbying that the comparisons behind minInclusive, etc., should be done lexically. This should be faster than conversion to native floating point and should be consistent on all platforms, whereas use of native floating point could cause inconsistent validation results due to the rounding issues that you were describing.

If you really wanted to conform to the behavior of a specific floating point system (which I would discourage), you can do that with existing facets (okay, you have to add some of the logical facets that I suggested in the help file). There are three ways that mimicking a particular floating point representation would modify constraints on a floating point number:

1) Imposition of a maximum magnitude. If I say that a "real" must be able to convert to an IEEE single precision number without overflow, that implies a min and max bound which could be enforced using lexical comparison.

    <datatype name="float" source="real">
      <minInclusive>-1.xxxxxxxxxxxE38</minInclusive>
      <maxInclusive>1.xxxxxxxxxxxE38</maxInclusive>
    </datatype>

2) Loss of precision. A lexical real could express more precision than binary floating point types; however, except at boundaries, this should not be significant. If the source knows a value to more precision than I do, then let it express all it knows and I'll keep all I can handle.

3) Loss of precision in bounds checks. If I wanted to prevent divide-by-zeros, I might put in a <minExclusive value="0"/>. However, a lexical comparison would allow a value like "1e-500", which would cause a divide by zero when converted to an IEEE single or double precision number. If instead I placed my bound at the minimum non-zero value representable in whatever target binary system I desired, such a value would be rejected.

My recommendations would be:

1) Reestablish the real datatype as an unlimited precision and range floating point number where bound comparisons are done lexically.
2) Derive decimal from real and integer from decimal.

3) If you really feel compelled to enable constraining reals to the ranges of IEEE double and float, then add the following lexicals to the lexical space for real:

    +Double.MAX_VALUE
    +Double.MIN_VALUE
    -Double.MAX_VALUE
    -Double.MIN_VALUE
    +Float.MAX_VALUE
    +Float.MIN_VALUE
    -Float.MAX_VALUE
    -Float.MIN_VALUE

The lexicals only help. If you wanted to constrain to another type, all you would have to do is type out the lexical representation of those values in your bounds. The named lexicals are equivalent to the full precision lexical representations of the IEEE boundaries.

Then to restrict a datatype to the range of double (and still allow Infinity and NaN), you would do something like

    <datatype name="double" source="real">
      <!-- note: might be better to replace not with nor and nand -->
      <not>
        <!-- this clause would only be true -->
        <or>
          <and>
            <minExclusive value="-Infinity"/>
            <maxExclusive value="-Double.MAX_VALUE"/>
          </and>
          <and>
            <minExclusive value="+Double.MAX_VALUE"/>
            <maxExclusive value="+Infinity"/>
          </and>
        </or>
      </not>
    </datatype>

If you wanted to restrict a value to be greater than zero even after rounding (and allowing +NaN), it could be enforced by

    <datatype name="positive-double" source="real">
      <not>
        <maxExclusive value="+Double.MIN_VALUE"/>
      </not>
    </datatype>

I would really prefer you not make double and float a generated class in the schema for schemas, but you could put in an appendix that shows how you can constrain a datatype to a particular application datatype.

One reason that I'd avoid putting implementation-specific types that low in the hierarchy is that I envision that you will want to use the inheritance hierarchy to classify the interpretation of the type, not its machine implementation.
For example,

    <datatype name="length" source="real">
      <annotation><info type="description">Length in meters.</info></annotation>
    </datatype>

    <datatype name="altitude" source="length">
      <annotation><info type="description">Altitude in meters.</info></annotation>
    </datatype>

    <datatype name="altitudeRelativeToGround" source="altitude">
      <annotation><info type="description">Altitude in meters relative to the ground.</info></annotation>
      <minInclusive value="0"/>
    </datatype>

    <!-- finally we restrict this to a specific implementation (which is totally silly) but if you feel you must -->
    <datatype name="floatAltitudeRelativeToGround" source="altitudeRelativeToGround">
      <not>
        <or>
          <and>
            <maxExclusive value="+Infinity"/>
            <minExclusive value="+Float.MAX_VALUE"/>
          </and>
          <and>
            <maxExclusive value="-Float.MAX_VALUE"/>
            <minExclusive value="-Infinity"/>
          </and>
        </or>
      </not>
    </datatype>
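The bounds-check concern raised earlier (a minExclusive of 0 admitting "1e-500") can be checked with a small sketch. It uses Python's arbitrary-precision Decimal as a stand-in for a lexical comparison, which is an assumption: a real validator might compare the strings directly. The function names are hypothetical, and "4.9E-324" is the shortest decimal form that Java prints for Double.MIN_VALUE, not its exact value.

```python
from decimal import Decimal

# Shortest printed form of the smallest positive IEEE double (Java's
# Double.MIN_VALUE); used here as an illustrative bound, not the exact value.
SMALLEST_POSITIVE_DOUBLE = "4.9E-324"


def passes_min_exclusive(lexical, bound="0"):
    """Exact comparison on the decimal value, standing in for a lexical comparison."""
    return Decimal(lexical) > Decimal(bound)


def passes_after_double_conversion(lexical, bound="0"):
    """Comparison after both sides are converted to native doubles."""
    return float(lexical) > float(bound)


# The exact comparison accepts the value because 1e-500 is greater than 0.
print(passes_min_exclusive("1e-500"))                            # True
# After conversion the value underflows to 0.0, so a later division would fail.
print(passes_after_double_conversion("1e-500"))                  # False
# Moving the bound to the smallest positive double rejects the value exactly.
print(passes_min_exclusive("1e-500", SMALLEST_POSITIVE_DOUBLE))  # False
```

The third call mirrors the suggestion of placing the bound at the minimum non-zero value of the target binary type, so that the exact comparison and the converted value agree.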
Received on Thursday, 13 January 2000 16:20:49 UTC