- From: Jeremy Carroll <jjc@hpl.hp.com>
- Date: Tue, 13 Nov 2007 14:44:43 +0000
- To: public-owl-wg@w3.org
7) rounding errors behave very differently from those in traditional numeric applications - hence a solved problem (rounding) becomes an unsolved problem

(I will also add another area of concern, which is a mismatch between the real numbers and the XSD datatypes.)

===

Looking at the Racer documentation, Racer seems to work with real numbers. OWL, however, works with the XSD variants, which approximate real numbers in several standard ways, for example as IEEE floating point numbers in 32 (or 64) bits.

When thinking about arithmetic in a knowledge representation form, we may convert miles to kilometers by multiplying by 1.6. We may think of such a conversion as a bijection. However, if we are representing miles and kilometers by xsd:float (as in, say, the Pan and Horrocks paper), this all falls to the floor in an ugly mess. The largest float value for miles has no corresponding kilometer value; similarly, the smallest kilometer value has no corresponding mile value. A little thought shows that either the conversion is sparse, with many values being unconvertible (because they do not correspond to exact values), or the conversion is not one-to-one but many-to-one (or is it one-to-many), because the approximate nature of the intervals maps more than one kilometer value to the same mile value.

In this way, many of our intuitions fail, and the underlying logic of the reasoner also fails, in ways that confuse intelligent people who understand the area very well (I think the Turner and Carroll critique of the Pan and Horrocks mile-to-kilometer conversion illustrates this). This is likely to be very confusing for end users.

If we convert from one unit to another, and then to a third, we also need to consider associativity. IEEE arithmetic has non-associative multiplication, which gives problems.

In summary, IEEE arithmetic is designed for procedural purposes, not declarative ones.
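The associativity point can be checked directly. What follows is a minimal sketch, not taken from the Racer documentation or from either paper: it emulates xsd:float by rounding Python doubles to IEEE 32-bit values with the standard struct module, and it assumes the illustrative factors 1.6 km per mile (from the text above) and 3 miles per league. The names f32, via_miles and via_composed are hypothetical.

```python
import struct

def f32(x):
    """Round a Python float (IEEE double) to the nearest IEEE 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Assumed conversion factors, stored as 32-bit floats.
KM_PER_MILE = f32(1.6)        # factor used in the text above
MILES_PER_LEAGUE = f32(3.0)   # assumption, for illustration only

def via_miles(leagues):
    """leagues -> miles -> km, rounding to 32-bit float after each step."""
    miles = f32(leagues * MILES_PER_LEAGUE)
    return f32(miles * KM_PER_MILE)

def via_composed(leagues):
    """leagues -> km using a single precomputed km-per-league factor."""
    km_per_league = f32(MILES_PER_LEAGUE * KM_PER_MILE)
    return f32(leagues * km_per_league)

# Whole-league lengths for which the two groupings of the product disagree.
mismatches = [n for n in range(1, 1001)
              if via_miles(float(n)) != via_composed(float(n))]

print(len(mismatches), "of 1000 inputs differ; first:", mismatches[0])
print(repr(via_miles(11.0)), repr(via_composed(11.0)))
```

Under these assumptions, a length of 11 leagues already yields two different kilometer values depending on how the multiplications are grouped, which is the kind of ambiguity a declarative reading of the conversion cannot tolerate.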
Doing declarative arithmetic in a web context, which involves interaction with legacy systems such as databases that use IEEE formats, is non-trivial.

Jeremy

PS As an appendix to this e-mail, I include the text of section 4.2 of Turner and Carroll (this is all Dave's work).

=======

An example from [Pan and Horrocks] declares that the Yangtze river is 3937.5 miles long and uses the kmtrsPerMile predicate to deduce that it is also 6300.0km long. In other words, ( 6300.0, 3937.5 ) in [kmtrsPerMile]. This example uses the XML Schema datatype float to represent lengths. Suppose that the Yangtze were declared instead to be 3937.501 miles long; then ( 6300.0015, 3937.501 ) in [kmtrsPerMile], so the Yangtze river may be deduced to be 6300.0015km long. However, ( 6300.0015, 3937.5007 ) in [kmtrsPerMile], so the Yangtze river may also be deduced to be 3937.5007 miles long. This would be inconsistent with the user’s expectation that a river has only one length.

As pointed out in [Pan and Horrocks], there are well over a hundred length units, and rounding errors caused by round-tripping values through all of the associated conversions can accumulate into significant errors. We implemented a system to do conversions between floats representing lengths in kilometers, meters, centimeters, millimeters, micrometers, inches, feet, yards, fathoms, poles, chains, furlongs, statute miles, leagues and nautical miles, and deduced the length of the Yangtze to be both 6335.3584km and 6361.8555km, and nearly 800,000 other values, starting from a declaration that its length in miles is 3937.5. These rounding errors were highly dependent on the structure of the definitions of the units, as multiplication in float is not associative, so scalar multiplication operators on float do not commute.
This lack of associativity also demonstrates that the (necessarily associative) composition of two datatypes like kmtrsPerMile and, say, milesPerLeague cannot be the same as the composition of the underlying arithmetic operations; again, this is likely to be inconsistent with a user’s expectations. In short, the behaviour of fixed-precision floating-point datatypes with arithmetic in OWL is likely to be a source of confusion amongst users.

Additionally, suppose the Volga river were declared to be 3668.8003km long. Then it would have no value for its lengthInMile property at all, since

    ( 3668.8000, 2293.0000 ) in [kmtrsPerMile]
    ( 3668.8005, 2293.0002 ) in [kmtrsPerMile]
    there is no x in float with 2293.0 < x < 2293.0002

Again, this situation would be contrary to the user’s expectation that one can always convert freely (albeit possibly inaccurately) between miles and kilometers. Notice that this cannot be remedied by using the arbitrary-precision decimal instead of the fixed-precision float: for example, the temperature 75.0F has no corresponding decimal representation in degrees Celsius, since (75 - 32) * 5/9 = 23.888... does not terminate.

In practice, many applications do not require the declarative style of arithmetic that datatypes like kmtrsPerMile would allow. Instead, a procedural approach is adequate. For example, a user may be happy that the Volga can be deduced to be 2293.0 miles long, and may be equally happy with 2293.0002 miles, as long as only one of the options is chosen. One method that has been used to achieve this is to embed conversion instructions as literals in an ontology [9], which makes it clear to a user that the semantics of arithmetic is separated from that of the DL.
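The float examples in the quoted section can be reproduced with a short sketch. It is an illustration under stated assumptions, not the system described above: it assumes that kmtrsPerMile relates a mile value m to the float nearest to m times the float nearest to 1.6, and it emulates xsd:float with the standard struct module. The helper names (f32, step, miles_to_km, mile_preimages) are hypothetical.

```python
import struct
from fractions import Fraction

def f32(x):
    """Round a Python float (IEEE double) to the nearest IEEE 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def step(x, n):
    """Return the 32-bit float n representable steps away from positive x."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + n))[0]

KM_PER_MILE = f32(1.6)  # assumed definition of the conversion factor

def miles_to_km(miles):
    """Assumed model of kmtrsPerMile: a single 32-bit float multiplication."""
    return f32(miles * KM_PER_MILE)

def mile_preimages(km, window=8):
    """All float mile values near km / 1.6 whose conversion is exactly km."""
    guess = f32(km / KM_PER_MILE)
    candidates = [step(guess, n) for n in range(-window, window + 1)]
    return [m for m in candidates if miles_to_km(m) == km]

# Yangtze: distinct mile values share a single kilometer value.
yangtze_km = miles_to_km(f32(3937.501))
print(yangtze_km, mile_preimages(yangtze_km))

# Volga: this kilometer value is not the image of any nearby mile value.
volga_km = f32(3668.8003)
print(volga_km, mile_preimages(volga_km))

# Decimal does not help either: 75.0F in Celsius is 215/9,
# whose decimal expansion never terminates.
celsius = (Fraction(75) - 32) * Fraction(5, 9)
print(celsius)
```

Under this model the relation is many-to-one around the Yangtze value and has a gap at the Volga value, matching the two failure modes described in the text. A different definition of the conversion factor would move the gaps but not remove them.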
Received on Tuesday, 13 November 2007 14:45:28 UTC