- From: Carsten Lutz <clu@tcs.inf.tu-dresden.de>
- Date: Wed, 14 Nov 2007 20:09:00 +0100 (CET)
- To: Jeremy Carroll <jjc@hpl.hp.com>
- Cc: public-owl-wg@w3.org
On Tue, 13 Nov 2007, Jeremy Carroll wrote:

> 7) rounding errors behave very differently from in traditional numeric
> applications - hence a solved problem (rounding) becomes an unsolved problem
>
> (I will also add another area of concern which is a mismatch between the
> real numbers and the XSD datatypes)

===

This is an interesting subject. I don't think it is really about "rounding", though, and neither is it really about n-ary datatypes. I think what your example shows is that using bounded or fixed-precision datatypes such as float or decimal in the semantics of an ontology language is a bad idea. This means that already the unary datatypes in OWL 1.0 are in some sense broken, and the XML Schema datatypes are probably *not* the right thing to use in OWL. Sorry if this sounds like blasphemy. Let me explain in more detail what I mean, singling out a number of subjects.

----------------------------------------------------------------------
1. Not about rounding

Take your example:

> An example from [Pan and Horrocks] declares that the Yangtze river is 3937.5
> miles long and uses the kmtrsPerMile predicate to deduce that it is also
> 6300.0 km long. In other words,
>
>   ( 6300.0, 3937.5 ) in [kmtrsPerMile].
>
> This example uses the XML Schema datatype float to represent lengths. Suppose
> that the Yangtze was declared instead to be 3937.501 miles long, then
>
>   ( 6300.0015, 3937.501 ) in [kmtrsPerMile]
>
> so the Yangtze river may be deduced to be 6300.0015 km long. However,
>
>   ( 6300.0015, 3937.5007 ) in [kmtrsPerMile]
>
> so that the Yangtze river may also be deduced to be 3937.5007 miles long.
> This would be inconsistent with the user's expectation that a river has only
> one length.

I think what happens here is indeed bad, but it is something different from what is claimed.
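[As an illustrative aside, the precision mismatch can be checked mechanically. This is only a sketch, not part of the argument: it uses numpy's float32 as a stand-in for xsd:float and an exact rational 8/5 for the conversion factor; the variable names are ad hoc.]

```python
# Sketch: the exact 1.6x product of a float32 "length in miles" is
# usually not itself a float32, so a floats-only kmtrsPerMile relation
# must either round or leave a gap.
from fractions import Fraction
import numpy as np

miles = np.float32(3937.501)                         # nearest float32 to 3937.501
exact_km = Fraction(float(miles)) * Fraction(8, 5)   # exact product with 1.6
km = np.float32(float(exact_km))                     # nearest float32 to that product

# The exact product is not representable as a float32.
assert Fraction(float(km)) != exact_km
```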
Namely, as far as I can see, rounding is not explicitly addressed in the semantics of OWL 1.0 and in XML Schema, and neither in the proposed semantics of OWL 1.1 (correct me if I'm wrong). Thus, rounding does *not take place*. Instead, there is a "gap" in the datatype. If reasoners do rounding, they actually violate the semantics.

Let's make your example a bit more precise. Assume Yangtze is an individual and

(a) I use a unary datatype predicate "=3937.501" on the datatype property "lengthInMiles", i.e., I say that Yangtze is connected to the concrete object 3937.501 via "lengthInMiles";

(b) now I use a binary datatype predicate "milesToKm" on Yangtze, with first argument "lengthInMiles" and second argument "lengthInKm". I do this using a dataPropertyAssertion, which has an existential semantics.

Thus, (b) stipulates that there *is* a float that corresponds to 3937.501 in kms, but in fact there isn't (because we would need to do rounding to make it a legal float, which we don't). What do we get? An inconsistency. This is different from, but not much better than, two different values for the length.

----------------------------------------------------------------------
2. Not about n-ary datatypes

The point of your example above is that the fixed precision of float produces unexpected results, namely an inconsistency. This already happens with unary datatypes, and even with those that are definable in XML Schema.

Take e.g. float. Let n be the largest float that is smaller than 1. Such a float exists since there are only finitely many floats. Now assume that a user defines two unary predicates, one that is true for all floats strictly greater than n, and one for all floats strictly smaller than 1. Asserting the existence of a data value in the intersection of these two predicates leads to an inconsistency.
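[The emptiness of that intersection can be seen directly; again a sketch only, using numpy's float32 in place of xsd:float.]

```python
# Sketch: let n be the largest float32 strictly below 1.0.  Its float32
# successor is 1.0 itself, so no float32 satisfies both "> n" and "< 1".
import numpy as np

one = np.float32(1.0)
n = np.nextafter(one, np.float32(0.0))   # largest float32 < 1.0

assert n < one                            # n really is below 1
assert np.nextafter(n, np.float32(2.0)) == one  # nothing lies in between
```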
But this is not what a user expects, since (s)he is working with an ontology, i.e., doing *conceptual modelling*, so she should abstract away from details such as the representation of numbers and think of floats as rationals or reals. These are dense, so the user expects the above intersection to be non-empty.

So, we get unexpected results already with unary datatypes. The issue here is not the arity. It is the boundedness / fixed precision of the XML Schema datatypes.

----------------------------------------------------------------------
3. XML Schema is not a good choice for defining datatypes

XML Schema is a schema language for XML, i.e., it describes semi-structured data stored in the form of an XML document. It is very good for that purpose, because if you store data, it is important to consider the details of storage, which usually involves boundedness, fixed precision, and rounding.

*We* are *not* defining a schema language for (stored) data in the sense of XML Schema. So it is a valid question whether or not the XML Schema datatypes are also good for our (different) purposes. I believe they are not.

We are defining an ontology language with a declarative semantics. As the above examples show (and there are tons more), we get all sorts of oddities from the combination of (i) an expressive logic that has a declarative semantics and (ii) the bounded datatypes of XML Schema. As the literature on concrete domains in description logics shows, there are no such problems if you work with unbounded datatypes such as the integers or the rationals (the reals are problematic for reasons that are not related to boundedness or unboundedness; not to be discussed here).

I agree with Jeremy that these problems, which are already present in OWL 1.0, get more relevant when switching from unary to n-ary datatypes.
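[The contrast with an unbounded, dense datatype can be illustrated the same way; a sketch with exact rationals, with an arbitrarily chosen rational below 1.]

```python
# Sketch: over the rationals the analogous intersection is satisfiable,
# because between any two distinct rationals there is always a third.
from fractions import Fraction

n = Fraction(999_999, 1_000_000)   # some rational strictly below 1
mid = (n + 1) / 2                  # the midpoint lies strictly between

assert n < mid < 1
```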
Still, it seems strange to argue against n-ary datatypes based on a problem that is present already with unary datatypes and that, when defining OWL 1.1, we have the chance to fix.

greetings,

Carsten

--
*  Carsten Lutz, Institut für Theoretische Informatik, TU Dresden  *
*  Office phone: ++49 351 46339171    mailto:lutz@tcs.inf.tu-dresden.de  *
Received on Wednesday, 14 November 2007 19:09:19 UTC