- From: Rob Shearer <rob.shearer@comlab.ox.ac.uk>
- Date: Thu, 10 Jul 2008 15:31:57 +0100
- To: Michael Smith <msmith@clarkparsia.com>
- Cc: public-owl-wg@w3.org
- Message-Id: <1A7509EC-9453-4A84-AC79-9565E60C6C61@comlab.ox.ac.uk>
> I was concerned about the large number of possible constants (e.g.,
> "0.01") for which there is not a directly corresponding value in the
> float value space. [2] indicates to me that for some those, there are
> alternative lexical to value space mappings, which could cause
> problems.

Let's be a bit more explicit about the exact problems we're facing, though.

Suppose the decimal strings a and b encode numeric values, and that IEEE-754-compliant rounding model X would interpret both as the floating-point value c, while a different IEEE-754-compliant rounding model Y would encode a as c but would encode b as the (different) floating-point value d. Then the class:

    (forall R (= "a"^^xsd:float)) and (exists R (= "b"^^xsd:float))

would be satisfiable in implementation X but not in implementation Y.

Two notes on this:

1. You obviously need ambiguous values for this problem to arise, and the IEEE spec only allows ambiguity for pretty wacky numbers. Anything representable (not just represented, but representable) in decimal as ±M × 10^{±N} for M < 10^9 and N < 14 is unambiguous (M < 10^17 and N < 28 for double-precision), so users need to type a lot of digits before they start experiencing the problem.

2. A single ambiguous value does not in itself produce a problem. Problems only arise in ontologies which use one ambiguous value as well as other values within the range of IEEE-754-legal interpretations of that value. Given that the IEEE-754-legal range is quite small (a limited error is allowed only in the least significant digit of the destination type, which I read as the last bit of the float), this issue is unlikely to arise very often.

I agree that this situation kind of sucks, but I don't see any viable alternatives. The non-viable ones I can come up with are:

1. Disallow floating-point numbers entirely. This seems like a non-starter: the vast majority of scientific data makes use of such numbers.

2. Only allow unambiguous floating-point representations. This seems a clear violation of the XML Schema semantics. What is more, the implementation burden seems high; libraries usually don't have this functionality. I wouldn't know how to implement this, and I strongly suspect many implementors would either not implement it at all or get it wrong.

3. Impose explicit rounding rules above and beyond IEEE-754. This would break every floating-point implementation in existence. I would encourage Oxford to object to such a model, and I can't imagine such a proposal passing a vote of the AC.

While I'd love to hear any "real" solutions, this does seem like the kind of thing appropriate for a lint-like tool. It's a very rare and complex situation that would be hard to enshrine in the specification, but a tool could easily warn users of potentially problematic use, and the tool could even easily repair this use by rewriting values to unambiguous values.

Other suggestions?

-rob
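
[Editorial sketch, not part of the original message.] The lint-like check suggested above might look roughly like the following. It only tests the bounds quoted in the message (M < 10^9 and N < 14 for xsd:float, M < 10^17 and N < 28 for xsd:double). The function name `is_unambiguous`, the `BOUNDS` table, and the treatment of INF and NaN are assumptions made for illustration. A literal that fails the check is only possibly ambiguous, so a tool would warn rather than reject.

```python
from decimal import Decimal

# Bounds quoted in the message: (exclusive limit on M, exclusive limit on N).
BOUNDS = {"float": (10**9, 14), "double": (10**17, 28)}


def is_unambiguous(literal: str, kind: str = "float") -> bool:
    """Return True if `literal` is representable as +/-M x 10^(+/-N) within the bounds."""
    max_m, max_n = BOUNDS[kind]
    d = Decimal(literal)  # exact: the constructor does not round
    if not d.is_finite():
        return True  # assumption: INF and NaN involve no decimal rounding
    _sign, digits, exponent = d.as_tuple()
    m = int("".join(map(str, digits)))
    if m == 0:
        return True
    # Strip trailing zeros so that M is as small as possible
    # ("representable", not just "represented").
    while m % 10 == 0:
        m //= 10
        exponent += 1
    # A large positive exponent can be traded for trailing zeros on M.
    if exponent >= max_n:
        m *= 10 ** (exponent - (max_n - 1))
        exponent = max_n - 1
    return m < max_m and abs(exponent) < max_n


if __name__ == "__main__":
    for text in ["0.01", "0.1234567891", "1e15", "1e-20"]:
        status = "ok" if is_unambiguous(text) else "possibly ambiguous"
        print(f'"{text}"^^xsd:float: {status}')
```

The literal is parsed with `decimal.Decimal` rather than `float` so that the digit count is examined exactly as the user typed it, before any binary rounding takes place.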
Received on Thursday, 10 July 2008 14:32:34 UTC