- From: Rob Shearer <rob.shearer@comlab.ox.ac.uk>
- Date: Thu, 10 Jul 2008 15:31:57 +0100
- To: Michael Smith <msmith@clarkparsia.com>
- Cc: public-owl-wg@w3.org
- Message-Id: <1A7509EC-9453-4A84-AC79-9565E60C6C61@comlab.ox.ac.uk>
> I was concerned about the large number of possible constants (e.g.,
> "0.01") for which there is not a directly corresponding value in the
> float value space. [2] indicates to me that for some those, there are
> alternative lexical to value space mappings, which could cause
> problems.
Let's be a bit more explicit about the exact problems we're facing,
though. Suppose the decimal strings a and b encode numeric values, and
that IEEE-754-compliant rounding model X would interpret both as the
floating-point value c, while a different IEEE-754-compliant rounding
model Y would encode a as c but would encode b as the (different)
floating-point value d. Then the class:
(forall R (= "a"^^xsd:float)) and (exists R (= "b"^^xsd:float))
would be satisfiable in implementation X but not in implementation Y.
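To make the scenario concrete, here is a minimal Python sketch of how two distinct decimal strings can land on one single-precision value. It assumes Python's own string-to-float conversion followed by a `struct` round trip, which is only one possible rounding model (and it rounds twice, via double precision). The helper names `to_float32` and `class_is_satisfiable` are made up for illustration and do not come from any reasoner.

```python
import struct

def to_float32(lexical):
    """Map a decimal string to a single-precision value (one possible model)."""
    # float() rounds to double precision; packing as 'f' rounds again to single.
    return struct.unpack("<f", struct.pack("<f", float(lexical)))[0]

def class_is_satisfiable(a, b, to_value=to_float32):
    """(forall R (= a)) and (exists R (= b)) needs a and b to denote one value."""
    return to_value(a) == to_value(b)

# Under this model both strings round to the same float32, so the class is satisfiable.
print(class_is_satisfiable("0.01", "0.0100000001"))
# These two strings round to different float32 values, so the class is not satisfiable.
print(class_is_satisfiable("0.01", "0.010000001"))
```

Passing a different `to_value` function stands in for a second implementation: the class flips between satisfiable and unsatisfiable exactly when the two mappings disagree on whether a and b collapse to one value.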
Two notes on this:
1. You obviously need ambiguous values for this problem to arise, and
the IEEE spec only allows ambiguity for pretty wacky numbers. Anything
representable (not just represented, but representable) in decimal as
±M × 10^{±N} for M < 10^9 and N < 14 is unambiguous (M < 10^17 and N <
28 for double-precision), so users need to type a lot of digits before
they start experiencing the problem.
2. A single ambiguous value does not in itself produce a problem.
Problems only arise in ontologies which use one ambiguous value, as
well as other values within the range of IEEE-754-legal
interpretations of that value. Given that the IEEE-754-legal range is
quite small (a limited error is allowed only in the least significant
digit of the destination type, which I read as the last bit of the
float), this issue is unlikely to arise very often.
I agree that this situation kind of sucks, but I don't see any viable
alternatives. The non-viable ones I can come up with are:
1. Disallow floating-point numbers entirely. This seems like a non-
starter---the vast vast majority of scientific data makes use of such
numbers.
2. Only allow unambiguous floating-point representations. This seems a
clear violation of the XML Schema semantics. What is more, the
implementation burden seems high; libraries usually don't have this
functionality. I wouldn't know how to implement this, and I strongly
suspect many implementors would either not implement it at all or get
it wrong.
3. Impose explicit rounding rules above and beyond IEEE-754. This
would break every floating-point implementation in existence. I would
encourage Oxford to object to such a model, and I can't imagine such a
proposal passing a vote of the AC.
While I'd love to hear any "real" solutions, this does seem like the
kind of thing appropriate for a lint-like tool. It's a very rare and
complex situation that would be hard to enshrine in the specification,
but a tool could easily warn users of potentially problematic use, and
the tool could even easily repair this use by rewriting values to
unambiguous values.
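The warning half of such a tool could be sketched roughly as follows. This is only an illustration: it tests the bounds quoted in note 1 (M < 10^9 and N < 14 for float, M < 10^17 and N < 28 for double) and takes them on trust. The names `LIMITS` and `is_unambiguous` are hypothetical, and the repair step is not attempted here.

```python
from decimal import Decimal

# Assumed limits, taken from note 1: (max decimal digits in M, max N).
LIMITS = {"float": (9, 13), "double": (17, 27)}

def is_unambiguous(lexical, kind="float"):
    """True if the literal can be written as +/-M x 10^(+/-N) within the limits."""
    max_digits, max_n = LIMITS[kind]
    d = Decimal(lexical)  # exact: no binary rounding happens here
    if not d.is_finite():
        return True  # INF and NaN raise no rounding question
    _, digits, exponent = d.as_tuple()
    coefficient = int("".join(map(str, digits)))
    if coefficient == 0:
        return True
    # Strip trailing zeros so the coefficient is as short as possible.
    while coefficient % 10 == 0:
        coefficient //= 10
        exponent += 1
    n_digits = len(str(coefficient))
    if exponent <= 0:
        return n_digits <= max_digits and -exponent <= max_n
    # For a large positive exponent, move the excess powers of ten into M.
    shift = max(0, exponent - max_n)
    return n_digits + shift <= max_digits

for literal in ["0.01", "0.1234567891", "1e20", "1e-14"]:
    if not is_unambiguous(literal, "float"):
        print("warning: possibly ambiguous xsd:float literal:", literal)
```

A literal that fails the check is not necessarily a problem (see note 2); the check only flags candidates worth a second look.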
Other suggestions?
-rob
Received on Thursday, 10 July 2008 14:32:34 UTC