Re: ISSUE-126 (Revisit Datatypes): A new proposal for the real <-> float <-> double conundrum from Bijan Parsia on 2008-07-04 (public-owl-wg@w3.org from July 2008)

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Fri, 4 Jul 2008 15:53:23 +0100
To: "Boris Motik" <boris.motik@comlab.ox.ac.uk>
Cc: "'OWL Working Group WG'" <public-owl-wg@w3.org>
Message-Id: <2FFE3324-ABEE-4D7C-BA69-D661E3EA3539@cs.man.ac.uk>

On 4 Jul 2008, at 15:18, Boris Motik wrote:

> OK, so what is the case for keeping the interpretation finite?

I believe I've already made the case. The case rests on two points:
	1) Semantically, they are finite.
	2) Implementations already have to deal with large finite datatypes  
via user defined ranges on integers. Thus, this makes nothing *in  
principle* worse.

Now, having such a datatype built in raises an affordance toward  
dangerous practice. But I'd rather handle that with best practices  
and advice, rather than mucking with the semantics in absence of  
extensive experience.

> I really believe this is difficult to implement correctly,

I've no doubt of that whatsoever. I've conceded it.

> and I repeat my example (slightly modified).
>
> (1) PropertyRange( a:prop
>         DatatypeRestriction( xsd:float
>             minExclusive "n1"^^xsd:float
>             maxExclusive "n2"^^xsd:float
>         )
>     )
> (2) n1 is a constant that corresponds to the number 1 * 2^-149
> (3) n2 is a constant that corresponds to the number 3 * 2^-149
>
> (4) ClassAssertion( MinCardinality( 2 a:prop rdfs:literal) a:i )
>
> This ontology is unsatisfiable: the range of a:prop contains only  
> one object, but (4) requires existence of two different objects.
> The difficulty in detecting this is that you need to count how many  
> numbers are there between n1 and n2. How are you going to do that?  
> The binary representation of floats is really cumbersome to deal with.

Er...floats (and doubles) are inherently binary.
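
As an illustrative sketch only (not something proposed on this thread): for non-negative IEEE 754 single-precision values, the bit patterns read as unsigned integers sort in the same order as the values they encode, so counting the floats strictly between n1 and n2 reduces to an integer subtraction. The helper names `float32_bits` and `count_strictly_between` are hypothetical, and the sketch assumes both bounds are finite, non-negative, and exactly representable as xsd:float.

```python
import math
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def count_strictly_between(lo: float, hi: float) -> int:
    """Count single-precision values v with lo < v < hi.

    Assumption (for this sketch): 0 <= lo <= hi, both finite, neither is
    negative zero, and both are exactly representable in single precision.
    """
    if math.copysign(1.0, lo) < 0 or not (lo <= hi) or math.isinf(hi):
        raise ValueError("sketch only handles finite bounds with 0 <= lo <= hi")
    # Adjacent non-negative floats have adjacent bit patterns.
    return max(float32_bits(hi) - float32_bits(lo) - 1, 0)


n1 = 1 * 2.0 ** -149  # smallest positive subnormal single-precision value
n2 = 3 * 2.0 ** -149
print(count_strictly_between(n1, n2))  # 1
```

Under those assumptions the range in (1) contains exactly one value, 2 * 2^-149, which is why (4) cannot be satisfied. Handling negative bounds would need an extra mapping step for the sign bit, which is left out here.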

I'm confused. The floats have (at least) three issues:
	1) Exactness of operations (irrelevant since we don't have equations)
	2) Finiteness of the set (fixed range and discrete)
	3) Difficulties with the representation (which I always associated  
with 1)

2 always struck me as the big problem. I pointed out that we already  
have this problem with integers and user defined types. I'm confused  
by your new point.

Please clarify whether it's 2 or 3 you are concerned with. If 2,  
please explain how it's different from other finite type ranges.

> I firmly believe that, if we stick to the continuous

Discrete, I believe you mean.

> implementation of floats, NO reasoner on this earth will implement  
> this test
> case correctly. I am open to others showing me wrong.
>
> Furthermore, I firmly believe that such inferences are irrelevant  
> for practice.

We have a continuous floating point type already (decimal). We can  
always define a fixed range (with < and > facets) and a length (with  
the digit limiting thing), so we can always recover a float-like type  
from decimals.
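
A minimal sketch of that idea, using Python's decimal module as a stand-in: membership in a "float-like" restricted decimal type is a range check plus a cap on the number of digits. The function names, and the simplified digit count used here, are assumptions made for illustration; they do not reproduce the exact XML Schema facet semantics.

```python
from decimal import Decimal


def significant_digits(d: Decimal) -> int:
    """Digits needed to write a finite d with no exponent and no trailing
    fractional zeros, e.g. 1.50 -> 2 and 100 -> 3 (simplified count)."""
    _sign, digits, exponent = d.normalize().as_tuple()
    return len(digits) + max(exponent, 0)


def in_restricted_decimal(d: Decimal, lo: Decimal, hi: Decimal,
                          total_digits: int) -> bool:
    """True if lo < d < hi and d fits within total_digits digits.

    lo, hi and total_digits play the role of the range and digit-limiting
    facets; the names are hypothetical.
    """
    return lo < d < hi and significant_digits(d) <= total_digits


lo, hi = Decimal("0"), Decimal("10")
print(in_restricted_decimal(Decimal("1.25"), lo, hi, 3))    # True
print(in_restricted_decimal(Decimal("1.2345"), lo, hi, 3))  # False
```

With both a range and a digit cap in place the resulting type has finitely many values, which is the sense in which it is float-like.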

Given that we have a continuous floating point type already, the  
reason for picking float can rest on two bits:
	The weird extra constants (e.g., NaN)
	The discreteness.

These two things are not separate, of course. The discreteness  
(combined with inexactness of representation) is a driver for the  
constants.


> Finally, I don't see a point in producing a spec for which we know  
> nobody will implement correctly and that is irrelevant for
> practice.

I know neither of these things. I suspect it's likely. But I'd rather  
guide people away from using floats than to distort their meaning. As  
I said, I can imagine wanting to reason about floats as floats (e.g.,  
for analyzing certain computational processes). I don't know if that  
would work out or not, frankly, but given that we have ways to work  
around floats (e.g., 'use decimal, people!') I don't see the point of  
mucking with the semantics.

> The simple solution

It's a simple solution, certainly.

> is to make floats continuous and be done with it. Implementations  
> become trivial and users will never notice the
> difference anyway (partly because they won't care and partly  
> because implementations will assume a continuous interpretation
> anyway).

I'm uncomfortable with this because it's based on some (plausible)  
assumptions that I don't know are true. I'd rather encourage people  
to use the more feasible types, or to use floats in non-dangerous ways.

If I were designing things from the ground up, I wouldn't have  
included floats at all.

Cheers,
Bijan.
Received on Friday, 4 July 2008 14:51:10 UTC