Re: ISSUE-126 (Revisit Datatypes): A new proposal for the real <-> float <-> double conundrum

On Jul 5, 2008, at 1:04 PM, Rob Shearer wrote:

>>> I'm providing you with my experience: every user I've ever spoken  
>>> to about this topic has wanted the real number line.
>>> They are used to using the xsd datatypes `float` and `double` to  
>>> represent number values, so they use these without values in OWL  
>>> to mean "some number".
>>
>> Do they mean bounded numbers? (i.e. with min and max sizes?) Do  
>> they distinguish between double and float? Do they care about  
>> NaNs? (Alan's users care about the latter.)
>
> Whether it's "forall R > 1.0^^xsd:float" or "forall R `xsd:float`"  
> they seem to intend a dense number line.

So you had user-defined restrictions on floats. Interesting.

> In the first case `float` is just the easiest way to specify the  
> value; in the second you can certainly argue that they should have  
> used `decimal`...but that's a pointless argument because my  
> reasoner didn't really support decimal.

That's interesting. I think part of what we need to do is select a set  
of sane datatypes to require. Strings, integers, and reals seem reasonable.

>>> My experience is that the use of xsd datatypes as value spaces in  
>>> OWL 1.0 causes users to write what they don't mean.
>>
>> For me, this would suggest removing them or enforcing them more  
>> clearly.
>
> I'd suggest removing them.

That's where I'm heading too.

>>> My experience is that *every* ontology using `xsd:float` and  
>>> `xsd:double` without values would be better off using  
>>> `xsd:decimal`, but that the user intent was "some real  
>>> number" (and I should note that I'm against requiring support for  
>>> `xsd:decimal` values).
>>
>> Values? Or the datatype? In OWL 1, all these types were optional  
>> and poorly specified and had no documentation whatsoever. Part of the  
>> goal here is to spec well and document clearly any types we require.
>
> I would like to use doubles internally to represent points on the  
> real number line.

For what lexical syntax?

> Some heterogeneous mix of internal representations is a pain. And I  
> seriously doubt that many users really care about the extra  
> representation power of `decimal`. It makes sense as an optional  
> feature reasoners can support, but it seems completely unnecessary  
> to require it in the spec---it's exactly the sort of thing I'd put  
> off implementing indefinitely until users asked for it.
>
> The reason `decimal` keeps coming up is just that it's dense.

That's true. But there are several issues floating about, including  
the possibility of interaction between floats and cardinality. It  
seems to me that for most users, that will be a rare occurrence, even  
accidentally. It certainly requires ranges of floats (since it's  
unlikely that the cardinalities required to cause a problem would be  
feasible anyway). E.g., if we had unbounded binary numbers then such  
floats would be no harder than integers.
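
To make the float/cardinality interaction concrete, here is a minimal sketch (not taken from any reasoner) that counts how many single-precision values lie in a closed range. The helper names are hypothetical, and it assumes both bounds are exactly representable, non-NaN float32 values. A range that holds infinitely many reals holds only finitely many floats, so a large enough minimum cardinality over a narrow range becomes unsatisfiable.

```python
import struct


def float32_to_ordinal(x: float) -> int:
    """Map a float32 value to an integer that preserves numeric order.

    Hypothetical helper. +0.0 and -0.0 both map to 0, so they count once.
    """
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    magnitude = bits & 0x7FFFFFFF
    # For negative floats a larger bit pattern means a smaller number,
    # so negate the magnitude to keep the mapping monotone.
    return -magnitude if bits & 0x80000000 else magnitude


def count_float32_between(lo: float, hi: float) -> int:
    """Count float32 values v with lo <= v <= hi.

    Assumes lo <= hi and that both are exactly representable as float32.
    """
    return float32_to_ordinal(hi) - float32_to_ordinal(lo) + 1


# The closed range [1.0, 2.0] holds 2**23 + 1 single-precision values.
print(count_float32_between(1.0, 2.0))

# Between 1.0 and the next float32 above it there are only the two endpoints,
# so "at least three distinct values in this range" cannot be satisfied with
# floats, although it is trivially satisfiable over the reals.
print(count_float32_between(1.0, 1.0 + 2**-23))
```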

> So are we using the xsd spec as an excuse to conflate density with  
> complex internal representations?

I don't think so.

[snip]
> Referring any user over to that spec to understand value spaces is  
> obnoxious and counter-productive:

We definitely don't intend to do that, I hope. Part of our current  
effort is to make sure we carefully document the types we require  
and/or sanction.

> even WG members seem to be having trouble grokking it. (And bravo  
> to anyone making the pedantic point that a particular value is a  
> degenerate value space.)
>
> I contend that OWL users only want a tiny tiny number of different  
> value spaces to play with: integers, strings, and reals.

I certainly agree that these are key. I think the group agrees too.  
The other types are something of a legacy.

> It is possible, however, that they will want a larger number of  
> ways to lexically represent particular values within these three  
> spaces.

This wouldn't surprise me at all.

> Most importantly, I do not think there is necessarily a direct  
> correlation between the lexical representations used to represent  
> particular values and the value spaces in which those particular  
> values live. I.e. users want to be able to specify particular  
> values within the `real` value space using `xsd:float`,

You mean the type name or the lexical syntax (e.g., "12.78e-2")? I'm  
personally more comfortable with allowing the latter than pushing  
"xsd:float" as a synonym for the real value space. Your mileage  
obviously varies.

> but they do *not* have any interest in use of the `xsd:float` value  
> space.

Some do at least to the extent of wanting NaN (and perhaps -0). I'd  
personally prefer not to shove them into the real type (certainly  
NaN; I suppose we could make our reals the affine reals and handle  
+inf).
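
As a small illustration of why these values sit awkwardly on a real number line, the sketch below shows how IEEE 754 doubles behave in Python. It only demonstrates IEEE 754 arithmetic as Python exposes it; XML Schema defines its own equality and ordering for these values, which may differ, so treat this as an analogy and not a statement about the xsd value spaces.

```python
import math

nan = float("nan")
neg_zero = -0.0

# NaN compares unequal to everything, itself included, so it has no
# position on an ordered number line.
nan_equals_itself = nan == nan

# -0.0 and 0.0 compare equal, yet the sign is still observable.
zeros_compare_equal = neg_zero == 0.0
neg_zero_sign = math.copysign(1.0, neg_zero)

# +inf, by contrast, is ordered above every finite value, which is why
# extending the reals with infinities is less disruptive than adding NaN.
inf_is_ordered = math.inf > 1.7e308

print(nan_equals_itself, zeros_compare_equal, neg_zero_sign, inf_is_ordered)
```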

> Thus we've got two orthogonal concepts which happen to coincide for  
> strings and integers but not for real numbers.
>
> My proposed solution would be to use brand-new OWL names for all  
> value spaces, but use xsd syntax to specify particular values.

Could you say what you think the lexical space of the reals should  
include? At least, as a first cut? (It seems decimal, scientific, and  
rational notation would all be useful, the first two for common ways  
of writing and the third for full coverage of the rationals.)
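
As a rough sketch of what accepting those three notations could look like, the hypothetical helper below maps decimal, scientific, and rational literals to one exact value using Python's `fractions.Fraction`. The function name is an assumption for illustration, and the approach only covers the rationals, not irrational reals.

```python
from fractions import Fraction


def parse_real_literal(text: str) -> Fraction:
    """Parse decimal ("0.1278"), scientific ("12.78e-2"), or rational
    ("639/5000") notation into an exact rational number.

    Hypothetical helper; it relies on the string forms that
    fractions.Fraction accepts and raises ValueError for anything else.
    """
    return Fraction(text)


# Three different lexical forms, one value.
decimal_form = parse_real_literal("0.1278")
scientific_form = parse_real_literal("12.78e-2")
rational_form = parse_real_literal("639/5000")
print(decimal_form == scientific_form == rational_form)

# Rational notation reaches values that no finite decimal can express.
one_third = parse_real_literal("1/3")
print(one_third)
```

The point of the design is that the lexical form is discarded after parsing: the value lives in a single space regardless of how it was written.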

[snipped lots of useful details]

Thanks very much for those. I find them extremely helpful.

> Thanks for the feedback.
>>
>> Cheers,
>> Bijan.
>
> And if you're going to request further comment from a member of the  
> public, could you please do it on a list to which the public can  
> post? Shifting back to the WG list excludes me from comment.

D'oh! Sorry. That was an accident. My apologies.

> (Which is fine if you don't address questions directly to me.)

Thanks again for the discussion.

Cheers,
Bijan.

Received on Sunday, 6 July 2008 18:07:13 UTC