Re: ISSUE-126 (Revisit Datatypes): A new proposal for the real <-> float <-> double conundrum

>> Most importantly, I do not think there is necessarily a direct  
>> correlation between the lexical representations used to represent  
>> particular values and the value spaces in which those particular  
>> values live. I.e. users want to be able to specify particular  
>> values within the `real` value space using `xsd:float`,
>
> You mean the type name or the lexical syntax (e.g., "12.78e-2")?

XSD offers a lexical syntax for points that happen to lie on the real  
number line---that's what I suggest using it for. The easiest approach
is to say that xsd names on their own are not valid "datatypes"; particular
values encoded using xsd, however, are (because particular values are  
single-element value spaces).

> I'm personally more comfortable with allowing the latter than  
> pushing "xsd:float" as a synonym for the real value space. Your  
> mileage obviously varies.
>
>> but they do *not* have any interest in use of the `xsd:float` value  
>> space.
>
> Some do at least to the extent of wanting NaN (and perhaps -0). I'd  
> personally prefer not to shove them into the real type (certainly  
> NaN; I suppose we could make our reals the affine reals and handle  
> +inf).

I'd endorse including only one zero, but I agree there's an issue with  
NaN. My principled stand is that it's inconsistent (a value space of  
size zero), but I'd definitely want to analyze the use cases to see  
who loses important functionality from that decision.

But my main point is that users have no interest in the "holes"  
introduced by the xsd:float value space: providing them access to a  
value space of numbers representable in float representation is not  
useful, and could lead to lots of confusion, particularly if users  
could easily use such a space "by accident". That's the situation  
we've fallen into with floats in OWL 1.0.
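
To make the "holes" concrete, here is a minimal Python sketch. It assumes the `xsd:float` value space can be modelled as IEEE 754 binary32, and the helper name `nearest_float32` is mine, not anything from a spec or a reasoner. It shows that the point 1/10 is not in that value space; only a nearby neighbour is.

```python
import struct
from fractions import Fraction

def nearest_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 binary32 value.

    Assumption: binary32 is used here as a stand-in for the xsd:float
    value space.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

one_tenth = Fraction(1, 10)               # the point on the real line the user means
stored = Fraction(nearest_float32(0.1))   # exact value of what binary32 can hold

print(stored == one_tenth)   # False: 1/10 falls in a hole
print(stored)                # 13421773/134217728
```

A user who writes 0.1 almost certainly means the first point, not the second.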

>> Thus we've got two orthogonal concepts which happen to coincide for  
>> strings and integers but not for real numbers.
>>
>> My proposed solution would be to use brand-new OWL names for all  
>> value spaces, but use xsd syntax to specify particular values.
>
> Could you say what you think the lexical space of the reals should  
> include?

I don't know what you mean by "lexical space of the reals". I don't  
propose defining the reals lexically; I propose defining the value  
space mathematically. But implementations should allow users to  
specify particular points in that value space using the lexical  
representations for `xsd:float` and `xsd:int` values. I expect most  
implementations will also support points represented as `xsd:double`  
and `xsd:long`. I do *not* think a conformant implementation
should have to deal with arbitrary points represented as `xsd:decimal`  
(since the vast majority of users don't need the extra  
representational power, and there is substantial implementation burden  
and performance penalty for dealing with such values correctly).
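
As a rough illustration of "lexical form in, point on the real line out", here is a Python sketch. The function name `real_point` is hypothetical, and the sketch makes one loud assumption: a lexical form such as "12.78e-2" is read as the exact decimal it spells out, not rounded to the nearest float first (the other possible reading). It does not validate the full XSD lexical grammar, and it uses arbitrary-precision arithmetic purely for clarity, which is exactly the cost a real implementation might want to avoid.

```python
from decimal import Decimal, InvalidOperation
from fractions import Fraction

def real_point(lexical: str) -> Fraction:
    """Map a numeric lexical form to an exact point on the real line.

    Hypothetical helper: NaN and the infinities are rejected because they
    are not points on the real line, and -0 collapses to the single zero.
    """
    try:
        d = Decimal(lexical.strip())
    except InvalidOperation:
        raise ValueError(f"not a numeric lexical form: {lexical!r}")
    if not d.is_finite():
        raise ValueError(f"not a point on the real line: {lexical!r}")
    return Fraction(d)

print(real_point("12.78e-2"))        # 639/5000
print(real_point("42"))              # 42
print(real_point("-0") == 0)         # True
```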

> At least, as a first cut? (It seems decimal, scientific, and  
> rational notation would all be useful, the first two for common ways  
> of writing and the third for full coverage of the rationals.)

The WG should consider that some implementations might allow lots of  
xsd syntaxes but lose precision on some of them (allowing
`xsd:decimal` in ontology files for user convenience, but converting the
values to floats during parsing)---thus a vocabulary for what it means to
"support" a numeric xsd type for particular values would be useful. My  
big concern here is that an ontology will be developed and tested with  
a reasoner with "full" `xsd:decimal` support but then when it's used  
with an implementation with "imprecise" `xsd:decimal` support  
everything goes pear-shaped. Spitting out warnings during parsing  
isn't a great solution...
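
A small Python sketch of the failure mode I have in mind. It uses Python's built-in `float` (binary64) as a stand-in for whatever an "imprecise" implementation converts to during parsing; the two literals are made up for the example.

```python
from decimal import Decimal

# Two different xsd:decimal-style literals (invented for illustration).
a = "0.1"
b = "0.10000000000000000001"

# "Full" support: the two literals denote different points.
print(Decimal(a) == Decimal(b))   # False

# "Imprecise" support: both are converted to the same float at parse time.
print(float(a) == float(b))       # True
```

An ontology that depends on those two values being distinct behaves one way under the first implementation and another way under the second.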

And of course some implementations might offer additional value spaces  
as well, but I'd like the spec to make it very clear that this is a  
very different thing than the above. For one thing, I'd suggest  
outlawing any use of names within the xsd namespace for value spaces,  
even spaces implementors have added as extensions. "Support for  
`xsd:decimal`" should mean `xsd:decimal` syntax for points on the real  
number line and nothing else.

-rob

Received on Sunday, 6 July 2008 19:08:05 UTC