Re: ISSUE-126 (Revisit Datatypes): A new proposal for the real <-> float <-> double conundrum

>> XSD offers a lexical syntax for points that happen to lie on the  
>> real number line
>
> It offers several and we're free to define one for owl:real. If we  
> use any decimal notation, we have exactness problems (e.g., 1/3),  
> but decimal is very user friendly. So, I was thinking that the valid  
> syntax for a real would be decimal floating points and ratios of  
> integers. We could include scientific notation as well.

Why on earth would the OWL group come up with their own syntax for  
encoding numbers? The XSchema guys have already done that, and people  
have implemented parsers for their spec. If there's going to be a  
syntax for rationals or algebraics, then that seems to be right up  
their alley.

>> But my main point is that users have no interest in the "holes"  
>> introduced by the xsd:float value space: providing them access to a  
>> value space of numbers representable in float representation is not  
>> useful, and could lead to lots of confusion, particularly if users  
>> could easily use such a space "by accident".
>
> Well, you'll get exactness holes with binary or decimal notation,  
> regardless of density issues.

I thought I had made my proposal clear on this: the value space does  
not have holes. The representations supported for particular values  
are not sufficient to address all the points in that space, but the  
space itself does *not* have holes.

>> I don't know what you mean by "lexical space of the reals".
>
> XSD datatypes have a lexical space (e.g., the syntax) and a value  
> space. You are suggesting, I thought, that we adopt a value space  
> that is the reals and something about using xsd syntax (i.e.,  
> lexical spaces) for the syntax.

For the syntax of particular values. I keep trying to stress that  
value spaces should be kept separate from the syntax used for  
particular values.

> XSD offers exact syntax only for binary and decimals (I believe it's  
> exact for binary). I was wondering what sort of lexical space you  
> want.

XSD offers a well-defined mapping from lexical representation to IEEE  
floats. XSD defines an *exact* value for each valid lexical  
representation. You may not like the way the mapping is defined  
(because the value of "1.1e0^^xsd:float" on the real number line is  
not equal to the value of "1.1^^xsd:decimal"), but there is no  
imprecision whatsoever about what each string represents. I am  
satisfied with the work the XSchema group did on floating-point  
lexical representations.
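
To make the "exact, just not the value you expected" point concrete, here is a rough Python sketch. It is not taken from any spec: `float32_value` is a name invented for illustration, and it assumes that rounding through Python's `struct` module to an IEEE single gives the same value as the XSD mapping for this literal.

```python
import struct
from fractions import Fraction

def float32_value(lexical: str) -> Fraction:
    """Exact real value of the IEEE single nearest to a decimal literal.

    Hypothetical helper: it rounds via a Python double first, which is
    fine for a short literal like "1.1e0" but is not a full XSD parser.
    """
    (single,) = struct.unpack("<f", struct.pack("<f", float(lexical)))
    return Fraction(single)  # every finite float is an exact rational

as_float = float32_value("1.1e0")   # the point named by "1.1e0"^^xsd:float
as_decimal = Fraction("1.1")        # the point named by "1.1"^^xsd:decimal

print(as_float)                # 9227469/8388608
print(as_float == as_decimal)  # False
print(as_float - as_decimal)   # 1/41943040
```

Both strings name one definite point each; the two points simply differ by 1/41943040.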

>> But implementations should allow users to specify particular points  
>> in that value space using the lexical representations for  
>> `xsd:float` and `xsd:int` values.
>
> So you want a very broad lexical space for our real type, i.e., "1",  
> "1.0",  and "12.78e-2".

No. I want `real` to be a value space with no lexical connotations.
I want to be able to specify a particular point in this value space  
using a string such as "1.0e0^^xsd:float".
The XSD lexical forms are not "the lexical space for reals". There is  
no such thing as "the lexical space for reals". There is such a thing  
as "the space of lexical representations which a conformant  
implementation must support for particular values in the real value  
space", but this space is much smaller than the real value space.

> If we want exactness for the rationals, we need either to allow  
> repeating (e.g., 0.333repeating) (usually done with a macron) or  
> fraction syntax (e.g., 1/3).

I don't intend to support exactness for rationals. A conformant  
implementation should only be required to provide exact support for  
`xsd:int` and `xsd:float` values.

>> I expect most implementations will also support points represented  
>> as `xsd:double` and `xsd:long` as well.
>
> You mean their syntax, i.e., their lexical space.

Supporting these syntaxes means that reasoners must also support  
reasoning with the particular values representable in those syntaxes.  
Support for additional syntaxes does not change the underlying  
semantics of the real number line, but it might make implementation of  
those semantics a bit harder.

> (Sorry for using the XSD terminology, but I think it's a bit clearer  
> if we stick to it for the moment.)
>
>> I do *not* think a conformant implementation should have to deal  
>> with arbitrary points represented as `xsd:decimal` (since the vast  
>> majority of users don't need the extra representational power, and  
>> there is substantial implementation burden and performance penalty  
>> for dealing with such values correctly).
>
> Given that more and more languages (e.g., Java) now bundle a decimal  
> type with their core libraries, I'm not so clear on the first.

I'm not sure Java is an example of "more and more languages". In fact  
it is the flagship "you only ever need one language" proposal. And  
even in super-OO Java you have to program differently if you're going  
to play with polymorphic numbers than you would if you stuck to ints  
and floats.

I'd like to write a distributed OWL reasoner in Erlang. But JavaScript  
and C are perhaps more persuasive counterexamples to your argument.

> I'd like to hear more about the second.

The most efficient bignum and decimal libraries are an order of  
magnitude slower than corresponding int and float calculations.  
Hardware is good with ints and floats.

>> ---thus a vocabulary for what it means to "support" a numeric xsd  
>> type for particular values would be useful.
>
> This is what we're after. Anything we spec will be tightly specced.  
> At the moment, we only have required and optional as modalities of  
> support. I think supporting various levels of precision  (or variant  
> mapping) would be quite hard to understand.

But presumably you're making clear that implementations which  
implement some "optional" functionality, but do so in a way which  
contradicts the optional semantics, are non-compliant. If so, then  
specifying what support for additional lexical representations means  
(i.e. exact) would make clear that a product which parsed  
`xsd:decimal` but internally converted to floating point would not  
"support `xsd:decimal`" by the terms of the OWL 2.0 spec. The  
implementors could always claim "partial support", however.
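
As a hedged illustration of why "parse as decimal, store as float" falls short of exact support, the Python sketch below compares two decimal literals both ways. The literals and variable names are made up for the example, and Python's `float` (an IEEE double) stands in for whatever floating-point type such a product might use internally.

```python
from decimal import Decimal

# Two distinct xsd:decimal lexical forms (chosen only for illustration).
a, b = "0.1", "0.1000000000000000000001"

# Exact decimal handling keeps the two values apart.
exact_distinct = Decimal(a) != Decimal(b)

# Converting to binary floating point collapses them onto the same double.
lossy_equal = float(a) == float(b)

print(exact_distinct)  # True
print(lossy_equal)     # True
```

A reasoner built the second way would conclude the two literals denote the same individual value, which contradicts the decimal semantics.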

> On this model, users would just have to decide between integers and  
> reals. We could have quite a wide lexical space for reals (and even  
> for integers, i.e., allow 1.0 to mean the integer 1).

I'm getting really confused about what you're talking about---constants  
appearing in XML and RDF OWL 2.0 files should be typed; there's no  
need at all to guess the type based on syntax.

And of course "1.0e0^^xsd:float" and "1^^xsd:integer" are exactly the  
same point on the real number line.
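
A small Python sketch of that idea, under stated assumptions: `point_on_real_line` is a hypothetical helper, it handles only three datatypes, it does no lexical validation, and it uses exact rationals to stand in for points on the real line.

```python
import math
import struct
from fractions import Fraction

def point_on_real_line(lexical: str, datatype: str) -> Fraction:
    """Map a typed literal to an exact point (hypothetical helper).

    The datatype, not the shape of the string, decides how it is read.
    """
    if datatype in ("xsd:integer", "xsd:int"):
        return Fraction(int(lexical))
    if datatype == "xsd:decimal":
        return Fraction(lexical)
    if datatype == "xsd:float":
        # Round to an IEEE single, then take its exact rational value.
        (single,) = struct.unpack("<f", struct.pack("<f", float(lexical)))
        if not math.isfinite(single):
            raise ValueError("INF and NaN are not points on the real line")
        return Fraction(single)
    raise ValueError(f"unsupported datatype: {datatype}")

one_float = point_on_real_line("1.0e0", "xsd:float")
one_integer = point_on_real_line("1", "xsd:integer")
print(one_float == one_integer)  # True
```

Different lexical forms and different datatypes, but a single point in the value space.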

> But "0.1"^^xsd:float would not be required, but also we wouldn't  
> change the meaning along the lines you suggest (we'd just be silent  
> about it). It's fairly simple to migrate old ontologies to the new  
> one with a simple converter. If enough implementations did it  
> silently, that would be information for a future group.

No idea what this means. But I'm guessing I disagree with it.

-rob

Received on Sunday, 6 July 2008 21:56:13 UTC