Re: [DTB] issues with numeric casting/built-in functions from Jos de Bruijn on 2009-05-04 (public-rif-wg@w3.org from May 2009)

From: Jos de Bruijn <debruijn@inf.unibz.it>
Date: Mon, 04 May 2009 21:01:22 +0200
To: Axel Polleres <axel.polleres@deri.org>
CC: Sandro Hawke <sandro@w3.org>, RIF <public-rif-wg@w3.org>
Message-ID: <49FF3B82.3090404@inf.unibz.it>
Axel Polleres wrote:
> I think we are overshooting here and disagree that we should restrict
> the domain. If we want to adopt/fit with XPath/XQuery funcs and ops and
> that was the rationale we took, we should not redefine them. We had to
> do some twists already, since we don't have errors, but let's not make
> things worse/even more diverging. At some point we have to accept
> implementation dependence in built-ins and I think following
> XPath/XQuery keeps this within reasonable bounds.

This means the RIF language is ill-defined and there is no way to
implement it.

> 
> 
>>> I think these are crucial issues and we need to have at least an idea
>>> of where we want to go before DTB can go to last call. Otherwise, it
>>> will not be possible to make any RIF implementations.
> 
> How about if we simply mark functions/operators which have
> implementation dependence? But we shouldn't restrict them too much. (I
> think of something like the stars on a menu which mark some dishes as
> spicy)

They still need to be well-defined. Of course, we can make DTB
implementation-dependent, but then the I in RIF probably stands for Icky.

Jos

> 
> Axel
> 
> Sandro Hawke wrote:
>>> XPath functions and operators essentially says that each implementation
>>> can decide for itself which length for decimals it supports.
>>
>> As long as it's at least 16 digits
>> http://www.w3.org/TR/xmlschema11-2/#partial-implementation
>>
>>> For
>>> example, implementation A could support decimals of length 16, while
>>> implementation B supports decimals of length 20.  In addition, each
>>> implementation can use its own rounding algorithm for representing
>>> numbers that need larger decimals.  For example, implementation A could
>>> truncate numbers, while implementation B rounds them.
>>>
>>> This poses problems for us when defining things like casting functions
>>> and arithmetic operations.
>>> For example, according to XPath, casting the string
>>> "0.11111111111111111111111111111111111111111111111111111111111111111111111111
>>> 11111111111111111111111111"
>>> to a decimal results either in an error or a decimal of some length
>>> (given by the implementation) that is obtained from the number
>>> corresponding to the string by some arbitrary rounding algorithm.
>>>
>>> It is not possible for us to define this kind of behavior in DTB,
>>> because functions are defined as functions: you have some input values
>>> that define an output value.
>>> In addition, it is really bad for interchange, since some
>>> implementations do something, while other implementations do something
>>> else, and you get no warning.
>>> So, for casting, I propose to define the xs:decimal casting function
>>> such that the result of casting a string to a decimal is simply the
>>> number with the input string as lexical representation, and so we have
>>> no exception behavior.
>>>
>>> We might define conformance such that implementations only need to
>>> support decimals of a particular length.
>>
>> Am I going to be able to use an XPath number handling library for this?
>> Or would you be making me round numbers differently? 
>> I guess 0.6666666666666666 (17 digits) might reasonably round to
>> 0.6666666666666667 (16 digits), but that wouldn't conform to the
>> definition you're proposing...
>>
>> This functions-are-functions thing is worrying me, too.  This means list
>> "union", "intersect", and "except" need to be defined with stable
>> ordering, I think, which I believe means they're stuck with n^2
>> algorithms.  Is it really impossible to allow for non-determinism /
>> non-specification on these things?  I had been expecting the order of
>> lists returned from these operations to be undetermined (which would
>> allow an implementation to really use hash tables or b-trees behind the
>> scenes.)  Hmmm, I guess there are tricks one could still do -- add a
>> list-index field, and then sort the final result by that field -- but
>> ouch....
>>
>>> The problems with numeric functions are similar.  With addition,
>>> subtraction, and multiplication we run into the same problem.  Again, I
>>> propose to define the functions such that the output values are simply
>>> the decimals which are the result of the arithmetic operations and not
>>> from some implementation-dependent modification.
>>> With division it gets a bit more complicated.  For example, there is no
>>> decimal that can represent the result of dividing 1 by 3, because there
>>> are no infinite-length decimals. If we had owl:real we could still
>>> properly define the division function, although there is no syntactical
>>> representation for the result. I have two possible solutions for you:
>>> (1) We reintroduce owl:real and use it for the definition of
>>> numeric-divide (I think we need it only there).
>>> (2) We define the domain of numeric-divide such that only pairs of
>>> numbers a,b (if they are decimals) are included if a/b can be
>>> represented using a decimal. This means that if 1,3 are the arguments,
>>> the value of the function is not specified by DTB and is left up to the
>>> implementation.
>>>
>>> I think these are crucial issues and we need to have at least an idea of
>>> where we want to go before DTB can go to last call. Otherwise, it will
>>> not be possible to make any RIF implementations.
>>
>> FWIW, I don't think this is all that crucial.   *shrug*
>>
>>       - Sandro
>>
>>
> 
> 
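
To make the casting and division points in the quoted thread concrete, here is a minimal Python sketch. It is not taken from DTB or from XPath: the helper names (limited_cast, exact_cast, has_finite_decimal) are hypothetical, and "length" is assumed to mean significant digits. It shows two same-precision casts that disagree only because of the rounding mode, an exact cast with no exception behavior, and a test for whether a/b has a finite decimal representation (the domain restriction in option 2).

```python
from decimal import Decimal, Context, ROUND_DOWN, ROUND_HALF_EVEN
from fractions import Fraction


def limited_cast(lexical: str, digits: int, rounding: str) -> Decimal:
    """Cast as a hypothetical limited-precision implementation might:
    keep `digits` significant digits, using the given rounding mode."""
    return Context(prec=digits, rounding=rounding).create_decimal(lexical)


def exact_cast(lexical: str) -> Decimal:
    """Cast with no precision limit: the value is exactly the number
    that the lexical form denotes (the Decimal constructor never rounds)."""
    return Decimal(lexical)


def has_finite_decimal(a: Decimal, b: Decimal) -> bool:
    """True if a/b can be written as a finite decimal, i.e. the reduced
    denominator has no prime factors other than 2 and 5."""
    if b == 0:
        return False
    denominator = (Fraction(a) / Fraction(b)).denominator
    for prime in (2, 5):
        while denominator % prime == 0:
            denominator //= prime
    return denominator == 1


long_literal = "0." + "6" * 30

# Same input, same precision, different rounding: the results differ.
truncated = limited_cast(long_literal, 16, ROUND_DOWN)
rounded = limited_cast(long_literal, 16, ROUND_HALF_EVEN)
print(truncated, rounded, truncated == rounded)

# The exact cast is a function of its input alone.
print(exact_cast(long_literal))

# 1/4 is inside the restricted domain of option (2); 1/3 is not.
print(has_finite_decimal(Decimal(1), Decimal(4)))
print(has_finite_decimal(Decimal(1), Decimal(3)))
```

Under these assumptions the two limited casts return different values for the same string, which is the interchange problem described above, while the exact cast and the domain test depend only on their arguments.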

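On the worry in the quoted thread that a stable ordering for list "union", "intersect", and "except" forces n^2 algorithms: the sketch below, again in Python, keeps first-occurrence order while using hash sets for membership, so each operation runs in expected linear time. The function names and the exact semantics are assumptions made for illustration, not DTB's definitions: union and intersect drop duplicates, except keeps those of the first list, and elements are assumed hashable.

```python
from typing import Hashable, Iterable, List


def ordered_union(xs: Iterable[Hashable], ys: Iterable[Hashable]) -> List[Hashable]:
    """Items of xs followed by items of ys, duplicates removed,
    in order of first occurrence."""
    seen = set()
    result = []
    for item in list(xs) + list(ys):
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def ordered_intersect(xs: Iterable[Hashable], ys: Iterable[Hashable]) -> List[Hashable]:
    """Items of xs that also occur in ys, duplicates removed,
    in the order they first appear in xs."""
    members = set(ys)
    seen = set()
    result = []
    for item in xs:
        if item in members and item not in seen:
            seen.add(item)
            result.append(item)
    return result


def ordered_except(xs: Iterable[Hashable], ys: Iterable[Hashable]) -> List[Hashable]:
    """Items of xs that do not occur in ys, in their original order
    (duplicates within xs are kept)."""
    excluded = set(ys)
    return [item for item in xs if item not in excluded]


print(ordered_union([3, 1, 2], [2, 4, 1]))
print(ordered_intersect([3, 1, 2, 1], [1, 3]))
print(ordered_except([3, 1, 2, 1], [2]))
```

Because the output order is fixed by the input order, each of these is still a function in the sense discussed above: the same arguments always give the same list.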
-- 
+43 1 58801 18470        debruijn@inf.unibz.it

Jos de Bruijn,        http://www.debruijn.net/
----------------------------------------------
Many would be cowards if they had courage
enough.
  - Thomas Fuller
Received on Monday, 4 May 2009 19:02:14 UTC