Re: [DTB] issues with numeric casting/built-in functions from Jos de Bruijn on 2009-05-04 (public-rif-wg@w3.org from May 2009)

From: Jos de Bruijn <debruijn@inf.unibz.it>
Date: Mon, 04 May 2009 20:20:04 +0200
To: Sandro Hawke <sandro@w3.org>
CC: RIF <public-rif-wg@w3.org>
Message-ID: <49FF31D4.7090709@inf.unibz.it>
Sandro Hawke wrote:
>> XPath functions and operators essentially says that each implementation
>> can decide for itself which length for decimals it supports.
> 
> As long as it's at least 16 digits
> http://www.w3.org/TR/xmlschema11-2/#partial-implementation
> 
>> For
>> example, implementation A could support decimals of length 16, while
>> implementation B supports decimals of length 20.  In addition, each
>> implementation can use its own rounding algorithm for representing
>> numbers that need larger decimals.  For example, implementation A could
>> truncate numbers, while implementation B rounds them.
>>
>> This poses problems for us when defining things like casting functions
>> and arithmetic operations.
>> For example, according to XPath, casting a string
>> "0.11111111111111111111111111111111111111111111111111111111111111111111111111
>> 11111111111111111111111111"
>> to a decimal results either in an error or a decimal of some length
>> (given by the implementation) that is obtained from the number
>> corresponding to the string by some arbitrary rounding algorithm.
>>
>> It is not possible for us to define this kind of behavior in DTB,
>> because functions are defined as functions: you have some input values
>> that define an output value.
>> In addition, it is really bad for interchange, since some
>> implementations do something, while other implementations do something
>> else, and you get no warning.
>> So, for casting, I propose to define the xs:decimal casting function
>> such that the result of casting a string to a decimal is simply the
>> number with the input string as lexical representation, and so we have
>> no exception behavior.
>>
>> We might define conformance such that implementations only need to
>> support decimals of a particular length.
> 
> Am I going to be able to use an XPath number handling library for this?

I guess there is no such thing, because XPath does not tell you what to do.

> Or would you be making me round numbers differently?  

Again, XPath does not tell you how to round.

> 
> I guess 0.66666666666666666 (17 digits) might reasonably round to
> 0.6666666666666667 (16 digits), but that wouldn't conform to the
> definition you're proposing...

Indeed. And XPath does not tell you what to do at all.
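
To make the difference concrete, here is a minimal sketch using Python's decimal module. It assumes a 17-fractional-digit input and two hypothetical implementations that each keep 16 fractional digits, one truncating and one rounding; neither policy is prescribed by XPath or DTB.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

# Input with 17 fractional digits, one more than the assumed 16-digit limit.
lexical = "0.66666666666666666"

# Exact cast: the value is precisely the number the string denotes.
exact = Decimal(lexical)

# Two hypothetical implementations limited to 16 fractional digits.
sixteen_places = Decimal("1e-16")
truncated = exact.quantize(sixteen_places, rounding=ROUND_DOWN)
rounded = exact.quantize(sixteen_places, rounding=ROUND_HALF_EVEN)

print(exact)      # 0.66666666666666666
print(truncated)  # 0.6666666666666666
print(rounded)    # 0.6666666666666667
```

All three results differ, so the same rule would behave differently depending on which implementation runs it.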

> 
> This functions-are-functions thing is worrying me, too.  This means list
> "union", "intersect", and "except" need to be defined with stable
> ordering, I think, which I believe means they're stuck with n^2
> algorithms.  Is it really impossible to allow for non-determinism /
> non-specification on these things?  I had been expecting the order of
> lists returned from these operations to be undetermined (which would
> allow an implementation to really use hash tables or b-trees behind the
> scenes.)  Hmmm, I guess there are tricks one could still do -- add a
> list-index field, and then sort the final result by that field -- but
> ouch....
> 
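
For what it is worth, a deterministic order does not necessarily rule out hash tables. The following is only a sketch, assuming that list items are hashable and that "stable" means first-occurrence order; the function name stable_union is hypothetical and not part of DTB.

```python
from itertools import chain


def stable_union(xs, ys):
    """Union of two lists, keeping each item at its first occurrence."""
    seen = set()      # hash set used only for membership tests
    result = []       # output list carries the deterministic order
    for item in chain(xs, ys):
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(stable_union([3, 1, 3, 2], [2, 4, 1]))  # [3, 1, 2, 4]
```

Each item is visited once, so the expected cost is linear in the total length of the two lists.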
>> The problems with numeric functions are similar.  With addition,
>> subtraction, and multiplication we run into the same problem.  Again, I
>> propose to define the functions such that the output values are simply
>> the decimals which are the result of the arithmetic operations and not
>> from some implementation-dependent modification.
>> With division it gets a bit more complicated.  For example, there is no
>> decimal that can represent the result of dividing 1 by 3, because there
>> are no infinite-length decimals. If we had owl:real we could still
>> properly define the division function, although there is no syntactical
>> representation for the result. I have two possible solutions for you:
>> (1) We reintroduce owl:real and use it for the definition of
>> numeric-divide (I think we need it only there).
>> (2) We define the domain of numeric-divide such that only pairs of
>> numbers a,b (if they are decimals) are included if a/b can be
>> represented using a decimal. This means that if 1,3 are the arguments,
>> the value of the function is not specified by DTB and is left up to the
>> implementation.
>>
>> I think these are crucial issues and we need to have at least an idea of
>> where we want to go before DTB can go to last call. Otherwise, it will
>> not be possible to make any RIF implementations.
> 
> FWIW, I don't think this is all that crucial.   *shrug*

Well, if things are not defined, people have no idea what to implement.
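
As an illustration of option (2), here is a rough sketch of how one could test whether a pair of decimals falls inside the restricted domain of numeric-divide. It relies on the fact that a quotient has a finite decimal expansion exactly when its denominator in lowest terms has no prime factors other than 2 and 5. The function name in_divide_domain is hypothetical, and treating a zero divisor as outside the domain is an assumption.

```python
from decimal import Decimal
from fractions import Fraction


def in_divide_domain(a: Decimal, b: Decimal) -> bool:
    """True if a/b is exactly representable as a finite decimal."""
    if b == 0:
        return False  # assumption: division by zero is outside the domain
    # Fraction normalises to lowest terms with a positive denominator.
    den = (Fraction(a) / Fraction(b)).denominator
    for p in (2, 5):
        while den % p == 0:
            den //= p
    return den == 1


print(in_divide_domain(Decimal(1), Decimal(4)))  # True  (0.25)
print(in_divide_domain(Decimal(1), Decimal(3)))  # False (0.333...)
```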


Jos

> 
>       - Sandro

-- 
+43 1 58801 18470        debruijn@inf.unibz.it

Jos de Bruijn,        http://www.debruijn.net/
----------------------------------------------
Many would be cowards if they had courage
enough.
  - Thomas Fuller
Received on Monday, 4 May 2009 18:20:56 UTC