From: Peter McIlroy <PeterM@nimble.com>

Date: Mon, 5 Nov 2001 15:35:08 -0800

Message-ID: <6514DE680737F449885673E7895ED25001211B13@zeus.nimble.com>

To: "'www-xml-query-comments@w3.org'" <www-xml-query-comments@w3.org>

Cc: Peter McIlroy <PeterM@nimble.com>, Denise Draper <ddraper@nimble.com>, "'simeon@research.bell-labs.com'" <simeon@research.bell-labs.com>

I'm forwarding this to the newsgroup on the advice of Jerome Simeon.

The definition of XML arithmetic and aggregate functions in the XQuery
proposal is troubling. In particular, the current specifications are for:

    double SUM [ T ]
    double AVG [ T ]

for all types T.

After having my head straightened by the QL compiler people here at Nimble,
I believe that the best model for aggregate functions and operators is that
all aggregates and functions will have the same type as their arguments:

    T SUM [ T ]
    T AVG [ T ]
    T MIN [ T ]   (as it stands)
    T MAX [ T ]   (as it stands)

There are several reasons for this:

(1) All aggregate functions are treated uniformly with other numeric
    operators.

(2) It is SQL compatible, so it satisfies the principle of least surprise.

(3) For DECIMAL types, the domain agrees with the range. The current
    proposal, that avg and sum always be type double, may cause serious
    roundoff errors for DECIMAL numbers. This may cause some concern in
    the financial industry.

(4) It is a general model: if the result is desired in some other type,
    it can be requested explicitly, as in SQL:

    CREATE TABLE test(fld1 INTEGER);
    INSERT INTO test VALUES(...);
    SELECT AVG(real(fld1)) from test;

Additionally, the proposed rounding methods may prove insufficiently
general. The proposal is for:

    INT CEILING(T);
    INT FLOOR(T);
    INT ROUND(T);

The SQL standard is more like:

    T CEILING(T)               -- T is DOUBLE or DECIMAL (to avoid overflow)
    T FLOOR(T)                 -- T is DOUBLE or DECIMAL
    T ROUND(T, int precision)  -- T is any numeric type

where precision is the position at which to round. Default is 0.
For integers, only negative precision has any effect.
You may want to include TRUNC in this list, or to follow a model more like
IEEE:

    T ROUND(T arg, int precision = 0, string direction = 'nearest')

where direction can be:

    'negative'  (towards -Inf)
    'positive'  (towards +Inf)
    'tozero'    (truncate towards zero)
    'nearest'   (round to nearest, with .5 treated in some uniform manner,
                 either always away from zero, towards zero, or IEEE
                 towards even. The current model, towards positive
                 infinity, does not agree with)

See: "man fpsetround" on Solaris for more details of IEEE rounding.

Sincerely,
Peter McIlroy
pmcilroy@nimble.com

Peter McIlroy writes:
> Thanks.
>
> There's still some problem.
>
> I don't think that the
>
>     SUM [ DECIMAL ] --> double
> or
>     AVG [ DECIMAL ] --> double
>
> is a safe coercion in types.
>
> For example, AVG[.2, .2] = .2, not some floating point approximation to .2.
>
> Also, if you treat SUM [DECIMAL] as a floating point, the entire financial
> database community will be unhappy.
>
> I've been talking with the XMLQL compiler people here, who have been working
> on ways to make xml-based views on disparate data sources.
>
> They say that the best way to go is that the aggregation functions
> take the same type as their arguments.
>
> They recommend that all functions be templatized as follows:
>
>     <T> T MAX [ T ]
>     <T> T SUM [ T ]
>     <T> T AVG [ T ]
>     <T> T SUM [ T ]
>
> There's more flexibility and simplicity in making the functions
> polymorphic than in requiring them to have only one return type.
> Then if you do want to compute a sum of integers as a double, you do
> the cast on the column value, not on the result:
>
>     SUM [ double(integer-column) ]
>
> Also, I am pleased to see that you are proposing to use David Gay's
> improved version of the Steele & White stopping criteria for conversion of
> floating point numbers to decimal. It really is the only right solution
> for this problem.
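
To make the roundoff concern in the quoted note concrete, here is a minimal Python sketch of aggregates that return the type of their arguments. The names typed_sum and typed_avg are hypothetical, Python's Decimal stands in for DECIMAL and float for double, and none of this is taken from the XQuery drafts.

```python
from decimal import Decimal


def typed_sum(values):
    """T SUM[T]: accumulate in the input type, with no coercion to float."""
    it = iter(values)
    total = next(it)  # assumes at least one value; empty input is not handled
    for v in it:
        total = total + v
    return total


def typed_avg(values):
    """T AVG[T] for Decimal or float inputs (the result keeps the input type)."""
    values = list(values)
    return typed_sum(values) / len(values)


# Decimal in, Decimal out: ten dimes sum to exactly one.
print(typed_sum([Decimal("0.1")] * 10))

# The same data as doubles accumulates binary roundoff.
print(typed_sum([0.1] * 10))

# AVG[.2, .2] stays exactly .2 when the aggregate keeps the DECIMAL type.
print(typed_avg([Decimal("0.2"), Decimal("0.2")]))

# If a double result is wanted, cast the inputs, not the result.
print(typed_sum(float(i) for i in [1, 2, 3]))
```

The last call mirrors SUM [ double(integer-column) ]: the caller chooses the result type by converting the column values before aggregating.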
>
> -----Original Message-----
> From: Jerome Simeon [mailto:simeon@research.bell-labs.com]
> Sent: Friday, November 02, 2001 5:29 PM
> To: Peter McIlroy
> Cc: 'simeon@research.bell-labs.com'
> Subject: Re: XQuery semantics: aggregations
>
> Hi Peter,
>
> Most of the arithmetic operations are now defined as a part of the
> Functions and Operators document for XQuery:
>
>     XQuery 1.0 and XPath 2.0 Functions and Operators Version 1.0
>     http://www.w3.org/TR/xquery-operators/
>
> Which means those from the semantics document should probably be
> revisited.
>
> Let me know if the F&O document addresses your issues.
>
> Regards,
> - Jerome

Received on Monday, 5 November 2001 18:34:28 UTC
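
The type-preserving ROUND(arg, precision, direction) signature suggested earlier in this message can be sketched in Python with the decimal module. The function name round_to and the mode table are hypothetical, and 'nearest' is assumed here to mean IEEE round-half-to-even, which is only one of the uniform tie rules the message allows.

```python
from decimal import (
    Decimal,
    ROUND_CEILING,
    ROUND_DOWN,
    ROUND_FLOOR,
    ROUND_HALF_EVEN,
)

# Direction names from the message mapped to decimal rounding modes.
# 'nearest' -> half-even is an assumption, not something the message fixes.
_MODES = {
    "negative": ROUND_FLOOR,    # towards -Inf
    "positive": ROUND_CEILING,  # towards +Inf
    "tozero": ROUND_DOWN,       # truncate towards zero
    "nearest": ROUND_HALF_EVEN,
}


def round_to(arg, precision=0, direction="nearest"):
    """Round a Decimal at the given decimal position and return a Decimal."""
    quantum = Decimal(1).scaleb(-precision)  # 10 ** -precision
    return arg.quantize(quantum, rounding=_MODES[direction])


print(round_to(Decimal("2.5")))                        # tie goes to the even neighbour
print(round_to(Decimal("-2.5"), direction="positive"))
print(round_to(Decimal("-2.7"), direction="tozero"))
print(round_to(Decimal("3.14159"), precision=2))
print(round_to(Decimal("1250"), precision=-2))         # negative precision rounds left of the point
```

Because the result stays a Decimal, rounding cannot overflow an integer type, which is the concern behind T CEILING(T) and T FLOOR(T).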
