RE: [F&O] avg()

I agree with you, this is a mess. Some of the problems also apply to sum().
I thought we had fixed these, but it appears not.

Splitting the functionality into multiple functions, one per data type,
could certainly be a way forward. For example, it would solve the problem
that even when the argument of sum() is known statically to be of type
xs:yearMonthDuration*, the result can still be the xs:double value zero if
the argument is empty.

This might suggest providing the functions:

sum(numeric*) -> numeric
sum-yearMonthDurations(yearMonthDuration*) -> yearMonthDuration
sum-dayTimeDurations(dayTimeDuration*) -> dayTimeDuration
avg(numeric*) -> numeric

These would give much better static typing, and make the handling of
untypedAtomic values much more predictable.
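
To illustrate the point about the empty sequence, here is a rough model in
Python rather than XPath. The names sum_numeric and sum_day_time_durations
are hypothetical stand-ins for the functions proposed above, and
datetime.timedelta is only an approximation of dayTimeDuration. It is a
sketch of the idea, not a proposed definition.

```python
from datetime import timedelta
from typing import Iterable, Union

Number = Union[int, float]


def sum_numeric(values: Iterable[Number]) -> Number:
    """Model of sum(numeric*): an empty input yields the numeric zero."""
    total: Number = 0
    for value in values:
        total += value
    return total


def sum_day_time_durations(values: Iterable[timedelta]) -> timedelta:
    """Model of sum-dayTimeDurations: an empty input yields a zero
    duration, never a number, so the result type is always a duration."""
    total = timedelta(0)
    for value in values:
        total += value
    return total
```

Because each function fixes its own zero value, the result type no longer
depends on whether the argument happens to be empty at run time.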

I think we could leave the user to find an average duration "by hand"...

I don't think we should provide the ability to find an "average date" or
"average time". You can always turn them into durations by subtracting an
origin date/time, get the average duration, and then add it back.
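
The subtract-an-origin approach can be sketched as follows, again in Python
as a stand-in for XPath. The function name average_date and the default
origin of 2000-01-01 are assumptions made purely for illustration; any fixed
origin gives the same answer, apart from the whole-day truncation noted in
the comments.

```python
from datetime import date, timedelta
from typing import Sequence


def average_date(dates: Sequence[date], origin: date = date(2000, 1, 1)) -> date:
    """Average dates by averaging their offsets from a fixed origin."""
    if not dates:
        raise ValueError("cannot average an empty sequence of dates")
    # Step 1: turn each date into a duration relative to the origin.
    offsets = [d - origin for d in dates]
    # Step 2: average the durations (durations support addition and
    # division by an integer, which dates themselves do not).
    mean_offset = sum(offsets, timedelta(0)) / len(offsets)
    # Step 3: add the average duration back to the origin. Python's date
    # arithmetic uses only the whole-day part of the duration here.
    return origin + mean_offset
```

For example, average_date([date(2003, 5, 17), date(2003, 5, 21)]) gives
date(2003, 5, 19).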

Michael Kay


> -----Original Message-----
> From: Michael Brundage [mailto:xquery@attbi.com] 
> Sent: 17 May 2003 21:12
> To: public-qt-comments@w3.org
> Subject: [F&O] avg()
> 
> 
> 
> The definition of avg() contains some statements that are 
> contradictory and others that are unclear (at least to me).  
> For example,
> 
> (0)
> "$srcval must contain items of a single type or one of its subtypes."
> 
> This is followed by a description of (for example) how to promote 
> xdt:untypedAtomic to double.  Clearly avg((xs:double(1.0), 
> xdt:untypedAtomic(2.0))) is valid, but the quoted sentence 
> above would rule it out.
> 
> (1)
> "In addition, the type must support addition and division by 
> an integer. If date/time values do not have a timezone, the 
> implicit timezone provided by the evaluation context is added 
> and the adjusted normalized value is used in the calculation."
> 
> I see several problems with these two sentences:
>   (a) xs:date and xs:time do not support addition (with 
> themselves), but the second sentence implies sequences of 
> dates and times are allowed. Is avg((xs:time('01:01:01'), 
> xs:time('03:03:03'))) allowed? What about avg(xs:time('01:01:01'))?
> 
>   (b) What about avg((xs:date('1972-05-13'), 
> xdt:yearMonthDuration('P31Y')))? These two support addition 
> (with each other) and division by an integer.
> 
> (2)
> Are the typing rules described here applied to the static or 
> runtime types of the arguments? For example, is the 
> expression avg((if ($someRuntimeCondition) then 
> xdt:yearMonthDuration('P1Y') else xdt:dayTimeDuration('PT1H'),
>      if ($someOtherRuntimeCondition) then 
> xdt:yearMonthDuration('P3Y') else
> xdt:dayTimeDuration('PT3H')))
> always a static type error, or will it succeed when 
> $someRuntimeCondition=$someOtherRuntimeCondition and fail otherwise?
> 
> 
> These confusions could be avoided if there were separate 
> average functions: one for the numeric types (backwards 
> compatible with XPath 1.0, non-numeric items in the sequence 
> are promoted to a numeric type); another function (or set of 
> functions) for date/time/dateTime/dayTimeDuration/
> yearMonthDuration values, say
>   avg-date(xs:date*) as xs:date?
>   avg-yearMonthDuration(xdt:yearMonthDuration*) as xdt:yearMonthDuration?
> etc.
> 
> 
> Best wishes,
> 
> Michael Brundage
> Writing as
> Author, "XQuery: The XML Query Language" (Addison-Wesley, to appear 2003)
> Co-author, "Professional XML Databases" (Wrox Press, 2000)
> 
> not as
> Technical Lead
> Common Query Runtime/XML Query Processing
> WebData XML Team
> Microsoft
> 

Received on Wednesday, 21 May 2003 14:26:44 UTC