RE: ACTION-24: aggregate functions with multiple answers



> -----Original Message-----
> From: Gregory Williams [mailto:greg@evilfunhouse.com]
> Sent: 11 May 2009 17:37
> To: Seaborne, Andy
> Cc: SPARQL Working Group
> Subject: Re: ACTION-24: aggregate functions with multiple answers
> 
> A couple of questions...
> 
> On May 11, 2009, at 10:36 AM, Seaborne, Andy wrote:
> 
> > Choices for dealing with this include:
> >
> > 1/ The value space for MIN is the value space of the first
> > encountered datatype and everything incompatible is ignored.
> >
> > 2/ The value space has to be given - there is no single "MIN"
> > operation:
> > e.g. MIN(xsd:dateTime, ?x)
> 
> In what way is this different from the explicit casting approach
> previously mentioned (MIN(xsd:date(?x)))? Do you imagine that they
> would handle differently values for which the casting fails?

My idea was that the URI indicates the value space, and only values in that value space would be considered.

Casting, by contrast, can turn a string into a dateTime, for example.

(This is technically a bit of a modelling error - that is the URI for the datatype, not for the value space concept.  However, a datatype can imply only one value space because a datatype is a set of values (the value space), a set of lexical representations and a set of facets.)
 
http://www.w3.org/TR/xmlschema-2/#datatype
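
To make the distinction concrete, here is a minimal sketch in Python rather than SPARQL. It rests on assumptions of mine: literals are modelled as (lexical form, datatype URI) pairs, the function names min_in_value_space and min_with_cast are hypothetical, and a failed cast is simply skipped (how errors should behave inside an aggregate is exactly what is undecided).

```python
from datetime import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption: a literal is a (lexical form, datatype URI) pair.
literals = [
    ("2009-05-11T17:37:00", XSD + "dateTime"),
    ("2009-05-12T15:05:05", XSD + "dateTime"),
    ("2009-01-01T00:00:00", XSD + "string"),  # a string that looks like a dateTime
    ("not a date", XSD + "string"),
]

def min_in_value_space(datatype, lits):
    """Option 2: only literals already typed with `datatype` are considered."""
    values = [datetime.fromisoformat(lex) for lex, dt in lits if dt == datatype]
    return min(values) if values else None

def min_with_cast(lits):
    """Casting: try to turn every lexical form into a dateTime."""
    values = []
    for lex, _dt in lits:
        try:
            values.append(datetime.fromisoformat(lex))
        except ValueError:
            pass  # assumption: a cast error means the row contributes nothing
    return min(values) if values else None
```

With this data, min_in_value_space(XSD + "dateTime", literals) ignores both strings, whereas min_with_cast(literals) picks up the string that happens to parse as a dateTime and returns the 2009-01-01 value.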



A bit more thinking, and chatting to Steve on #sparql, throws up some detail around numbers.

Numbers obey the type promotion rules, such as:
  integer+integer => integer 
  integer+decimal => decimal
  ...+double => double

It would be nice if:

MIN of 1, 2.3, 9e0 were "1"^^xsd:integer
MIN of 1, 0.3, 9e0 were "0.3"^^xsd:decimal
MIN of 1, 0.3, 9e-4 were "9e-4"^^xsd:double

But (and aside from string issues) casting is forcing the type.

MIN(xsd:integer(?x)) fails on a decimal, but MIN(xsd:decimal(?x)) creates a new term for an integer and the result is always a decimal.

Steve suggested MIN_NUMBER(?x) could be defined so that the result has the datatype of the minimum number found, which would be nicer for users.
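
A rough sketch of that idea in Python, under assumptions of mine: literals are (lexical form, datatype URI) pairs, the helper names to_exact, min_number and min_cast_decimal are hypothetical, and values are compared exactly via fractions rather than by any rule the working group has agreed.

```python
from decimal import Decimal
from fractions import Fraction

XSD = "http://www.w3.org/2001/XMLSchema#"
NUMERIC = {XSD + "integer", XSD + "decimal", XSD + "double"}

def to_exact(lex, datatype):
    """Map a numeric literal to an exact value so mixed types compare by value."""
    if datatype == XSD + "double":
        return Fraction(float(lex))
    return Fraction(Decimal(lex))

def min_number(lits):
    """Return the smallest numeric literal unchanged, keeping its own datatype."""
    numeric = [(lex, dt) for lex, dt in lits if dt in NUMERIC]
    if not numeric:
        return None
    return min(numeric, key=lambda lit: to_exact(*lit))

def min_cast_decimal(lits):
    """Contrast: casting everything to decimal always yields a decimal."""
    values = [Decimal(lex) for lex, dt in lits if dt in NUMERIC]
    return (str(min(values)), XSD + "decimal") if values else None
```

On the three examples above, min_number gives back the integer "1", the decimal "0.3" and the double "9e-4" respectively, while min_cast_decimal on the first example returns a decimal even though the smallest input was an integer.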

> 
> > 3/ There is one answer per group for each datatype encountered in
> > the group.  This means multiple rows per group.
> 
> Could the same functionality be had by allowing a DATATYPE(?x)
> expression in the GROUP BY clause?

That is clever.  That would be a good way of formalizing it: implicitly adding DATATYPE(?x) to the GROUP BY and then projecting it away.  I like this because much of the time there will only be one answer, so I would like the common cases to be natural.
Numbers might need some care (again) - it's really value space but with type promotion rules.
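
As a sketch of what "implicitly group by datatype, then project it away" could mean, here it is in Python. The row shape (group key, value, datatype label) and the function name min_grouped_by_datatype are my own assumptions for illustration, and numeric type promotion is deliberately left out.

```python
from collections import defaultdict

def min_grouped_by_datatype(rows):
    """rows: iterable of (group key, value, datatype label).

    Buckets by (group, datatype), takes MIN within each bucket, then drops
    the datatype from the output, so a group with mixed datatypes yields
    several rows.
    """
    buckets = defaultdict(list)
    for group, value, datatype in rows:
        buckets[(group, datatype)].append(value)
    return [(group, min(values)) for (group, _datatype), values in buckets.items()]

rows = [
    ("a", 3, "xsd:integer"),
    ("a", 1, "xsd:integer"),
    ("a", "x", "xsd:string"),
    ("b", 7, "xsd:integer"),
]
result = min_grouped_by_datatype(rows)
```

Group "b" has a single datatype and so a single answer, which is the common case; group "a" gets one row per datatype encountered.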

Fortunately, I can't see a defined way for any custom datatype system to bring the same type promotion rules as well as the type system, but I also can't see why a custom system can't introduce type promotion rules ad hoc.


Digressing:


Bijan - what about owl:rational and owl:real?

What's MIN of "1"^^xsd:integer and "1/3"^^owl:rational?

If owl:rational is understood, I'd expect (hope!) "1/3"^^owl:rational but I can't find any type promotion rules to appeal to.

What about when there is overlap with xsd:decimal, e.g. "1/2"^^owl:rational?

(Same for addition etc etc - are there type promotion rules?).
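
For what it's worth, comparing by value is the easy part. A small Python illustration, assuming owl:rational values are treated as exact fractions (the variable names are just for this example):

```python
from decimal import Decimal
from fractions import Fraction

one = Fraction(1)                    # stands in for "1"^^xsd:integer
third = Fraction("1/3")              # stands in for "1/3"^^owl:rational
half = Fraction("1/2")               # stands in for "1/2"^^owl:rational
half_dec = Fraction(Decimal("0.5"))  # stands in for "0.5"^^xsd:decimal

smallest = min(one, third)           # the rational is the smaller value
same_value = (half == half_dec)      # the two value spaces overlap here
```

This only settles which value is smaller or equal; it says nothing about which datatype the answer should carry, which is where promotion rules would be needed.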

 Andy

> 
> thanks,
> .greg

Received on Tuesday, 12 May 2009 15:05:05 UTC