Re: Views on the outcomes of F2F from Steve Harris on 2009-11-10 (public-rdf-dawg@w3.org from October to December 2009)

From: Steve Harris <steve.harris@garlik.com>
Date: Tue, 10 Nov 2009 17:57:36 +0000
To: Andy Seaborne <andy.seaborne@talis.com>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <4CBA98A8-C2B7-4E21-B72A-4DDB6F9EFFBE@garlik.com>
Responses to stuff where I have feelings inline, others snipped.

On 10 Nov 2009, at 17:20, Andy Seaborne wrote:
>
> I prefer HAVING because of familiarity with SQL.
>
> Having both is acceptable.

I strongly prefer having just one. As we've already used FILTER, and
the may-use-aggregate-functions one appears somewhere else
syntactically, I feel that just using FILTER again is better, but I
could be persuaded otherwise.

I think WHERE being optional and isURI/isIRI should be learning  
experiences here.

> > **  DISTINCT in aggregate functions
> >
> > Consensus on allowing DISTINCT with multiple arguments to aggregate
> > functions. DISTINCT in this case passes just the DISTINCT tuples
> > into the aggregate function (for each group).
>
> I'm unclear why it should be allowed in SUM or AVG.  Is there a use  
> case?

Don't know, I think it was mostly for consistency. SQL allows it
anywhere. MIN/MAX are the screw cases, as DISTINCT does nothing there.
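
To make the SUM/AVG versus MIN/MAX point concrete, here is a minimal sketch in Python rather than SPARQL. It assumes a single group of plain numeric values; the helper name `aggregate` is made up for illustration and is not from any proposal.

```python
def aggregate(func, values, distinct=False):
    """Apply func to one group's values, optionally de-duplicating first."""
    if distinct:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        values = list(dict.fromkeys(values))
    return func(values)


group = [1, 2, 2, 3, 3, 3]

# DISTINCT changes the answer for SUM (and likewise for AVG or COUNT).
sum_all = aggregate(sum, group)                       # 14
sum_distinct = aggregate(sum, group, distinct=True)   # 6

# DISTINCT cannot change MIN or MAX: duplicates never affect an extreme.
max_all = aggregate(max, group)
max_distinct = aggregate(max, group, distinct=True)
min_all = aggregate(min, group)
min_distinct = aggregate(min, group, distinct=True)
```

So allowing DISTINCT on MIN/MAX is harmless but redundant, which fits the "mostly for consistency" reading.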

> > **  ISSUE-16: Mixed datatypes with built-in aggregates
> >
> > Consensus that MIN/MAX should use same semantics as ORDER BY,
> > with parts (e.g. ordering xsd:string and xsd:dateTime) being
> > undefined/implementation defined.
>
> I think this will get confusing with mixed data "1", "9", 1, 2, 3  
> but may be acceptable.  (Multivaluespace handling is still my  
> preferred design.)

There was a strong preference for aggregate functions returning scalar  
values.

> If eval failures are "not in group", casting is OK but the document
> must talk about this.

I think this was discussed, and I think the consensus was that they
are in the group, but I'm really, really not sure.

> > Consensus that SUM/AVG should use same semantics as +
>
> Clarification: errors not in a group means that what would be
>
> 1 + error + 2 => 3
>
> which is not the same as +

Yep, which I think is why they are in the group, and why COALESCE is  
important.
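
A rough sketch of the two readings, in Python rather than SPARQL. Everything here is an assumption for illustration: `EvalError`, `plus`, `coalesce` and the two `sum_*` helpers are hypothetical stand-ins, not text from any draft.

```python
from functools import reduce


class EvalError:
    """Stand-in for a SPARQL expression evaluation error (hypothetical)."""


ERROR = EvalError()


def plus(a, b):
    # Like "+": if either operand is an error, the result is an error.
    if isinstance(a, EvalError) or isinstance(b, EvalError):
        return ERROR
    return a + b


def sum_errors_in_group(values):
    # Errors stay in the group, so SUM behaves like repeated "+".
    return reduce(plus, values, 0)


def sum_errors_dropped(values):
    # Errors are "not in group": they are skipped before summing.
    return sum(v for v in values if not isinstance(v, EvalError))


def coalesce(*args):
    # Return the first argument that is not an error.
    for arg in args:
        if not isinstance(arg, EvalError):
            return arg
    return ERROR


group = [1, ERROR, 2]
```

With errors dropped, `sum_errors_dropped(group)` gives 3, which is not what "+" would do. With errors kept, `sum_errors_in_group(group)` is an error, and the query author has to opt in to skipping them, e.g. `sum_errors_in_group([coalesce(v, 0) for v in group])` gives 3.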

> > **  Syntax for expressions in SELECT list
> >
> > General lack of satisfaction with either:
> >
> >     * Requiring commas if a projection uses at least one expression
> >     * Wrapping expressions and aliases with parentheses (brackets)
>
> Would like to allow optional commas everywhere (SELECT, GROUP BY,  
> ORDER BY).

I don't think this is a particularly good idea. Having gone without
commas in the first place we should just stick to it, I think. If we
could rewind time, it would be different.

> > **  Sub-asks and sub-selects in FILTER
> >
> > General consensus (kasei, axel, steveh, leef) to avoid
> > the complexity of any subqueries in FILTERs.
>
> Agreed - the meaning of patterns (scoping of free variables) would  
> need join-like semantics and is complex.  The lack of scalar
> subSELECTs will be a potential problem area but is
> mitigated by having named variables in SPARQL.
>
> You can place the scalar select just before the FILTER and AS the
> result into a variable.  This is not an equivalence; the query
> pattern may be slightly different, but you can get the effect as far
> as I can determine.
>
> Sub-Ask is not the same as (NOT) EXISTS because EXISTS isn't join-ed  
> with other results.

I think it was included in the discussion. The objection was largely  
around the triple-in-FILTERs syntax.

> > **  ISSUE-39: Variable scope of alias variables
> >
> > Consensus that variables on the right-hand side of "AS" (alias
> > variables) are not in scope for the rest of the query (including
> > projected expressions), but not including outer queries of course.
>
> Disagree - this is an unnecessary restriction and results in needing
> additional nesting of SELECTs just to reuse an expression.

This was based on what existing systems do.

> > 4. we'll have one update statement, DELETE ... INSERT ...
> > WHERE ..., where one of DELETE or INSERT may be omitted, and WHERE
> > is optional, and multiple of these may be combined in a string using
> > ";" as the separator.
>
> I now prefer DELETE WHERE {}, that is, the pattern becomes the  
> template.

Yes, me too.
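
As a small illustration of "the pattern becomes the template", here is a sketch in Python that expands the short form into the long one. It assumes the shorthand means exactly "use the same pattern as both the DELETE template and the WHERE clause"; the function name and the example IRI are hypothetical.

```python
def expand_delete_where(pattern: str) -> str:
    """Expand DELETE WHERE { pattern } into the explicit two-part form.

    Assumption: the shorthand reuses the pattern verbatim as the template.
    """
    pattern = pattern.strip()
    return f"DELETE {{ {pattern} }} WHERE {{ {pattern} }}"


pattern = "?s <http://example.org/p> ?o"
short_form = f"DELETE WHERE {{ {pattern} }}"
long_form = expand_delete_where(pattern)
```

Here `long_form` is `DELETE { ?s <http://example.org/p> ?o } WHERE { ?s <http://example.org/p> ?o }`, i.e. the pattern is simply written twice.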

> This also means ";" is unnecessary.  If a syntax requires the use of  
> ";" to distinguish two different forms, then I would be very worried  
> (it's going to be error prone).
>
> Optional ";" is tolerable for convenience but in Turtle it already
> has a meaning as an abbreviation.

I feel differently. I think optional bits of syntax, which mean  
nothing, are a very bad idea (with the benefit of hindsight), but  
would be happy to see it be mandatory, if it aids readability.

Looking back I feel that the lack of commas, and making WHERE optional  
were the two biggest syntax mistakes in SPARQL 1.0.

- Steve

-- 
Steve Harris, CTO, Garlik Limited
2 Sheen Road, Richmond, TW9 1AE, UK
+44(0)20 8973 2465  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10  
9AD
Received on Tuesday, 10 November 2009 17:58:11 UTC