- From: Andy Seaborne <andy.seaborne@talis.com>
- Date: Tue, 10 Nov 2009 17:20:04 +0000
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Here is my initial take on what appears to have been a successful
face-to-face meeting. A lot has been moved forward.
Lee asked for specific issues to be raised one per email thread so
please change the subject if you reply to anything specifically.
This message is my pass over all the points. I'll pull out specific
issues if needed later.
I reserve the right to change my mind :-)
Andy
From notes
http://www.w3.org/2009/sparql/wiki/F2F2_Issue_Discussions
and resolutions
http://www.w3.org/2009/sparql/meeting/2009-11-03
Day 1:
> ISSUE-11: Implicit grouping
>
> Consensus on prohibiting projecting variables/functions on
> variables that are not included in the group by clause.
Agreed.
> ** ISSUE-12: HAVING vs. FILTER as keyword for limiting
> aggregate results
>
> General consensus in favor of using "FILTER" as the keyword,
> with bglimm preferring "HAVING".
I prefer HAVING because of familiarity with SQL.
Having both is acceptable.
> ** DISTINCT in aggregate functions
>
> Consensus on allowing DISTINCT with multiple arguments to aggregate
> functions. DISTINCT in this case passes just the DISTINCT tuples
> into the aggregate function (for each group).
I'm unclear why it should be allowed in SUM or AVG. Is there a use case?
We are already handling * differently by aggregate and DISTINCT seems
to only really mean anything there. Are there specific motivating use cases?
Is DISTINCT allowed in custom aggregates? If so, they have different
syntax.
I propose that DISTINCT is not allowed for custom aggregates. An
aggregate can choose to do that operation as part of its definition but
DISTINCT and not-DISTINCT forms are two different URIs to name the aggregate.
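A minimal Python sketch of the DISTINCT behaviour described in the minutes: the argument tuples are de-duplicated per group before the aggregate sees them. The `aggregate` helper, the row layout and the variable names are all hypothetical; this illustrates the semantics and is not taken from any SPARQL engine.

```python
from collections import defaultdict


def aggregate(rows, key, args, fn, distinct=False):
    """Group rows by `key`, build one argument tuple per row, apply `fn` per group.

    Hypothetical helper: `rows` are dicts standing in for solutions.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(tuple(row[a] for a in args))

    result = {}
    for group_key, tuples in groups.items():
        if distinct:
            # Keep only the first occurrence of each argument tuple.
            tuples = list(dict.fromkeys(tuples))
        result[group_key] = fn(tuples)
    return result


def sum_first(tuples):
    """SUM over the first (only) argument of each tuple."""
    return sum(t[0] for t in tuples)


rows = [
    {"g": "a", "x": 1},
    {"g": "a", "x": 1},
    {"g": "a", "x": 2},
    {"g": "b", "x": 5},
]

plain = aggregate(rows, "g", ["x"], sum_first)
dedup = aggregate(rows, "g", ["x"], sum_first, distinct=True)
```

With this data, group "a" sums to 4 without DISTINCT and to 3 with it, which shows how much DISTINCT changes the meaning of SUM.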
> ** Star (asterisk) in aggregate functions
>
> Consensus around only allowing asterisk as an argument to COUNT.
Agreed - applies to anything that talks about rows, which is only COUNT.
A custom aggregate can do this:
agg:count()
if a custom aggregate is passed the solution, not the evaluated
arguments for each element of a group. No args means the solution would
work. The document needs to be clear on this.
I don't have a use case but it would seem strange if you can't implement
COUNT as a custom aggregate.
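To make the point concrete, here is a small Python sketch of a custom aggregate that is handed the solutions of a group rather than pre-evaluated arguments. The function name `agg_count` and the dict-per-solution representation are assumptions made for illustration only.

```python
def agg_count(solutions, var=None):
    """Hypothetical custom aggregate that receives the group's solutions.

    With no argument it counts rows (like COUNT(*)); with a variable
    name it counts only the solutions in which that variable is bound.
    """
    if var is None:
        return len(solutions)
    return sum(1 for solution in solutions if var in solution)


# Each dict is one solution; a missing key means the variable is unbound.
group = [{"x": 1, "y": 2}, {"x": 3}, {"y": 4}]

row_count = agg_count(group)        # counts every solution in the group
y_count = agg_count(group, "y")     # counts only solutions binding "y"
```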
> ** ISSUE-15: Syntax for custom aggregates
>
> Mild opinion in favor of having no keyword or special syntax
> for custom aggregate functions
Neutral, currently.
> ** ISSUE-16: Mixed datatypes with built-in aggregates
>
> Consensus that MIN/MAX should use same semantics as ORDER BY,
> with parts (e.g. ordering xsd:string and xsd:dateTime) being
> undefined/implementation defined.
I think this will get confusing with mixed data "1", "9", 1, 2, 3 but
may be acceptable. (Multivaluespace handling is still my preferred design.)
If eval failures are "not in group", casting is OK but the document
must talk about this.
> Consensus that SUM/AVG should use same semantics as +
Clarification: errors not in a group means that what would be
1 + error + 2 => 3
which is not the same as +
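The difference can be sketched in Python. The `ERROR` sentinel and both function names are hypothetical stand-ins: `plus` models an operator that propagates errors, while `sum_skipping_errors` models a SUM where error values are treated as not in the group.

```python
ERROR = object()  # hypothetical sentinel for an expression evaluation error


def plus(a, b):
    """'+' semantics: an error operand makes the whole expression an error."""
    if a is ERROR or b is ERROR:
        return ERROR
    return a + b


def sum_skipping_errors(values):
    """SUM where errors are 'not in the group': they are simply skipped."""
    total = 0
    for value in values:
        if value is ERROR:
            continue
        total = plus(total, value)
    return total


values = [1, ERROR, 2]

chained = plus(plus(values[0], values[1]), values[2])  # error propagates
grouped = sum_skipping_errors(values)                  # error is dropped
```

Here `chained` is the error value while `grouped` is 3, so the two definitions give different answers for the same inputs.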
> ** What happens with type errors that are projected?
>
> Consensus that type errors that are projected should result in that
> solution being discarded.
Agreed.
> ** Trapping type errors?
>
> Consensus that COALESCE is a good way to trap errors.
Agreed but the choice of word is now obscure at best.
You can do a form of default values with this.
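A rough Python sketch of the default-value idiom, assuming COALESCE returns the first argument that evaluates without error to a bound value. `EvalError`, `coalesce` and `to_int` are hypothetical names; arguments are passed as thunks so that a failing expression is only evaluated inside the trap.

```python
class EvalError(Exception):
    """Hypothetical stand-in for a SPARQL expression evaluation error."""


def coalesce(*thunks):
    """Return the first argument that evaluates without error and is bound.

    None stands for an unbound value in this sketch.
    """
    for thunk in thunks:
        try:
            value = thunk()
        except EvalError:
            continue
        if value is not None:
            return value
    raise EvalError("no argument evaluated to a value")


def to_int(text):
    """Cast that raises EvalError on bad input, like a failed xsd:integer cast."""
    try:
        return int(text)
    except (TypeError, ValueError) as exc:
        raise EvalError(str(exc)) from exc


# The cast fails, so the default 0 is used instead.
with_default = coalesce(lambda: to_int("abc"), lambda: 0)
# The cast succeeds, so the default is never consulted.
without_default = coalesce(lambda: to_int("7"), lambda: 0)
```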
> ** Do expressions always need to be aliased to a named variable?
> Mild consensus that aliases should be required.
Disagree mildly. Prefer to allow the engine to invent them. For
results-on-wire, must be a legal variable. For API, who cares?
> ** Syntax for expressions in SELECT list
>
> General lack of satisfaction with either:
>
> * Requiring commas if a projection uses at least one expression
> * Wrapping expressions and aliases with parentheses (brackets)
Would like to allow optional commas everywhere (SELECT, GROUP BY, ORDER BY).
Prefer (?x +?y AS ?z) because some level of () are necessary for any
expression in SPARQL to keep it parsable by a wide variety of
approaches. So might as well include the AS. This is now the leading
approach out there.
> ** Sub-asks and sub-selects in FILTER
>
> General consensus (kasei, axel, steveh, leef) to avoid
> the complexity of any subqueries in FILTERs.
Agreed - the meaning of patterns (scoping of free variables) would need
join-like semantics and is complex. The lack of scalar subSELECTs will
be a potential problem area for consideration but is mitigated by having
named variables in SPARQL.
You can place the scalar select just before the FILTER and AS the result
into a variable. This is not an equivalence, the query pattern may be
slightly different, but you can get the effect as far as I can determine.
Sub-Ask is not the same as (NOT) EXISTS because EXISTS isn't join-ed
with other results.
> ** Sub-constructs in FROM and FROM NAMED
>
> General consensus (kasei, steveh, leef) to avoid the complexity of
> sub-constructs in FROM. Axel is in favor, but willing to cede the point.
Agreed.
> ** ISSUE-13: Subqueries in HAVING
>
> Consensus that this can be done as is with subqueries; no need to add
> here. (kasei, axel, steveh, leef)
Mildly agree with mild worries as for subqueries in FILTERs.
> ** ISSUE-39: Variable scope of alias variables
>
> Consensus that variables on the right-hand side of "AS" (alias
> variables) are not in scope for the rest of the query (including
> projected expressions), but not including outer queries of course.
Disagree - this is an unnecessary restriction and results in needing
additional nesting of SELECTs just to reuse an expression.
Day 2:
> 1. To close ISSUE-47 by noting consensus on keeping MODIFY in the
> Update language, modulo any concerns expressed by Update editors, no
> objections or abstentions
Agree.
> 4. we'll have one update statement, DELETE ... INSERT ... WHERE ...,
> where one of DELETE or INSERT may be omitted, and WHERE is optional,
> and multiple of these may be combined in a string using ";" as the
> separator.
I now prefer DELETE WHERE {}, that is, the pattern becomes the template.
This also means ";" is unnecessary. If a syntax requires the use of ";"
to distinguish two different forms, then I would be very worried (it's
going to be error prone).
Optional ";" is tolerable for convenience, but it is already used in
Turtle with an abbreviation meaning.
> 5. SPARQL Update WHERE clauses will be at least SPARQL 1.0 QUERY,
> with each feature 1.1 adds to SPARQL Query being AT RISK for this. This
> closes ISSUE-27.
I think I know what you mean but this wording is not OK.
I prefer a framing of "SPARQL 1.1 Update uses SPARQL 1.1 Query; but, if
feedback is significant, the WG will define a profile using SPARQL 1.0
Query". i.e. default to SPARQL 1.1. Conformance would explicitly note
1.1 vs 1.0. Supporting just 1.0 patterns is not fully compliant with
"SPARQL 1.1 Update", IMHO.
There are always going to be engines that are incomplete.
Received on Tuesday, 10 November 2009 17:20:22 UTC