- From: Andy Seaborne <andy.seaborne@talis.com>
- Date: Tue, 10 Nov 2009 17:20:04 +0000
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Here is my initial take on what appears to have been a successful face-to-face meeting. A lot has been moved forward.

Lee asked for specific issues to be raised one per email thread, so please change the subject if you reply to anything specifically. This message is my pass over all the points. I'll pull out specific issues if needed later.

I reserve the right to change my mind :-)

 Andy

From notes
http://www.w3.org/2009/sparql/wiki/F2F2_Issue_Discussions
and resolutions
http://www.w3.org/2009/sparql/meeting/2009-11-03

Day 1:

> ISSUE-11: Implicit grouping
>
> Consensus on prohibiting projecting variables/functions on
> variables that are not included in the group by clause.

Agreed.

> ** ISSUE-12: HAVING vs. FILTER as keyword for limiting
> aggregate results
>
> General consensus in favor of using "FILTER" as the keyword,
> with bglimm preferring "HAVING".

I prefer HAVING because of familiarity with SQL. Having both is acceptable.

> ** DISTINCT in aggregate functions
>
> Consensus on allowing DISTINCT with multiple arguments to aggregate
> functions. DISTINCT in this case passes just the DISTINCT tuples
> into the aggregate function (for each group).

I'm unclear why it should be allowed in SUM or AVG. Is there a use case?

We are already handling * differently by aggregate, and DISTINCT seems to only really mean anything there. Are there specific motivating use cases?

Is DISTINCT allowed in custom aggregates? If so, they have different syntax.

I propose that DISTINCT is not allowed for custom aggregates. An aggregate can choose to do that operation as part of its definition, but the DISTINCT and not-DISTINCT forms are then two different URIs naming the aggregate.

> ** Star (asterisk) in aggregate functions
>
> Consensus around only allowing asterisk as an argument to COUNT.

Agreed - applies to anything that talks about rows, which is only COUNT.
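
To make the difference between counting rows and counting evaluated arguments concrete, here is a minimal sketch. It assumes a hypothetical model in which a group is a list of solutions (dicts from variable name to value) and an unbound variable is simply absent; the function names are illustrative and are not taken from any draft. It also assumes that a row whose argument does not evaluate is left out of the count.

```python
# Hypothetical model: a group is a list of solutions (dicts from variable
# name to value); an unbound variable is simply absent from the dict.

def count_star(group):
    """COUNT(*): counts rows (solutions); no expression is evaluated."""
    return len(group)

def count_expr(group, var):
    """COUNT(?var): counts rows in which the argument is bound.

    Assumption: rows where the argument fails to evaluate are skipped.
    """
    return sum(1 for solution in group if var in solution)

def count_distinct(group, *variables):
    """COUNT(DISTINCT ?a ?b ...): counts distinct argument tuples."""
    tuples = {
        tuple(solution[v] for v in variables)
        for solution in group
        if all(v in solution for v in variables)
    }
    return len(tuples)

group = [
    {"x": 1, "y": "a"},
    {"x": 1, "y": "a"},  # duplicate of the first row
    {"x": 2},            # ?y is unbound in this solution
]
```

In this model `count_star(group)` sees all three rows, `count_expr(group, "y")` sees only the two rows where `?y` is bound, and `count_distinct(group, "x", "y")` collapses the duplicate tuples. Only the first of these needs the solution itself rather than evaluated arguments.
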

A custom aggregate can do this, e.g. agg:count(), if it is passed the solution rather than the evaluated arguments for each element of a group. With no args, passing the solution would still work. The document needs to be clear on this. I don't have a use case, but it would seem strange if you can't implement COUNT as a custom aggregate.

> ** ISSUE-15: Syntax for custom aggregates
>
> Mild opinion in favor of having no keyword or special syntax
> for custom aggregate functions

Neutral, currently.

> ** ISSUE-16: Mixed datatypes with built-in aggregates
>
> Consensus that MIN/MAX should use same semantics as ORDER BY,
> with parts (e.g. ordering xsd:string and xsd:dateTime) being
> undefined/implementation defined.

I think this will get confusing with mixed data such as "1", "9", 1, 2, 3, but it may be acceptable. (Multi-value-space handling is still my preferred design.)

If eval failures are "not in group", casting is OK, but the document must talk about this.

> Consensus that SUM/AVG should use same semantics as +

Clarification: if errors are not in a group, then what would be
 1 + error + 2 => 3
which is not the same as +.

> ** What happens with type errors that are projected?
>
> Consensus that type errors that are projected should result in that
> solution being discarded.

Agreed.

> ** Trapping type errors?
>
> Consensus that COALESCE is a good way to trap errors.

Agreed, but the choice of word is now obscure at best. You can do a form of default values with this.

> ** Do expressions always need to be aliased to a named variable?
> Mild consensus that aliases should be required.

Disagree mildly. Prefer to allow the engine to invent them. For results-on-wire, the name must be a legal variable. For an API, who cares?

> ** Syntax for expressions in SELECT list
>
> General lack of satisfaction with either:
>
> * Requiring commas if a projection uses at least one expression
> * Wrapping expressions and aliases with parentheses (brackets)

Would like to allow optional commas everywhere (SELECT, GROUP BY, ORDER BY).

Prefer (?x + ?y AS ?z) because some level of () is necessary for any expression in SPARQL to keep it parsable by a wide variety of approaches. So we might as well include the AS. This is now the leading approach out there.

> ** Sub-asks and sub-selects in FILTER
>
> General consensus (kasei, axel, steveh, leef) to avoid
> the complexity of any subqueries in FILTERs.

Agreed - the meaning of patterns (scoping of free variables) would need join-like semantics and is complex.

The lack of scalar sub-SELECTs is a potential problem area, but it is mitigated by having named variables in SPARQL. You can place the scalar select just before the FILTER and AS the result into a variable. This is not an equivalence, and the query pattern may be slightly different, but you can get the effect as far as I can determine.

Sub-ASK is not the same as (NOT) EXISTS because EXISTS isn't joined with other results.

> ** Sub-constructs in FROM and FROM NAMED
>
> General consensus (kasei, steveh, leef) to avoid the complexity of sub-constructs in FROM. Axel is in favor, but willing to cede the point.

Agreed.

> ** ISSUE-13: Subqueries in HAVING
>
> Consensus that this can be done as is with subqueries; no need to add here. (kasei, axel, steveh, leef)

Mildly agree, with the same mild worries as for subqueries in FILTERs.

> ** ISSUE-39: Variable scope of alias variables
>
> Consensus that variables on the right-hand side of "AS" (alias variables) are not in scope for the rest of the query (including projected expressions), but not including outer queries of course.

Disagree - this is an unnecessary restriction and results in needing additional nesting of SELECTs just to reuse an expression.

Day 2:

> 1. To close ISSUE-47 by noting consensus on keeping MODIFY in the Update language, modulo any concerns expressed by Update editors, no objections or abstentions

Agree.

> 4. we'll have one update statement, DELETE ... INSERT ... WHERE ..., where one of DELETE or INSERT may be omitted, and WHERE is optional, and multiple of these may be combined in a string using ";" as the separator.

I now prefer DELETE WHERE {}, that is, the pattern becomes the template. This also means ";" is unnecessary.

If a syntax requires the use of ";" to distinguish two different forms, then I would be very worried (it's going to be error prone). Optional ";" is tolerable for convenience, but it's already used in Turtle, where it has an abbreviation meaning.

> 5. SPARQL Update WHERE clauses will be at least SPARQL 1.0 QUERY, with each feature 1.1 adds to SPARQL Query being AT RISK for this. This closes ISSUE-27.

I think I know what you mean, but this wording is not OK.

I prefer a framing of "SPARQL 1.1 Update uses SPARQL 1.1 Query; but, if feedback is significant, the WG will define a profile using SPARQL 1.0 Query", i.e. default to SPARQL 1.1.

Conformance would explicitly note 1.1 vs 1.0. Supporting just 1.0 patterns is not fully compliant "SPARQL 1.1 Update", IMHO. There are always going to be engines that are incomplete.
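
As a rough illustration of the DELETE WHERE {} form mentioned under point 4, where the pattern doubles as the delete template, here is a minimal sketch. It assumes a hypothetical model in which a graph is a set of (s, p, o) string triples and pattern terms starting with "?" are variables. The helper names are illustrative only and do not come from the Update draft.

```python
# Hypothetical model: a graph is a set of (s, p, o) string triples; pattern
# terms starting with "?" are variables. Names are illustrative only.

def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result

def solutions(patterns, graph, binding=None):
    """Yield every binding that matches all triple patterns (a basic join)."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for triple in graph:
        extended = match(patterns[0], triple, binding)
        if extended is not None:
            yield from solutions(patterns[1:], graph, extended)

def instantiate(patterns, binding):
    """Substitute bound variables into the patterns to get concrete triples."""
    return {tuple(binding.get(t, t) for t in p) for p in patterns}

def delete_where(graph, patterns):
    """DELETE WHERE { patterns }: the pattern is also the delete template."""
    to_delete = set()
    for binding in solutions(patterns, frozenset(graph)):
        to_delete |= instantiate(patterns, binding)
    return graph - to_delete

graph = {("a", "name", "Alice"), ("a", "age", "30"), ("b", "name", "Bob")}
remaining = delete_where(graph, [("?s", "age", "?o"), ("?s", "name", "?n")])
```

Here only subject "a" matches both patterns, so both of its triples are removed and the triple for "b" remains. No separate template and no ";" separator are needed to express this.
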
Received on Tuesday, 10 November 2009 17:20:22 UTC