- From: Axel Polleres <axel.polleres@deri.org>
- Date: Wed, 25 Aug 2010 18:15:28 +0100
- To: Andy Seaborne <andy.seaborne@epimorphics.com>, SPARQL Working Group <public-rdf-dawg@w3.org>
- Cc: Lee Feigenbaum <lee@thefigtrees.net>, Steve Harris <steve.harris@garlik.com>
Thanks for these very useful examples, Andy! (I think they also brought me to another imprecise formulation in the spec.)

Questions for clarification, to make sure everybody is on the same page here:

1)

> SELECT *
> {
>   { SELECT ?x { ?x ?p ?o } GROUP BY ?x }
>   ?o <p> 123 .
> }

Yup, we want to allow this, right?

2)

> SELECT (count(*) AS ?p) { ?s ?p ?o } GROUP BY ?s
...
> SELECT (SAMPLE(?p) AS ?p) { ?s ?p ?o } GROUP BY ?s

This is seemingly (but, strangely enough, not quite?) in conflict with:

  "The new variable is introduced using the keyword AS; it must not already be potentially bound."

I'd honestly prefer to strengthen this restriction to:

  "The new variable is introduced using the keyword AS; it must not already occur in the WHERE clause."

Funny enough, note that the original "potentially bound" formulation is problematic/imprecise even without aggregates:

  SELECT (?X as ?Y) WHERE { ?S ?P ?X OPTIONAL { ?S ?P ?Y FILTER(?Y != ?Y) } }

Obviously, the FILTER expression can never let ?Y receive a binding... so ?Y is not "potentially bound" and that query would be syntactically ok according to the definition. I guess many will agree that checking static unsatisfiability of FILTER expressions would be a nightmare for parsers :-)

3)

> Personally, I'd be happy with forbidding the use variables of grouping
> expressions:
>
> SELECT (1/(1-?o) AS ?o1) { ?s ?p ?o } GROUP BY (1/(1-?o)) # Forbiddable
> SELECT ?o WHERE { ?s ?p ?o } GROUP BY (1/(1-?o)) # Forbiddable

Without expressing any strong opinion here: this rules out the new test case agg08 or, respectively, turns it into a negativeSyntaxTest. I had assumed for the current version of agg08 that the former would be allowed whereas the latter wouldn't. That's why I had "*or expressions*" in my rewording proposal.

I assume what Andy means here (and which I think holds) is that we could forbid expressions in grouping altogether, since they can always be emulated by subqueries, i.e.

  SELECT (1/(1-?o) AS ?o1) { ?s ?p ?o } GROUP BY (1/(1-?o))

could be written without an expression in the GROUP BY clause as:

  SELECT ?o1 { SELECT (1/(1-?o) AS ?o1) { ?s ?p ?o } } GROUP BY ?o1

So why not do just that and forbid expressions in GROUP BY in the grammar already?

4) BTW, what about

  SELECT * { ?s ?p ?o } GROUP BY ?s

Just to make sure everybody is on the same page here: is this also forbidden?

Thanks,
Axel

On 25 Aug 2010, at 16:37, Andy Seaborne wrote:

>
>
> On 25/08/10 13:33, Axel Polleres wrote:
> > In total, addressing 1) and 2) my current understanding is that we should change:
> >
> > "In aggregate queries and sub-queries variables that appear in the query
> > pattern, but are not grouped by cannot be projected nor used in project
> > expressions. In order to project arbitrary expressions the SAMPLE
> > aggregate may be used."
> >
> > -->
> >
> > "In aggregate queries and sub-queries variables *or expressions* that appear in the query
> > pattern, but are not grouped by cannot be projected, nor be used in project
> > expressions *(except within aggregations)*, *nor be used in HAVING clauses*.
> > In order to project arbitrary expressions the SAMPLE aggregate may be used."
> >
> > The formulation gets a bit heavier, but at least it seems clearer.
>
> Refining this:
>
> We need to forbid the *use* of ungrouped variables in the *specific
> SELECT* expression where the GROUP occurs. Otherwise:
>
> 1/ Use elsewhere in the query should be unaffected otherwise
>
> SELECT *
> {
>   { SELECT ?x { ?x ?p ?o } GROUP BY ?x }
>   ?o <p> 123 .
> }
>
> is illegal (it's a completely different ?o in the second use) which
> makes building queries by composition a nuisance.
>
> 2/ It's the undefined value of a non-key variable that's the issue
> because there isn't a clear value to give it.
>
> Introduction of an alias name is OK: this is being clear about the "use
> in expressions"(1/(1-?o) AS ?o1)
>
> SELECT (count(*) AS ?p) { ?s ?p ?o } GROUP BY ?s
>
> in extremis:
>
> SELECT (SAMPLE(?p) AS ?p) { ?s ?p ?o } GROUP BY ?s
>
> Personally, I'd be happy with forbidding the use variables of grouping
> expressions:
>
> SELECT (1/(1-?o) AS ?o1) { ?s ?p ?o } GROUP BY (1/(1-?o)) # Forbiddable
> SELECT ?o WHERE { ?s ?p ?o } GROUP BY (1/(1-?o)) # Forbiddable
>
> [[
> In aggregate queries and sub-queries, variables that appear in the query
> pattern, but are not used to group the pattern, cannot be projected nor
> used in expressions in SELECT clause nor used in the expression of a
> HAVING clause of this query or sub-query unless they are part of an
> aggregate.
>
> They may be used as alias names.
>
> In order to project arbitrary expressions the SAMPLE aggregate may be used.
> ]]
>
> By saying "expressions" the use as alias names comes for free but it's
> clearer to say so.
>
> Andy
>
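
For reference, the cases in this thread can be laid side by side. This is only a sketch of how the proposed wording (ungrouped variables usable only inside aggregates) would classify a few queries; the alias names ?n and ?anyO are made up for illustration, and the comments describe the proposal as discussed here, not a settled decision of the group. Each query is meant to be read, and run, on its own.

```sparql
# Would be allowed: ?p appears only inside an aggregate, and the alias ?n is a fresh name.
SELECT ?s (COUNT(?p) AS ?n) { ?s ?p ?o } GROUP BY ?s

# Would be rejected: ?o is projected but is neither a grouping key nor inside an aggregate.
SELECT ?s ?o { ?s ?p ?o } GROUP BY ?s

# Would be allowed: SAMPLE picks one value of the ungrouped ?o per group.
SELECT ?s (SAMPLE(?o) AS ?anyO) { ?s ?p ?o } GROUP BY ?s
```

Using fresh alias names such as ?n sidesteps the separate question raised in point 2), namely whether an alias may reuse a variable that already occurs in the WHERE clause.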
Received on Wednesday, 25 August 2010 17:16:04 UTC