- From: Axel Polleres <axel.polleres@deri.org>
- Date: Wed, 25 Aug 2010 18:15:28 +0100
- To: Andy Seaborne <andy.seaborne@epimorphics.com>, SPARQL Working Group <public-rdf-dawg@w3.org>
- Cc: Lee Feigenbaum <lee@thefigtrees.net>, Steve Harris <steve.harris@garlik.com>
Thanks for these very useful examples, Andy! (I think they have led me to another
imprecise formulation in the spec.)
Questions for clarification, to make sure everybody is on the same page here:
1)
> SELECT *
> {
> { SELECT ?x { ?x ?p ?o } GROUP BY ?x }
> ?o <p> 123 .
> }
Yup, we want to allow this, right?
2)
> SELECT (count(*) AS ?p) { ?s ?p ?o } GROUP BY ?s
...
> SELECT (SAMPLE(?p) AS ?p) { ?s ?p ?o } GROUP BY ?s
This is seemingly (but strangely enough not quite?) in conflict with:
"The new variable is introduced using the keyword AS; it must not already be potentially
bound."
I'd honestly prefer to strengthen this restriction to:
"The new variable is introduced using the keyword AS; it must not already occur in the WHERE clause."
Funnily enough, note that the original "potentially bound" formulation is already
problematic/imprecise even without aggregates:
SELECT (?X as ?Y) WHERE { ?S ?P ?X OPTIONAL { ?S ?P ?Y FILTER(?Y != ?Y) } }
Obviously, the FILTER expression can never be satisfied, so the OPTIONAL part never yields a binding for ?Y...
so ?Y is not "potentially bound", and that query would be syntactically OK according to the definition.
I guess many will agree that checking static unsatisfiability of FILTER expressions would be a nightmare for parsers :-)
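
To illustrate why the strengthened wording is easier on parsers, here is a minimal sketch in Python. It is not spec text: the regex and the function names are hypothetical, and a real parser would inspect the parsed pattern rather than raw text. The point is only that "occurs in the WHERE clause" can be decided by collecting variable names, with no reasoning about FILTER satisfiability.

```python
import re

# Toy approximation of a SPARQL variable token (?name or $name). Assumption:
# no string literals or comments in the pattern that contain '?' or '$'.
VAR = re.compile(r"[?$]([A-Za-z_][A-Za-z0-9_]*)")

def where_variables(where_text):
    """Return the names of all variables that occur textually in the pattern."""
    return {m.group(1) for m in VAR.finditer(where_text)}

def alias_allowed(alias, where_text):
    """Strengthened rule: the AS variable must not occur in the WHERE clause."""
    return alias.lstrip("?$") not in where_variables(where_text)

pattern = "{ ?S ?P ?X OPTIONAL { ?S ?P ?Y FILTER(?Y != ?Y) } }"
print(alias_allowed("?Y", pattern))         # False: ?Y occurs, even though it can never be bound
print(alias_allowed("?Z", pattern))         # True
print(alias_allowed("?p", "{ ?s ?p ?o }"))  # False: rejects (count(*) AS ?p) as well
```

Under this rule both the FILTER example and the two aggregate queries from 2) are rejected by the same simple check.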
3)
> Personally, I'd be happy with forbidding the use variables of grouping
> expressions:
>
> SELECT (1/(1-?o) AS ?o1) { ?s ?p ?o } GROUP BY (1/(1-?o)) # Forbiddable
> SELECT ?o WHERE { ?s ?p ?o } GROUP BY (1/(1-?o)) # Forbiddable
Without expressing any strong opinion here: this rules out the new test case agg08 or, rather,
turns it into a negativeSyntaxTest. For the current version of agg08 I had assumed that the
former would be allowed whereas the latter wouldn't. That's why I had "*or expressions*" in
my rewording proposal.
I assume what Andy means here (and which I think holds) is that we could forbid expressions
in grouping altogether, since they can always be emulated by subqueries, i.e.
SELECT (1/(1-?o) AS ?o1) { ?s ?p ?o } GROUP BY (1/(1-?o))
could be written without an expression in the GROUP BY clause as:
SELECT ?o1 { { SELECT (1/(1-?o) AS ?o1) { ?s ?p ?o } } } GROUP BY ?o1
So why not do just that and forbid expressions in GROUP BY in the grammar already?
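
The rewriting above can be sketched on toy data in Python. This is only an illustration under stated assumptions: the ?o values are made up, the function names are hypothetical, and error cases (such as ?o = 1, where the expression divides by zero) are deliberately not modelled.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical ?o values from the solutions of { ?s ?p ?o }.
# They are chosen so that 1 - o is never zero.
o_values = [2, 3, 2, 5, 3, 2]

def key_expr(o):
    """The grouping expression 1/(1-?o), kept exact with Fraction."""
    return Fraction(1, 1 - o)

def group_by_expression(values):
    """GROUP BY (1/(1-?o)): the key is computed while grouping."""
    return Counter(key_expr(o) for o in values)

def group_by_subquery(values):
    """Inner SELECT binds ?o1 first; the outer query groups by plain ?o1."""
    inner_o1 = [key_expr(o) for o in values]  # { SELECT (1/(1-?o) AS ?o1) ... }
    return Counter(inner_o1)                  # ... GROUP BY ?o1

# Both formulations produce the same groups with the same sizes.
print(group_by_expression(o_values) == group_by_subquery(o_values))  # True
```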
4) BTW, what about
SELECT * { ?s ?p ?o } GROUP BY ?s
Just to make sure everybody is on the same page here: is this also forbidden?
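
A small Python sketch of how the wording proposed in this thread would apply here, assuming that * is read as shorthand for all variables of the pattern (that reading is my assumption, and the function name is hypothetical):

```python
def ungrouped_projections(projected, group_keys):
    """Return projected plain variables that are not group keys, sorted."""
    return sorted(set(projected) - set(group_keys))

pattern_vars = ["s", "p", "o"]  # variables of { ?s ?p ?o }, i.e. what * would expand to

print(ungrouped_projections(pattern_vars, ["s"]))  # ['o', 'p']: ?p and ?o are not grouped by
print(ungrouped_projections(["s"], ["s"]))         # []: SELECT ?s ... GROUP BY ?s is fine
```

Under that reading, the query would project the ungrouped variables ?p and ?o.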
Thanks,
Axel
On 25 Aug 2010, at 16:37, Andy Seaborne wrote:
>
>
> On 25/08/10 13:33, Axel Polleres wrote:
> > In total, addressing 1) and 2) my current understanding is that we should change:
> >
> > "In aggregate queries and sub-queries variables that appear in the query
> > pattern, but are not grouped by cannot be projected nor used in project
> > expressions. In order to project arbitrary expressions the SAMPLE
> > aggregate may be used."
> >
> > -->
> >
> > "In aggregate queries and sub-queries variables *or expressions* that appear in the query
> > pattern, but are not grouped by cannot be projected, nor be used in project
> > expressions *(except within aggregations)*, *nor be used in HAVING clauses*.
> > In order to project arbitrary expressions the SAMPLE aggregate may be used."
> >
> > The formulation gets a bit heavier, but at least it seems clearer.
>
> Refining this:
>
> We need to forbid the *use* of ungrouped variables in the *specific
> SELECT* expression where the GROUP occurs. Otherwise:
>
> 1/ Use elsewhere in the query should be unaffected otherwise
>
> SELECT *
> {
> { SELECT ?x { ?x ?p ?o } GROUP BY ?x }
> ?o <p> 123 .
> }
>
> is illegal (it's a completely different ?o in the second use) which
> makes building queries by composition a nuisance.
>
> 2/ It's the undefined value of a non-key variable that's the issue
> because there isn't a clear value to give it.
>
> Introduction of an alias name is OK: this is being clear about the "use
> in expressions"(1/(1-?o) AS ?o1)
>
> SELECT (count(*) AS ?p) { ?s ?p ?o } GROUP BY ?s
>
> in extremis:
>
> SELECT (SAMPLE(?p) AS ?p) { ?s ?p ?o } GROUP BY ?s
>
> Personally, I'd be happy with forbidding the use variables of grouping
> expressions:
>
> SELECT (1/(1-?o) AS ?o1) { ?s ?p ?o } GROUP BY (1/(1-?o)) # Forbiddable
> SELECT ?o WHERE { ?s ?p ?o } GROUP BY (1/(1-?o)) # Forbiddable
>
> [[
> In aggregate queries and sub-queries, variables that appear in the query
> pattern, but are not used to group the pattern, cannot be projected nor
> used in expressions in SELECT clause nor used in the expression of a
> HAVING clause of this query or sub-query unless they are part of an
> aggregate.
>
> They may be used as alias names.
>
> In order to project arbitrary expressions the SAMPLE aggregate may be used.
> ]]
>
> By saying "expressions" the use as alias names comes for free but it's
> clearer to say so.
>
> Andy
>
Received on Wednesday, 25 August 2010 17:16:04 UTC