- From: Axel Polleres <axel.polleres@deri.org>
- Date: Thu, 26 Aug 2010 15:46:03 +0100
- To: "Andy Seaborne" <andy.seaborne@epimorphics.com>
- Cc: "SPARQL Working Group" <public-rdf-dawg@w3.org>, "Lee Feigenbaum" <lee@thefigtrees.net>, "Steve Harris" <steve.harris@garlik.com>
On 26 Aug 2010, at 14:48, Andy Seaborne wrote:

>
>
> On 25/08/10 22:24, Axel Polleres wrote:
> >>> Any opinions on this? This actually worries me about the current "potentially bound" wording.
> >>
> >> If we want a static analysis of the query, then regard ?Y as potentially
> >> bound.
> >
> > We'd need to explain/define the exact reading of "regard as potentially bound". As it stands it is unclear. My example *could* be detected by static analysis, if the static analyser was able to detect *statically* unsatisfiable FILTER expressions, such as (?Y != ?Y), so it is not clear why ?Y should be regarded as potentially bound in my example. I am fine with any formulation which has a clear definition, the current wording unfortunately does not.
>
> In general, analysing expressions for non-satisfiability is not
> practical.

we are in wild agreement on this! ...

> Only some simple cases like your example are possible as are
> other forms optimizing compilers might notice. Once reordering and
> equivalence are added, the complexity cost grows. And what about
> structural invariances like FILTER(?Y>45 && isIRI(?Y)) or
> data-introduced FILTER(?ageInYear < -10)?
>
> A practical scheme is based on sites where variables are bound in BGP
> within patterns.
>
> This relates to the GROUP BY and expression handling. Detecting when
> one expression (in the SELECT line) uses another (from the GROUP BY) is
> complex because complex expressions can be written in different ways yet
> be equivalent expressions.
>
> Therefore, I suggest we do not require implementations to be able to
> perform such comparisons.

... and this!

>
> My outline definition of potentially bound is a practical algorithm
> based on just the points where a variable can be bound (not filtered
> out). There are only a few places where terms can be bound to variables.

Ok, my main point was that we'd need to have this defined in the spec clearly, which currently isn't the case.
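To illustrate what a definition based purely on binding sites could look like, here is a minimal Python sketch. It is an assumption for illustration only, not wording from the spec or from either proposal: the classes `BGP`, `Filter` and `Combine` and the function `potentially_bound` are hypothetical names, and the pattern model is deliberately tiny. The point is that FILTER expressions are never inspected, so ?Y stays potentially bound even under an unsatisfiable filter such as (?Y != ?Y).

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class BGP:
    """A basic graph pattern: the only binding site in this toy model."""
    triples: Tuple[Tuple[str, str, str], ...]


@dataclass(frozen=True)
class Filter:
    """A FILTER over a pattern; the expression is opaque text, never analysed."""
    expr: str
    inner: "Pattern"


@dataclass(frozen=True)
class Combine:
    """A join, OPTIONAL or UNION of two patterns (hypothetical simplification)."""
    kind: str
    left: "Pattern"
    right: "Pattern"


Pattern = Union[BGP, Filter, Combine]


def potentially_bound(p: Pattern) -> FrozenSet[str]:
    """Collect every variable that occurs at a binding site, ignoring filters."""
    if isinstance(p, BGP):
        return frozenset(
            term for triple in p.triples for term in triple if term.startswith("?")
        )
    if isinstance(p, Filter):
        # Filters can only remove solutions; they are not examined here.
        return potentially_bound(p.inner)
    return potentially_bound(p.left) | potentially_bound(p.right)


# Hypothetical query shape: { ?X :p ?Y  FILTER(?Y != ?Y) }
query = Filter("?Y != ?Y", BGP((("?X", ":p", "?Y"),)))
print(sorted(potentially_bound(query)))  # prints ['?X', '?Y']
```

Under this reading the check is purely syntactic and cheap, at the price of reporting variables that can never be bound in any actual solution.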
> > My proposal for rewording was maybe too restrictive, but it was clearly checkable statically.
> > BTW, the current wording for "SELECT *" is equally ambiguous.
> >
> > [...]
> >
> >>>> Unnecessarily severe.
> >>>
> >>> Fair enough, if we can afford it. Though it seems that expressions in GROUP BY are strictly speaking not necessary, and seem to be replaceable quite easily, so I wouldn't consider this restriction severe.
> >>
> >> It's severe because it's the corner case driving the main design. And
> >> you were arguing for shorter syntax.
> >
> > Yes, but your version leaves us with something very restricted, it seems.
> > you say you'd disallow agg08...
>
> Why is it "very restricted"? You seem to restrict more than I would.
>
> It's a restriction but I don't see it as /very/ restricting, especially
> as you have already shown that if the app needs the value of the
> grouping returned it can do so using a nested SELECT.
>
> The balance is the difficulty of determining whether one expression is a
> sub-expression of another, including reordering and rewriting.
>
> Consider
>
> GROUP BY (1/?o)
>
> then
>
> SELECT (fn:floor(1/(-2*?o))+count(*))

Sure, but I had meant to allow only the *exact same* expression as the grouped expression as subexpression.

> is theoretically safe. When two or more variables are involved, it gets
> complicated.
>
> >> agg08 uses an expression for GROUP BY. I am suggesting, as a
> >> simplification, that it does not put ?O1 and ?O2, not (?O1+?O2), as
> >> legal uses in an expression in the SELECT clause.
> >
> > That would be a quite different query, wouldn't it? Can you show me what exactly your simplification means for the agg08 query?
>
> agg08 would be an error because it uses variables in an expression which
> are not key variables of the group.
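The two readings discussed above can be contrasted in a small Python sketch. Everything here is an illustrative assumption: expressions are modelled as nested tuples `(operator, arg, ...)`, the helper names are hypothetical, and aggregates such as count(*) are left out for brevity. `allowed_variables_only` follows the "key variables only" rule, while `allowed_exact_match` follows the "exact same expression as subexpression" rule.

```python
def variables(expr):
    """Return the variables (strings starting with '?') in a nested-tuple expression."""
    if isinstance(expr, str):
        return {expr} if expr.startswith("?") else set()
    if isinstance(expr, tuple):
        # expr[0] is the operator name; the remaining items are arguments.
        return set().union(*(variables(arg) for arg in expr[1:]))
    return set()  # numeric constant


def allowed_variables_only(expr, key_vars):
    """Variables-only rule: every variable must be a plain GROUP BY variable."""
    return variables(expr) <= set(key_vars)


def strip_exact(expr, grouped):
    """Replace each exact, syntactic copy of a grouped expression with a placeholder."""
    if expr in grouped:
        return "KEY"
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(strip_exact(arg, grouped) for arg in expr[1:])
    return expr


def allowed_exact_match(expr, grouped):
    """Exact-match rule: no variable may remain outside copies of grouped expressions."""
    return not variables(strip_exact(expr, grouped))


grouped = [("/", 1, "?o")]                       # GROUP BY (1/?o)
reuse = ("+", ("/", 1, "?o"), 1)                 # (1/?o) + 1
rewritten = ("floor", ("/", 1, ("*", -2, "?o")))  # fn:floor(1/(-2*?o))

print(allowed_exact_match(reuse, grouped))       # prints True
print(allowed_exact_match(rewritten, grouped))   # prints False
print(allowed_variables_only(reuse, []))         # prints False
```

The rewritten expression is constant within each group, yet the syntactic check rejects it; recognising it would need the reordering and rewriting analysis that the thread considers too costly to require.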
>
> > Let me try to understand again what you propose:
> > - you want to allow only grouped variables being projected or used in project expressions
>
> Yes, understanding "grouped variables" as variables used in GROUP BY,
> but not in an expression.
>
> > - you additionally want to allow grouping by expressions, but the grouped expressions are not reusable in the SELECT clause.
> > yes?
>
> Yes. I'm following the current doc which allows grouping by expression
> (syntax and definition reading ExprList as a list of expressions).
>
> Group(ExprList, Ω) =
>   { ListEval(ExprList, μ) ->
>       { μ' | μ' in Ω, ListEval(ExprList, μ) = ListEval(ExprList, μ') }
>     | μ in Ω }
>
> > If so, it seems our arguments run a bit past each other...
> > You seem to propagate a stronger restriction than me for GROUPing, but a weaker restriction than mine for variables allowed as names in project expressions?
>
> I suggest a stronger restriction on the variables allowed in project
> expressions (and projections and HAVING) in that it only considers
> variables. This is because of the complexity of determining whether one
> expression is "safe" given an expression used for GROUP BY.
>
> Otherwise we are trying to allow:
>
> SELECT (?o1+?o2 AS ?o3) ... GROUP BY (?o2+?o1)
> SELECT (1/(?o1+?o2) AS ?o3) ... GROUP BY (?o2+?o1)
>

Would've been allowed in my current understanding, but am not religious about it.

> unclear about:
> SELECT (fn:floor(2*?o1+2*?o2) AS ?o3) ... GROUP BY (?o2+?o1)
>

not allowed in my understanding

> but not
> SELECT ?o1 ... GROUP BY (?o2+?o1)

clearly not allowed in my understanding (for obvious reasons... different ?o1 values can contribute to the same (?o2+?o1) values; actually that is what agg08 should demonstrate).
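A small Python sketch of the Group(ExprList, Ω) definition quoted above makes the last point concrete. It is an illustration under stated assumptions, not the draft's algorithm: solutions μ are modelled as plain dicts, expressions as Python callables, and the names `list_eval`, `group` and the sample data are hypothetical.

```python
from collections import defaultdict


def list_eval(expr_list, mu):
    """ListEval: evaluate each grouping expression against one solution mu."""
    return tuple(expr(mu) for expr in expr_list)


def group(expr_list, omega):
    """Group: map each key to the solutions in omega that evaluate to that key."""
    groups = defaultdict(list)
    for mu in omega:
        groups[list_eval(expr_list, mu)].append(mu)
    return dict(groups)


# Made-up solution sequence, grouped by (?o2 + ?o1).
omega = [
    {"o1": 1, "o2": 3},
    {"o1": 2, "o2": 2},
    {"o1": 5, "o2": 0},
]
groups = group([lambda mu: mu["o2"] + mu["o1"]], omega)

for key, members in groups.items():
    # Show which ?o1 values occur inside each group.
    print(key, sorted({mu["o1"] for mu in members}))
# prints:
# (4,) [1, 2]
# (5,) [5]
```

The group with key 4 contains two different ?o1 values, so projecting ?o1 from that group has no single well-defined value, whereas the grouped expression itself is constant per group by construction.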
>
> I don't think that removing the possibility of GROUP BY with an
> expression would be particularly serious; however, there is no reason to
> forbid it (the issue is expressions in SELECT with constant value within
> a group, not the GROUP BY clause) and it is in the current draft.
>
> I'm not sure what you propose. You have mentioned no expressions in
> GROUP BY and also allowing reuse of the same expression used in the
> GROUP BY in the select expressions.

yes.

> For the latter, I haven't seen what
> equivalence of expressions,

syntactical equivalence, but...

> We could be consistent with SELECT expressions and go so far as to require the AS if an expression is used.

... that sounds reasonable to me as well.

Axel
Received on Thursday, 26 August 2010 14:46:37 UTC