- From: Axel Polleres <axel.polleres@deri.org>
- Date: Tue, 17 Apr 2007 18:42:50 +0100
- To: public-rdf-dawg-comments@w3.org
p.s.: Short clarification: I wrote these comments on your draft also on behalf of the RIF-WG, but unfortunately the group has not yet had a chance to review and discuss them, so for now please just take them as my personal comments. Hopefully, the RIF-WG will forward you additional comments, endorsed by the Working Group, in the next week or two.

thanks,
axel

Axel Polleres wrote:
> Dear all,
>
> below my review on the current SPARQL draft from
>
> http://www.w3.org/TR/rdf-sparql-query/
>
> on behalf of W3C member organization DERI Galway.
>
> Generally, I think the formal definitions have improved a lot, but
> at the same time I am still not 100% sure that all definitions are
> formally water-proof. This mainly affects questions on Section 12 and
> partly unclear definitions/pseudocode algorithms for query evaluation
> therein.
>
> HTH,
> best,
> Axel
>
>
> -------
>
> Detailed comments:
>
>
> Prefix notation is still not aligned with Turtle. Why?
> Would it make sense to align with Turtle and
> use/allow '@prefix' instead of/in addition to 'PREFIX'?
> You also have two ways of writing variables... so, why not?
>
>
> Section 4.1.1
>
> The single quote seems to be missing after the table in sec 4.1.1
> in "", or is this '"'?
>
> Section 4.1.4
>
> The form
>
> [ :p "v" ] .
>
> looks very awkward to me!
>
> I don't find the grammar snippet for ANON very helpful here, without
> an explanation of what WS is... shouldn't that be a
> PropertyListNotEmpty instead?
>
>
> Section 5
>
> Section 5 is called Graph patterns and has only subsections
> 5.1 and 5.2 for basic and group patterns, whereas the other types are
> given separate top-level sections.. this structuring seems a bit
> illogical.
>
>
> Why the restriction that a blank node label can only be used in a single
> basic graph pattern? And if so, isn't the remark that the scope is the
> enclosing basic graph pattern redundant?
>
> Why the section about "extending basic graph pattern matching" here,
> when not even basic graph pattern matching has been properly introduced
> yet? If you only want to introduce informally what kind of matching you
> are talking about here, then I'd call section 5.1.2 simply "Basic Graph
> Pattern Matching", but I think I'd rather suggest to drop this section.
>
>
> "with one solution requiring no bindings for variables"
> -->
> rather:
> "with one solution producing no bindings for variables"
> or:
> "with one solution that does not bind any variables"
>
> Section 5.2.3
>
> Why do you have a separate examples subsection here? It seems
> superfluous/repetitive. Just put the last example, which seems to be the
> only new one, inside Sec 5.2.1 where it seems to fit, and drop the two
> redundant ones. For the first one, you
> could add "and that basic pattern consists of two triple patterns" to the
> first example in sec 5.2; for the second one, add the remark that "the
> FILTER does not break the basic graph pattern into two basic graph
> patterns" to the respective example in section 5.2.2.
>
>
> Section 6:
>
> One overall question which I didn't sort out completely so far:
> What if I mix OPTIONAL with FILTERs?
>
> ie.
>
> {A OPTIONAL B FILTER F OPTIONAL C}
>
> is that:
>
> {{A OPTIONAL B} FILTER F OPTIONAL C}
>
> or rather
>
> {{A OPTIONAL B FILTER F} OPTIONAL C}
>
> and: would it make a difference? I assume not, the filter is, in both
> cases, at the level of A, but I am not 100% sure. Maybe such an example
> would be nice to have...
>
>
> Another one about FILTERs: What about this one, ie. a FILTER which
> refers to the outside scope:
>
> ?x p o OPTIONAL { FILTER (?x != s) }
>
> concrete example:
>
> SELECT ?n ?m
> { ?x a foaf:Person . ?x foaf:name ?n .
> OPTIONAL { ?x foaf:mbox ?m FILTER (?n != "John Doe") } }
>
> Suppresses the email address for John Doe in the output!
> Note: This one is interesting, since the OPTIONAL part may NOT be
> evaluated separately, but carries over a binding from the super-pattern!
>
> Do you have such an example in the testsuite? It seems that the last
> example in Section 12.2.2 goes in this direction, more on that later.
>
> Would it make sense to add some non-well-defined OPTIONAL patterns,
> following [Perez et al. 2006] in the document? As mentioned before, I
> didn't yet check section 12, maybe these corner case examples are there..
>
>
> Section 7:
>
> Why "unlike an OPTIONAL pattern"? This is comparing apples with pears...
> I don't see the motivation for this comparison, so I would suggest to
> delete the part "unlike an OPTIONAL pattern".
>
>
> as described in Querying the Dataset
> -->
> as described in Section 8.3 "Querying the Dataset"
>
>
> Section 8
>
> The example in section 8.2.3 uses GRAPH although GRAPH hasn't been
> explained yet. Either remove this section or start section 8.3 before
> it; I think GRAPH should be introduced before giving an example using it.
>
> <you may ignore this comment>
> BTW: Would be cool to have a feature creating a merge from named graphs
> as well...
>
> ie. I can't have something like
> GRAPH g1
> GRAPH g2 { P }
>
> where the merge of g1 and g2 is taken for evaluating P,
> whereas I can do this at the top level by several FROM clauses.
> (Note this is rather a wish-list comment than a problem with the current
> spec; it probably might be difficult to define in combination with
> variables...)
> </you may ignore this comment>
>
> Section 8.2.3 makes more sense after the 8.3 examples, and 8.3.2 is
> simpler than 8.3.1, so I'd suggest the following order of subsections
> in 8.3:
>
> 8.3.2
>
> 8.3.1
>
> 8.3.3
>
> 8.2.3
>
> 8.3.4 (note that this example somewhat overlaps with what is shown in
> 8.2.3 already, but fine to have both, I guess.)
>
>
> Section 9:
>
> What is "reduced" good for?
> I personally would tend to make reduced the
> default, and instead put a modifier "STRICT" or "WITHDUPLICATES" which
> enforces that ALL non-unique solutions are displayed.
>
> "Offset: control where the solutions start from in the overall solution
> sequence."
>
> maybe it would be nice to add: "[...] in the overall solution sequence,
> i.e., offset takes precedence over DISTINCT and REDUCED"
>
> at least, the formulation "in the overall solution sequence" would
> suggest this... however, right afterwards you say:
> "modifiers are applied in the order given by the list above"... this
> seems to somehow contradict the "in the overall solution sequence", so
> then you should modify this to:
> "in the overall solution sequence, after application of solution
> modifiers with higher precedence" and give an explicit precedence to
> each solution modifier....
>
> <you may ignore this comment>
> BTW: Why is the precedence of solution modifiers not simply the order in
> which they are given in a query? wouldn't that be the simplest thing to do?
>
> ie.
>
> OFFSET 3
> DISTINCT
>
> would be different than
>
> DISTINCT
> OFFSET 3
>
> depending on the order.
> Anyway, if you want to (which you probably do) stick with what you have
> now, it would at least be easier to read if you'd take the suggestion
> with explicit precedence levels for each modifier.
> </you may ignore this comment>
>
>
> Section 9.1
>
> The ORDER BY construct allows arbitrary constraints/expressions as
> parameter...ie. you could give an arbitrary constraint condition here,
> right? What is the order of that? TRUE > FALSE? Would be good to add a
> remark on that.
>
> I would put 'ASCENDING' and 'DESCENDING' in normal font, since they look
> like keywords here, but the respective keywords are ASC and DESC.
>
> Stupid Question: What is the "codepoint representation"? ... Since more
> people might be stupid, maybe a reference is in order.
>
>
> What is a "fixed, arbitrary order"???
> Why not simply change
>
> "SPARQL provides a fixed, arbitrary order"
> -->
> "SPARQL fixes an order"
>
> and
>
> "This arbitrary order"
> -->
> "This order"
>
> I'd also move the sentence starting with "This order" after the
> enumeration.
>
>
> Note that I think you could write the grammar for OrderCondition
> shorter:
>
> Wouldn't simply
> orderCondition ::= ( 'ASC' | 'DESC' )? (Constraint | Var)
> do?
>
> In the paragraph above the grammar snippet, you forgot the ASK result
> form where ORDER BY also doesn't play a role, correct?
>
> Sec 9.2:
>
> Add somewhere in the prose: "using the SELECT result form"...
>
> It is actually a bit weird that you mix SELECT into the solution
> modifiers. IMO, it would be better to mention SELECT first in section 9
> and then introduce the solution modifiers.
>
> Sec 9.3:
>
> REDUCED also allows duplicates, or no? You mention before that REDUCED
> only *permits* elimination of *some* duplicates... so, delete the "or
> REDUCED" in the first sentence.
>
>
> Sec 9.4:
> As for REDUCED, as mentioned earlier, my personal feeling is that
> REDUCED, or even DISTINCT, should be the default, since it is less
> committing, and I'd on the contrary put an alternative keyword "STRICT"
> or "WITHDUPLICATES" which has the semantics that really ALL solutions
> with ALL duplicates are given. My personal feeling is that
> aggregates, which you mention in the "Warning" box, only make
> sense in connection with DISTINCT anyway. Or you should include a good
> example where they do not...
>
> Sec 9.5/9.6:
>
> OFFSET 0 has no effect, LIMIT 0 obviously makes no sense since the
> answer is always the empty solution set... So why not simply allow
> only positive integers for both? I see no benefit in allowing 0 at all.
>
> Section 10:
>
> "query form" or "result form"? I'd suggest to use one of the two
> consistently and not switch. Personally, I'd prefer "result form"...
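
To make the Section 9 point about modifier precedence concrete: applying OFFSET before or after DISTINCT can yield different solution sequences. The following is only a minimal Python sketch. The list of strings standing in for a solution sequence and the helper names `distinct` and `offset` are illustrative assumptions, not anything defined in the SPARQL draft.

```python
# Hypothetical stand-in for a solution sequence: each string is one solution.
solutions = ["a", "a", "b", "c"]


def distinct(seq):
    """Drop duplicate solutions, keeping the first occurrence of each."""
    seen = set()
    out = []
    for s in seq:
        if s not in seen:
            seen.add(s)
            out.append(s)
    return out


def offset(seq, n):
    """Skip the first n solutions of the sequence."""
    return seq[n:]


# DISTINCT applied first, then OFFSET 1.
distinct_then_offset = offset(distinct(solutions), 1)

# OFFSET 1 applied first, then DISTINCT.
offset_then_distinct = distinct(offset(solutions, 1))

print(distinct_then_offset)
print(offset_then_distinct)
```

In this toy sequence the two orders disagree, which is why an explicit precedence for each modifier (or order-of-appearance semantics) has to be stated somewhere.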
>
> Section 10.1
>
> As for the overall structure, it might make sense to have the whole
> section 10 before 9, since modifiers are anyway only important for
> SELECT, and then you could skip the part on projection in section 9, as
> SELECT is anyway not a solution modifier but a result form...
> You should call it also "projection" in section 10.1, ie. what I suggest
> is basically merging section 10.1 and 9.2.
>
>
> Section 10.2
>
> CONSTRUCT combines triples "by set union"?
> So, I need to eliminate duplicate triples if I want to implement
> CONSTRUCT in my SPARQL engine?
> Is this really what you wanted? In case of doubt, I'd suggest to
> remove "by set union", or respectively, analogously to SELECT,
> introduce a DISTINCT (or alternatively a WITHDUPLICATES)
> modifier for CONSTRUCT...
>
> BTW, I miss the semantics for CONSTRUCT given formally in Section 12.
>
>
> Section 10.2.1
>
> <you may ignore this comment>
> What if I want a single blank node connecting all solutions? That would
> be possible, if I could nest constructs in the FROM part...
> </you may ignore this comment>
>
>
> Section 10.2.3
>
> Hmm, now you use order by, whereas you state before in Section 9.1 that
> ORDER BY has no effect on CONSTRUCT... ah, I see, in combination with
> LIMIT!
> So, would it make sense in order to emphasize what you mean, to change
> in section
> 9.1
>
> "Used in combination"
> -->
> "However, note that used in combination"
>
> 10.3/10.4
>
> I think that ASK should be mentioned before the informative DESCRIBE,
> thus I suggest to swap these two sections.
>
> Section 11
>
> - Any changes in the FILTER handling from the last version? Is there a
> changelog?
> - As mentioned earlier, I am a bit puzzled about the "evaluation" of
> Constraints given as an argument to ORDER BY especially since there you
> don't want to take the EBV but the actual value to order the solutions.
> (Note that what it means that a solution sequence "satisfies an order
> condition" is also not really formally defined in Section 12!)
>
> Apart from that, I did not check the section in all detail again since it
> seems to be similar to the prev. version, but some comments still:
>
> "equivilence"?
> Do you mean equivalence? My dictionary doesn't know that word.
>
> The codepoint reference should already be given earlier, as mentioned
> above.
>
>
> Section 11.3.1
>
> The operator extensibility makes me a bit worried as for the
> nonmonotonic behavior of '! bound':
> In combination with '! bound', does it still hold that
> "SPARQL extensions" will produce at least the same solutions as an
> unextended implementation and may, for some queries, produce more
> solutions... I have an uneasy feeling here, though not substantiated by
> proof/counterexample.
>
>
> Section 12 :
>
> 12.1.1
>
>
> Is the necessity that the u_i's are distinct in the dataset really
> important?
> Why not also define the data corresponding to the respective URI as
> graph merge then, like the default graph?
>
>
> 12.2
>
> The two tables suggest there is a correlation between the patterns and
> modifiers appearing in the same line of the table, which is not the case.
>
> Also, why in the first table are RDF Terms and triple patterns in one
> line and not separate?
>
> Why do you write
> FILTER(Expression)
> but not
> ORDER BY (Expression)
> as the syntax suggests?
>
> Moreover, the tables should be numbered.
>
> You first use the abbreviation BGP for Basic graph pattern in the second
> table, where it hasn't been introduced. Actually, it would be more
> intuitive if you'd use actual *symbols* for your algebra, like e.g. the
> ones from traditional Relational Algebra, as was done in
> [Perez et al. 2006].
>
> "The result of converting such an abstract syntax tree is a SPARQL query
> that uses these symbols in the SPARQL algebra:"
> -->
> "The result of converting such an abstract syntax tree is a SPARQL query
> that uses the following symbols in the SPARQL algebra:"
> or maybe even better:
> "The result of converting such an abstract syntax tree is a SPARQL query
> that uses the symbols introduced in Table 2 in the SPARQL algebra:"
>
> What is "ToList"?
>
> 12.2.1
>
> The steps here refer to the grammar?
> The steps obviously take the parse tree nodes of the grammar as the
> basis...
> anyway this is neither explained nor entirely clear.
>
> then connected with 'UNION'
> -->
> connected with 'UNION'
>
> What do you mean by
>
> "We introduce the following symbols:"
>
> 1) what you define here is not 'symbols'
> 2) This doesn't seem to be a proper definition but just a bullet
> list without further explanation.
>
> As said before, the symbols should indeed be symbols and be defined
> properly in section 12.2 with the tables, in my opinion.
>
> The algorithm for the transformation is a bit confusing, IMO. It seems
> to be pseudo-code for a recursive algorithm, but it is not clear where
> there are recursive calls.
>
> Is the observation correct that in this algebra (following the algorithm)
>
> A OPTIONAL {B FILTER F}
>
> would be the same as
>
> A FILTER F OPTIONAL {B}
>
> ???
>
> ie, both result in:
>
> LeftJoin(A,B,F)
>
> That is not necessarily intuitive in my opinion.
> Take the concrete example from above:
>
> SELECT ?n ?m
> { ?x a foaf:Person . ?x foaf:name ?n .
> OPTIONAL { ?x foaf:mbox ?m FILTER (?n != "John Doe") } }
>
> As I said, in my understanding, this query could be used to suppress
> email addresses for a particular name, whereas the algorithm suggests
> that this is just the same as writing:
>
> SELECT ?n ?m
> { ?x a foaf:Person . ?x foaf:name ?n . FILTER (?n != "John Doe")
> OPTIONAL { ?x foaf:mbox ?m } }
>
> Is this intended?
> If yes, the last example of section 12.2.2 is wrong.
>
> BTW: If so, it seems that the whole second part of the algorithm can be
> simplified to:

--
Dr. Axel Polleres
email: axel@polleres.net url: http://www.polleres.net/
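
The OPTIONAL/FILTER question raised under Section 6 and 12.2.1 above can be illustrated with a small Python sketch. It assumes the usual left-join definition over solution mappings (keep every extension of a left-hand solution that passes the condition, otherwise keep the left-hand solution unextended); whether the draft's algorithm actually produces this operator for both queries is exactly the open question, so nothing here should be read as the draft's semantics. The data, the blank node labels, and the helper names (`compatible`, `left_join`, `filter_solutions`, `project`) are all hypothetical.

```python
def compatible(m1, m2):
    """Two solution mappings are compatible if they agree on shared variables."""
    return all(m2[var] == val for var, val in m1.items() if var in m2)


def left_join(omega1, omega2, expr):
    """Assumed LeftJoin: extend each left solution where expr holds, else keep it as is."""
    out = []
    for m1 in omega1:
        merged = [{**m1, **m2} for m2 in omega2 if compatible(m1, m2)]
        kept = [m for m in merged if expr(m)]
        out.extend(kept if kept else [m1])
    return out


def filter_solutions(expr, omega):
    """Keep only the solutions for which expr holds."""
    return [m for m in omega if expr(m)]


def project(omega, variables):
    """SELECT-style projection; unbound variables become None."""
    return [tuple(m.get(v) for v in variables) for m in omega]


# Hypothetical solutions of A = { ?x a foaf:Person . ?x foaf:name ?n }
A = [{"x": "_:a", "n": "John Doe"}, {"x": "_:b", "n": "Jane Roe"}]
# Hypothetical solutions of B = { ?x foaf:mbox ?m }
B = [{"x": "_:a", "m": "mailto:john@example.org"},
     {"x": "_:b", "m": "mailto:jane@example.org"}]


def not_john(m):
    """The condition F: ?n != "John Doe"."""
    return m["n"] != "John Doe"


# Reading 1: A OPTIONAL { B FILTER F } as LeftJoin(A, B, F).
reading_optional_filter = project(left_join(A, B, not_john), ["n", "m"])

# Reading 2: A FILTER F OPTIONAL { B } as a filter over an unconditional left join.
reading_outer_filter = project(
    filter_solutions(not_john, left_join(A, B, lambda m: True)), ["n", "m"]
)

print(reading_optional_filter)
print(reading_outer_filter)
```

Under these assumptions the first reading keeps John Doe with an unbound ?m, while the second drops him entirely, so mapping both queries to LeftJoin(A,B,F) would lose one of the two intuitive meanings.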
Received on Tuesday, 17 April 2007 17:43:02 UTC