- From: Axel Polleres <axel.polleres@deri.org>
- Date: Tue, 17 Apr 2007 18:42:50 +0100
- To: public-rdf-dawg-comments@w3.org
p.s.: Short clarification: I wrote these comments on your draft also on
behalf of the RIF-WG, but unfortunately the group has not yet had a
chance to review and discuss them, so for now please just take them as
my personal comments. Hopefully, the RIF-WG will forward you additional
comments, endorsed by the Working Group, in the next week or two.
thanks,
axel
Axel Polleres wrote:
> Dear all,
>
> below is my review of the current SPARQL draft from
>
> http://www.w3.org/TR/rdf-sparql-query/
>
> on behalf of W3C member organization DERI Galway.
>
> Generally, I think the formal definitions have improved a lot, but I am
> still not 100% sure that all definitions are formally watertight. This
> mainly concerns Section 12 and the partly unclear definitions/pseudocode
> algorithms for query evaluation therein.
>
> HTH,
> best,
> Axel
>
>
> -------
>
> Detailed comments:
>
>
> Prefix notation is still not aligned with Turtle. Why?
> Would it make sense to align with Turtle and use/allow '@prefix'
> instead of, or in addition to, 'PREFIX'?
> You also have two ways of writing variables... so, why not?
>
>
> Section 4.1.1
>
> The single quote seems to be missing after the table in sec 4.1.1
> in "", or is this '"'?
>
> Section 4.1.4
>
> The form
>
> [ :p "v" ] .
>
> looks very awkward to me!
>
> I don't find the grammar snippet for ANON very helpful here without an
> explanation of what WS is... shouldn't that be a PropertyListNotEmpty
> instead?
>
>
> Section 5
>
> Section 5 is called Graph patterns and has only subsections
> 5.1 and 5.2 for basic and group patterns, whereas the other types are
> given separate top-level sections. This structure seems a bit
> illogical.
>
>
> Why the restriction that a blank node label can only be used in a single
> basic graph pattern? And if so, isn't the remark that the scope is the
> enclosing basic graph pattern redundant?
>
> Why is there a section about "extending basic graph pattern matching"
> here, when not even basic graph pattern matching has been properly
> introduced yet? If you only want to introduce informally what kind of
> matching you are talking about here, then I'd call section 5.1.2 simply
> "Basic Graph Pattern Matching", but I think I'd rather suggest dropping
> this section.
>
>
>
> "with one solution requiring no bindings for variables"
> -->
> rather:
> "with one solution producing no bindings for variables"
> or:
> "with one solution that does not bind any variables"
>
> Section 5.2.3
>
> Why do you have a separate examples subsection here? It seems
> superfluous/repetitive. Just put the last example, which seems to be the
> only new one, inside Sec 5.2.1 where it seems to fit, and drop the two
> redundant ones. For the first one, you could add "and that basic pattern
> consists of two triple patterns" to the first example in sec 5.2; for
> the second one, add the remark that "the FILTER does not break the basic
> graph pattern into two basic graph patterns" to the respective example
> in section 5.2.2.
>
>
>
> Section 6:
>
> One overall question which I didn't sort out completely so far:
> What if I mix OPTIONAL with FILTERs?
>
> ie.
>
> {A OPTIONAL B FILTER F OPTIONAL C}
>
> is that:
>
> {{A OPTIONAL B} FILTER F OPTIONAL C}
>
> or rather
>
> {{A OPTIONAL B FILTER F} OPTIONAL C}
>
> and: would it make a difference? I assume not: the filter is, in both
> cases, at the level of A, but I am not 100% sure. Maybe such an example
> would be nice to have...
>
>
> Another one about FILTERs: What about this one, ie. a FILTER which
> refers to the outside scope:
>
> ?x p o OPTIONAL { FILTER (?x != s) }
>
> concrete example:
>
> SELECT ?n ?m
> { ?x a foaf:Person . ?x foaf:name ?n .
> OPTIONAL { ?x foaf:mbox ?m FILTER (?n != "John Doe") } }
>
> This suppresses the email address for John Doe in the output!
> Note: This one is interesting, since the OPTIONAL part may NOT be
> evaluated separately, but carries over a binding from the super-pattern!
>
> Do you have such an example in the test suite? It seems that the last
> example in Section 12.2.2 goes in this direction; more on that later.
>
> Would it make sense to add some non-well-defined OPTIONAL patterns,
> following [Perez et al. 2006] in the document? As mentioned before, I
> didn't yet check section 12, maybe these corner case examples are there..
>
>
> Section 7:
>
> Why "unlike an OPTIONAL pattern"? This is comparing apples with pears...
> I don't see the motivation for this comparison, I would suggest to
> delete the part "unlike an OPTIONAL pattern".
>
>
> as described in Querying the Dataset
> -->
> as described in Section 8.3 "Querying the Dataset"
>
>
> Section 8
>
> The example in section 8.2.3 uses GRAPH although GRAPH hasn't been
> explained yet. Either remove this section or start section 8.3 before
> it; I think GRAPH should be introduced before giving an example using it.
>
> <you may ignore this comment>
> BTW: Would be cool to have a feature creating a merge from named graphs
> as well...
>
> ie. I can't have something like
> GRAPH g1
> GRAPH g2 { P }
>
> where the merge of g1 and g2 is taken for evaluating P.
> whereas I can do this at the top level by several FROM clauses.
> (Note this is rather a wish-list comment than a problem with the current
> spec, probably, might be difficult to define in combination with
> variables...)
> </you may ignore this comment>
>
> Section 8.2.3 makes more sense after the 8.3 examples, and 8.3.2 is
> simpler than 8.3.1, so, I'd suggest the order of subsections in 8.3
>
> 8.3.2
>
> 8.3.1
>
> 8.3.3
>
> 8.2.3
>
> 8.3.4 (note that this example somewhat overlaps with what is shown in
> 8.2.3 already, but fine to have both, i guess.)
>
>
>
> Section 9:
>
> What is "reduced" good for? I personally would tend to make reduced the
> default, and instead put a modifier "STRICT" or "WITHDUPLICATES" which
> enforces that ALL non-unique solutions are displayed.
>
> "Offset: control where the solutions start from in the overall solution
> sequence."
>
> maybe it would be nice to add: "[...] in the overall solution sequence,
> i.e., offset takes precedence over DISTINCT and REDUCED"
>
> at least, the formulation "in the overall solution sequence" would
> suggest this... however, right afterwards you say:
> "modifiers are applied in the order given by the list above"... this
> seems to somehow contradict the "in the overall solution sequence", so
> you should then modify this to:
> "in the overall solution sequence, after application of solution
> modifiers with higher precedence" and give an explicit precedence to
> each solution modifier....
>
> <you may ignore this comment>
> BTW: Why is the precedence of solution modifiers not simply the order in
> which they are given in a query? Wouldn't that be the simplest thing to do?
>
> ie.
>
> OFFSET 3
> DISTINCT
>
> would be different than
>
> DISTINCT
> OFFSET 3
>
> depending on the order.
> Anyway, if you want to (which you probably do) stick with what you have
> now, it would at least be easier to read if you'd take the suggestion
> with explicit precedence levels for each modifier.
> </you may ignore this comment>
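To illustrate why the precedence of OFFSET relative to DISTINCT matters, here is a minimal Python sketch. It models a solution sequence as a plain list of values; the helper names `distinct` and `offset` are hypothetical and only mimic the two modifiers, they are not taken from the draft.

```python
def distinct(solutions):
    """Drop duplicate solutions, keeping the first occurrence of each."""
    seen = set()
    out = []
    for s in solutions:
        if s not in seen:
            seen.add(s)
            out.append(s)
    return out


def offset(solutions, n):
    """Skip the first n solutions of the sequence."""
    return solutions[n:]


# A toy solution sequence containing duplicates (assumed data).
solutions = ["a", "a", "a", "b", "c"]

# DISTINCT applied first, then OFFSET 3: only three distinct
# solutions exist, so nothing is left after skipping three.
distinct_then_offset = offset(distinct(solutions), 3)

# OFFSET 3 applied first, then DISTINCT: the three duplicates of
# "a" are skipped and the remaining solutions survive.
offset_then_distinct = distinct(offset(solutions, 3))

print(distinct_then_offset)
print(offset_then_distinct)
```

Under these assumptions the two orders give different answers for the same input, which is why an explicit precedence level per modifier would make the text easier to follow.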
>
>
> Section 9.1
>
> The ORDER BY construct allows arbitrary constraints/expressions as
> parameter...ie. you could give an arbitrary constraint condition here,
> right? What is the order of that? TRUE > FALSE? Would be good to add a
> remark on that.
>
> I would put 'ASCENDING' and 'DESCENDING' in normal font, since they look
> like keywords here, but the respective keywords are ASC and DESC.
>
> Stupid Question: What is the "codepoint representation"? ... Since more
> people might be stupid, maybe a reference is in order.
>
>
> What is a "fixed, arbitrary order"??? Why not simply change
>
> "SPARQL provides a fixed, arbitrary order"
> -->
> "SPARQL fixes an order"
>
> and
>
> "This arbitrary order"
> -->
> "This order"
>
> I'd also move the sentence starting with "This order" after the
> enumeration.
>
>
> Note that, in the grammar for OrderCondition I think you could write it
> maybe shorter:
>
> Wouldn't simply
> orderCondition ::= ( 'ASC' | 'DESC' )? (Constraint | Var)
> do?
>
> In the paragraph above the grammar snippet, you forgot the ASK result
> form, where ORDER BY also doesn't play a role, correct?
>
> Sec 9.2:
>
> Add somewhere in the prose: "using the SELECT result form"...
>
> It is actually a bit weird that you mix select into the solution
> modifiers, IMO, it would be better to mention SELECT first in section 9
> and then introducing the solution modifiers.
>
> Sec 9.3:
>
> REDUCED also allows duplicates, or no? you mention before that reduced
> only *permits* elimination of *some* duplicates... so, delete the "or
> REDUCED" in the first sentence.
>
>
> Sec 9.4:
> As for reduced as mentioned earlier, my personal feeling is that
> REDUCED, or even DISTINCT should be the default, since it is less
> committing, and I'd on the contrary put an alternative keyword "STRICT"
> or "WITHDUPLICATES" which has the semantics that really ALL solutions
> with ALL duplicates are given. My personal feeling is that
> aggregates, which you mention in the "Warning" box, anyway only make
> sense in connection with DISTINCT. Or you should include a good example
> where not...
>
> Sec 9.5/9.6:
>
> OFFSET 0 has no effect, LIMIT 0 obviously makes no sense since the
> answer is always the empty solution set... So why not simply allow
> only positive integers for both? I see no benefit in allowing 0 at all.
>
> Section 10:
>
> "query form" or "result form"? I'd suggest using one of the two
> consistently and not switching. Personally, I'd prefer "result form"...
>
> Section 10.1
>
> As for the overall structure, it might make sense to have the whole
> section 10 before 9, since modifiers are anyway only important for
> SELECT, and then you could skip the part on projection in section 9, as
> SELECT is anyway not a solution modifier but a result form...
> You should call it also "projection" in section 10.1, ie. what I suggest
> is basically merging section 10.1 and 9.2.
>
>
> Section 10.2
>
> CONSTRUCT combines triples "by set union"?
> So, I need to eliminate duplicate triples if I want to implement
> CONSTRUCT in my SPARQL engine?
> Is this really what you wanted? In case of doubt, I'd suggest to
> remove "by set union", or respectively, analogously to SELECT,
> introduce a DISTINCT (or alternatively a WITHDUPLICATES)
> modifier for CONSTRUCT...
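To make the "set union" question concrete, here is a small Python sketch. It assumes a hypothetical CONSTRUCT template `?x rdf:type foaf:Person`, models triples as tuples, and uses made-up solutions; none of these names or data come from the draft.

```python
# Toy solutions for a query whose template is "?x rdf:type foaf:Person".
# Two solutions bind ?x to the same resource (assumed data).
solutions = [
    {"x": ":alice", "n": "Alice"},
    {"x": ":alice", "n": "Ally"},
    {"x": ":bob", "n": "Bob"},
]


def instantiate(solution):
    """Instantiate the (hypothetical) template for one solution."""
    return (solution["x"], "rdf:type", "foaf:Person")


# Plain concatenation keeps one triple per solution, duplicates included.
as_list = [instantiate(s) for s in solutions]

# Set union collapses the duplicate triple, which means the engine
# has to detect and eliminate duplicates.
as_set = set(as_list)

print(len(as_list), len(as_set))
```

In this sketch the concatenated output has one triple more than the set union, which is the extra duplicate-elimination work the comment above asks about.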
>
> BTW, I miss the semantics for CONSTRUCT given formally in Section 12.
>
>
> Section 10.2.1
>
> <you may ignore this comment>
> What if I want a single blank node connecting all solutions? That would
> be possible, if I could nest constructs in the FROM part...
> </you may ignore this comment>
>
>
> Section 10.2.3
>
> Hmm, now you use ORDER BY, whereas you state before in Section 9.1 that
> ORDER BY has no effect on CONSTRUCT... ah, I see, in combination with
> LIMIT!
> So, would it make sense, in order to emphasize what you mean, to change
> in section 9.1
>
> "Used in combination"
> -->
> "However, note that used in combination"
>
> 10.3/10.4
>
> I think that ASK should be mentioned before the informative DESCRIBE,
> thus I suggest to swap these two sections.
>
> Section 11
>
> - Any changes in the FILTER handling from the last version? Is there a
> changelog?
> - As mentioned earlier, I am a bit puzzled about the "evaluation" of
> Constraints given as an argument to ORDER BY especially since there you
> don't want to take the EBV but the actual value to order the solutions.
> (Note that what it means that a solution sequence "satisfies an order
> condition" is also not really formally defined in Section 12!)
>
> Apart from that, did not check the section in all detail again since it
> seems to be similar to the prev. version , but some comments still:
>
> "equivilence"?
> Do you mean equivalence? My dictionary doesn't know that word.
>
> The codepoint reference should already be given earlier, as mentioned
> above.
>
>
> Section 11.3.1
>
> The operator extensibility makes me a bit worried as for the
> nonmonotonic behavior of '! bound':
> In combination with '! bound', does it still hold that
> "SPARQL extensions" will produce at least the same solutions as an
> unextended implementation and may, for some queries, produce more
> solutions? I have an uneasy feeling here, though not substantiated by
> proof/counterexample.
>
>
> Section 12 :
>
> 12.1.1
>
>
> Is the necessity that the u_i's are distinct in the dataset really
> important?
> Why not also define the data corresponding to the respective URI as
> graph merge then, like the default graph?
>
>
> 12.2
>
> The two tables suggest there is a correlation between the patterns and
> modifiers appearing in the same line of the table, which is not the case.
>
> Also, why in the first table is RDF Terms and triple patterns in one
> line and not separate?
>
> Why do you write
> FILTER(Expression)
> but not
> ORDER BY (Expression)
> as the syntax suggests?
>
> Moreover, the tables should be numbered.
>
> You first use the abbreviation BGP for Basic graph pattern in the second
> table, where it hasn't been introduced. Actually, it would be more
> intuitive if you used real *symbols* for your algebra, e.g. the ones from
> traditional Relational Algebra, as was done in [Perez et al. 2006].
>
> "The result of converting such an abstract syntax tree is a SPARQL query
> that uses these symbols in the SPARQL algebra:"
> -->
> "The result of converting such an abstract syntax tree is a SPARQL query
> that uses the following symbols in the SPARQL algebra:"
> or maybe even better:
> "The result of converting such an abstract syntax tree is a SPARQL query
> that uses the symbols introduced in Table 2 in the SPARQL algebra:"
>
> What is "ToList"?
>
> 12.2.1
>
> The steps here refer to the grammar?
> The steps obviously take the parse tree nodes of the grammar as the
> basis...
> anyway this is neither explained nor entirely clear.
>
> then connected with 'UNION'
> -->
> connected with 'UNION'
>
> What do you mean by
>
> "We introduce the following symbols:"
>
> 1) what you define here is not 'symbols'
> 2) This doesn't seem to be a proper definition but just a bullet
> list without further explanation.
>
> as said before, the symbols, should indeed be symbols and be defined
> properly in section 12.2 with the tables, in my opinion.
>
> The algorithm for the transformation is a bit confusing, IMO. It seems
> to be pseudo-code for a recursive algorithm, but it is not clear where
> there are recursive calls.
>
> Is the observation correct that in this algebra (following the algorithm)
>
> A OPTIONAL {B FILTER F}
>
> would be the same as
>
> A FILTER F OPTIONAL {B}
>
> ???
>
> ie, both result in:
>
> LeftJoin(A,B,F)
>
> That is not necessarily intuitive in my opinion.
> Take the concrete example from above:
>
> SELECT ?n ?m
> { ?x a foaf:Person . ?x foaf:name ?n .
> OPTIONAL { ?x foaf:mbox ?m FILTER (?n != "John Doe") } }
>
> As I said, in my understanding, this query could be used to suppress
> email addresses for a particular name, whereas the algorithm suggests
> that this is just the same as writing:
>
> SELECT ?n ?m
> { ?x a foaf:Person . ?x foaf:name ?n . FILTER (?n != "John Doe")
> OPTIONAL { ?x foaf:mbox ?m } }
>
> Is this intended? If yes, the last example of section 12.2.2 is wrong.
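The difference between the two readings can be sketched in Python. This is only an illustration under assumptions: solutions are modelled as dicts, `left_join` is a hypothetical helper that keeps a left solution unextended when no compatible right solution passes the condition, and the data is made up. It is not claimed to be the draft's algorithm.

```python
def compatible(m1, m2):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(m2[k] == v for k, v in m1.items() if k in m2)


def left_join(left, right, cond):
    """Sketch of LeftJoin(left, right, cond) under the stated assumptions."""
    out = []
    for a in left:
        extended = [
            {**a, **b}
            for b in right
            if compatible(a, b) and cond({**a, **b})
        ]
        # Keep the left solution as-is when no extension qualifies.
        out.extend(extended if extended else [a])
    return out


# Assumed data: solutions of the person/name part and of the mbox part.
A = [{"x": "p1", "n": "John Doe"}, {"x": "p2", "n": "Jane Roe"}]
B = [
    {"x": "p1", "m": "mailto:john@example.org"},
    {"x": "p2", "m": "mailto:jane@example.org"},
]


def not_john(mu):
    """The FILTER condition ?n != "John Doe"."""
    return mu["n"] != "John Doe"


# Reading 1: the FILTER is the LeftJoin condition (filter inside OPTIONAL).
scoped = left_join(A, B, not_john)

# Reading 2: the FILTER is applied to A first, then an unconditional LeftJoin.
hoisted = left_join([a for a in A if not_john(a)], B, lambda mu: True)

print(scoped)
print(hoisted)
```

In this sketch the first reading keeps John Doe but without an mbox binding, while the second drops John Doe altogether, so the two query forms are not interchangeable under these assumptions.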
>
> BTW: If so, it seems that the whole second part of the algorithm can be
> simplified to:
--
Dr. Axel Polleres
email: axel@polleres.net url: http://www.polleres.net/
Received on Tuesday, 17 April 2007 17:43:02 UTC