- From: Howard Katz <howardk@fatdog.com>
- Date: Mon, 4 Oct 2004 08:30:17 -0700
- To: "Dave Beckett" <dave.beckett@bristol.ac.uk>, "Dan Connolly" <connolly@w3.org>
- Cc: "Andy Seaborne" <andy.seaborne@hp.com>, "Steve Harris" <S.W.Harris@ecs.soton.ac.uk>, "RDF Data Access Working Group" <public-rdf-dawg@w3.org>
Hi all,

I'll be spending time with the document today and getting comments back to you later today.

Ta,
Howard

> -----Original Message-----
> From: Dave Beckett [mailto:dave.beckett@bristol.ac.uk]
> Sent: Monday, October 04, 2004 7:14 AM
> To: Dan Connolly
> Cc: Andy Seaborne; Dave Beckett; Howard Katz; Steve Harris; RDF Data
> Access Working Group
> Subject: Re: SPARQL / Language spec ready for review
>
>
> Reviewing
>
> http://www.w3.org/2001/sw/DataAccess/rq23/
>
> $Log: Overview.html,v $
> Revision 1.77 2004/10/03 13:06:28 eric
>
> Completed. I'll now take a look at 1.77->1.79 changes
>
>
> Dave
>
>
> Items that I think must be fixed before publication
> ---------------------------------------------------
>
> See also MUSTFIX in detailed notes below. Summarising:
>
> * First sentence in 1. Introduction is wrong. RDF is a set of triples.
>
> * Consistency in use of individuals, sets of individuals examples:
>   b in B used ok however T defined as a set and used as a member of
>   that set, also defined as tp. T in GP should be tp in GP.
>
>   See comments on definitions of Triple Pattern, Triple Pattern
>   Matching, Graph Pattern, Graph Pattern Matching
>
> * Initial Binding definition baffles me, I need more explanation.
>
>
> General Comments
> ----------------
>
> A thorough spellcheck is needed.
>
> Label all examples with Numbers, titles and add anchors.
> Add all example queries, data files as separate files with URIs, link
> to them. Add them to the test suite.
> Add labels and anchors to all definitions.
>
> Do not use underlining in the html style when it isn't a link.
>
> In query results, some of the tables use ?x and some use bare x.
> Some results use both!
>
> Suggest global s/<tt>OPTIONAL</tt>/optional/ since the OPTIONAL
> keyword is never explained in the document and only appears in the
> grammar.
>
>
> Detailed Comments
> ------------------
> These should be fixed but are not critical.
>
>
> Title: SPARQL title does not mention protocol despite the 'P' in the
> name.
>
> Later on the document suggests that protocol is a separate document.
>
>
> Abstract
> typo: "end users [missing words] to write"
>
> ToC
> missing 4.3
> 8 "Chosing What to Query" to match document capitals
> 12.2 ditto
> Appendices labelled 1,2 actually A, B in doc
>
> suggest removing see also, old material. It's not ToC.
>
>
> 1 Introduction
>
> MUSTFIX: First sentence is wrong.
>
> The abstract syntax for RDF is not a "graph of nodes and arcs, often
> expressed as triples". It is a set of triples called an RDF graph
> formally defined in RDF semantics. It can be and is often described
> as a graph of nodes and arcs but RDF is not nodes+arcs; that was an
> RDF core decision closely argued.
>
> preference to graph "created dynamically" than "partly calculated
> on demand"
>
>
> (un-numbered section) Document Outline
> @@variables bound@@, @@bindings@@ can be linked to forward references
>
> "10 - Summary" doesn't match the style of the other paragraphs - no
> explanation
>
>
> 2 Making Simple Patterns
> last sentence preference to "[Simple] patterns can be ..."
>
> [All graph pictures are unreadable when printed out, too dark.
> Please re-compose on a light background or with much greater
> contrast. black on gray doesn't work.]
>
> First example. I suggest not using _:1 _:2 since it's not legal in
> N3, Turtle, N-Triples for blank node labels. I think a small edit
> can make the first example executable, testable.
>
> I'd prefer full names for variables, for ease of readability
> especially by non-native english speakers. So 'address' not 'addr'
> and something else instead of 'addrm'
>
>
> 2.1
> P2
> URIref expand to URI Reference for first use. Or use the
> correct definition RDF URI Reference and link to it.
> grammar - "XML. Qname" - delete the "."
> Link to QName in XML specs.
> datatype URIRef not URI
>
> Para "Because.."
> here and later I see "URIs used" - check for consistency. I suggest
> s/URI/URIref/ throughout
>
> N3/Turtle used without a reference, explanation.
>
> Spellings "intpretted"
>
> Para "Prefixes are..."
> referring to an earlier query, but it doesn't say which of the three
> previous it means. Suggest "same query as the previous one"
>
>
> 2.2 Triple Examples
>
> P1 grammar s/for for/for/
>
> P2 "bnodes" introduced without explanation. Should
> be "blank node labels" [ref RDF docs] abbreviated to BNodes.
> Doesn't say which positions that bnodes can be used in.
>
>
> Definition RDF Term
>
> This implies that query variables are in the RDF data model since
> they are along with U, L and BN. I suggest moving to another
> block since V is not used till later. Maybe after/near Query Variable?
>
> Definition Query Variable
> This defines an individual, all the RDF Term definitions are sets.
> No letter is assigned to typically use it.
> Suggest "A query variable qv". OR define the set Q.
>
> Defn. Triple Pattern
> (spelling, grammar)
> "A triple pattern is [a] triple of 3 slots subject, predicate, object .."
>
> MUSTFIX: "union Q" <- Q is never defined. Q presumably is a set of
> Query Variables, in which case it is NOT Q, but a set of qv, or
> define Q as a set of qv.
>
> This also defines 'ground' but that is not pulled out. Suggest
> make it a separate 'Definition: Ground' block.
>
>
> Definition Binding
> suggest use B for variable, as they are used uppercase elsewhere too.
> Suggest give an example for the convention for writing down a binding
> such as (f, "value") or ?f="value" or the tabular form
> ---------
> |  ?f   |
> ---------
> |"value"|
> ---------
>
> Suggest give an example of a set of bindings such as
> {?f="value", ?g="value2"} or the tabular form given later.
>
> Definition A substitution
> suggest uppercase "Substitution"
> Suggest not using B as a set of Bindings, but use SB or something
> to differ from lowercase 'b' as an individual binding.
> So this is a mapping S(set of b)
>
> How can a set of bindings define a substitution?
> Suggest rewording
> "A substitution S(B) on a set of bindings B maps a triple pattern ..."
> suggest ... "by the corresponding [variable] value"
>
> Suggest putting a subst() example.
>
>
> Definition Triple Pattern Matching
>
> MUSTFIX: I think there is a triple pattern/set of triple pattern
> issue here unless you are solely comparing a graph with one triple.
>
> T was earlier defined as a set of triple pattern. So subst(T, b in
> B) is not a substitution of a triple pattern, but of a set of
> triple patterns (and a binding b in B). Could re-use tp in T which
> was used in defining ground, and define subst(tp in T, b in B).
> Then edit to match such as 'Triple Pattern tp matches ...'
>
> Use of entails, reference/link to RDF entailment.
>
> rdfs: prefix is used in the second data, this was not defined as
> convention earlier. brql/sparql predefines rdf: but not rdfs:?
>
>
> 2.3 Graph patterns
>
> P1 "There are bNodes" No, there is 1.
> grammar: "not in the RDF graph [nor in] any query"
>
> Para "The next query.." but there is no query following. Confused.
> Does that mean the query just given
> Also grammar:
> "one or more triple patterns which must all match for the graph
> pattern to match."
> - the 'all' and 'one or more' say different things. Is it all or 1?
>
> Maybe the definition following explains better, remove?
>
>
> Definition: Graph pattern
>
> MUSTFIX:
> "A conjunctive Graph Pattern GP is a set of triple patterns T."
>
> T was earlier defined as;
> "let T be the set of triple patterns := A x A x A"
>
> So GP=T ?
>
> Not quite what was meant. GP is set of tp, where
> tp is a Triple Pattern in T.
>
> Maybe triple pattern & triple patterns are too hard to use and make
> nice sentences. Other suggestions ; triple pattern set.
>
>
> Defn: Graph Pattern - Conjunction
>
> Defines "conjunctive Graph Pattern" not the title of the definition.
> html - underlining doesn't match too
>
>
> Defn: Graph pattern Matching
>
> Hmm, confused by "same" in:
> "For a graph pattern to match, each triple pattern must match with
> each query variable having the same value whereever it occurs."
>
> suggestions
>
> "For a graph pattern GP to match, all triple patterns tp in GP must
> match with all query variables in all tp having the same value."
>
> This actually defines "Graph Pattern GP matches", not
> "Graph Pattern Matching"
>
> Using T in GP which is a (set of triple patterns). Probably should
> be tp in GP.
>
> MUSTFIX:
> [[
> For all T in GP, subst(T, B) is a triple entailed by G.
> subst(GP, B) is the graph pattern formed by subst(T, B) for all T in GP.
> subst(GP, B) is a subgraph entailed by G if all triple patterns
> are grounded.
> ]]
>
> This is reusing subst(t in TP, b in B) redefined over graphs
> I suggest changing the name to graphsubst(GP, B) to distinguish it.
> subst(T in TP, b in B) returns a triple pattern, may not be ground.
>
> Suggestion:
> For all tp in GP, subst(tp, B) is a triple pattern entailed by G.
> graphsubst(GP, B) is the graph pattern formed by subst(tp, B) for
> all tp in GP.
> graphsubst(GP, B) is a subgraph entailed by G if all triple
> patterns are grounded.
>
>
> 2.4 Multiple Matches
>
> "The results of query are all the ways a query can match the graph
> being queried. Each match is one solution to the query and there
> may be zero, one or multiple solutions to a query, depending on the
> data."
>
> This uses "results", "solutions" and "matches", not in the same way
> as previously defined. I suggest using results only, and use match
> to mean graph matches, triple matches as used above:
>
> "2.4 Multiple results
>
> The results of query are all the ways a query can match the graph
> being queried. Each result is one solution to the query and there
> may be zero, one or multiple results to a query, depending on the
> data."
>
> Aside: A query actually hasn't been defined yet.
> It's hinted that it is something to do with graph pattern, but it
> hasn't been said so far. i.e. no.
>
> Or if sticking with "matching" make it clearer what the difference
> between a result and a solution is.
>
> Example query has commas between variables. Die.
>
> "When the query can match the data in more than one way, each
> possibility is returned as a solution to the query. In addition, we
> have more than one selected variable so each solution contains two
> bindings of variables to values."
>
> so now there are results, query matches, solutions and possibilities :)
> Query matching data hasn't been discussed. Graph patterns matching
> Graphs has been, could be reused. Could also refer to sets of bindings.
>
> ... and now Query Solution is given.
>
> definition Query Solution:
> "For conjucntion graph pattern GP, subst(GP, B), has no variables."
> spelling: conjunction.
> Also could add ".. and is a set of ground triple patterns" or possibly
> define a Ground Graph Pattern.
>
>
> 3 Constraining Values
>
> (Here the query uses selected variables without a comma)
>
>
> Definition: Value Constraint
> "A value constraint is a boolean expression that can be applied to
> restrict graph pattern solutions."
> For me that doesn't read as an expression that can refer to
> non-boolean things as parts of the expression but which has a boolean
> value.
>
>
> Definition: Query Stage (partial definition).
>
> "Graph Pattern (set of triple patterns) + set of Value
> Constraints. QS : GP x VR"
>
> + and x ? + doesn't mean addition here but...? You cannot
> join/merge a set of triple patterns and a set of value constraints.
>
> VR is not defined. Presumably means a set of value constraints.
> Later on VC seems to be used for that.
>
> spelling in comment: [[ operations [like] "source" ]]
>
> I prefer Query Block.
>
>
> 4 Including Optional Values
>
> grammar
> "The graph matching and value constraints [presented] so far ..."
>
> [here select vars have no commas]
>
> html/spelling "there is [an] mbox" - make mbox <tt> too, like in
> previous para
>
> "Failure to match does not ..."
> suggest
> "failure to match any of the triples in the optional block does not ..."
>
> spelling "optional block" not bock
>
>
> 4.2 Multiple Optional Blocks
>
> "Multiple OPTIONAL blocks "
> so far the OPTIONAL keyword has not been mentioned, and indeed it is
> not given in this section either. Suggest s/<tt>OPTIONAL</tt>/optional/
> in 4.2
>
> The constraints on variables seem to allow the same optional variable
> to be bound in different nested optional blocks, as long as they are
> not at the "given level of nesting" or "in the same containing block".
>
> Those two constraints seem to clash or at least constrain it in two
> ways of which I'm not sure is complete. Level of nesting presumably
> doesn't mean, anywhere inside 2 []s.
>
> How about these:
>
> Graph Pattern 1:
> ( ?q :a :a )
> [ ( ?q :b ?x) ]
> [ ( ?q :b ?y) ]
> [ ( ?q :b ?x) ] <- same level of nesting, same containing block FORBIDDEN
>
> Graph Pattern 2:
> ( ?q :a :a )
> [
>   [ ( ?q :b ?x) ]
>   [ ( ?q :b ?y) ]
> ]
> [ ( ?q :b ?x) ] <- different level of nesting, containing block, allowed?
>
>
>
> 4.3 Optional Matching
>
> Definition: Initial Binding
>
> "The result of a query stage,QS = (GP, VC), with an initial binding
> B, has Query Result where all the bindings in B are valid (the graph
> pattern and any value constraints in QS).
>
> B extended with addition bindings given by matching subst(GP, B)
> and constraining with VC."
>
> VC is used here, never defined. Presumably refers to Value Constraint
> However Query Stage was earlier (partially) defined as QS: GP x VR
>
> grammar: "has [a] Query Result", "B [is] extended with addition[al] .."
>
> MUSTFIX: More substantially; after several re-readings, I don't
> understand this definition. Can I ask for some more explanation
> please?
>
>
> Definition: Graph Pattern - Optional Match
>
> "An optional match of QS, with initial binding B, the match of QS
> with initial binding B if there exists at least one solution, and is
> B otherwise."
>
> grammar: "binding B, [is] the match..."
>
> That seems to define an optional match of a query stage, not of a
> graph pattern. Is the definition title correct?
>
>
> 5 Nested Patterns
>
> Nesting was already mentioned in 4.2.
>
> Definition: Graph Pattern - Nesting
>
> This definition I note, excludes nested VC - good!
>
> The example query uses ()s for nesting (you should mention it before the
> example what the extra ones are for (which is like lisp (like this)))
>
> "Since this definition makes a inner pattern just be a conjunctive
> element of the outher pattern, and because a graph patterns of
> triple patterns is also the conjunction, this is the same as:"
>
> spelling: outher=>outer
> grammar: "because [] graph patterns of [graph] patterns [are] also []
> conjunctions ..."
>
>
> "Optional blocks can be nested. The outer optional must match for any
> inner ones to apply. That is, the outer optional triple patterns is
> fixed for the purposes of any inner optional block."
>
> s/triple patterns/graph pattern/
> grammar: "optional [block]"
> Let me use that to read from:
> "Optional blocks can be nested. The outer optional block must match
> for any inner ones to apply. That is, the outer optional graph
> pattern is fixed for the purposes of any inner optional block."
>
> So it means, using nested optional patterns are essentially
> subqueries where the outer optional graph pattern is used as a
> must-match graph pattern and the inner optional blocks relative to
> that as optional graph patterns
>
> Query result has typo in gname result #3: "EveE" should be "Eve"
>
> grammar: "... query only access[es] these ..."
>
> This example does hint at the usefulness of the nested patterns
> however I think the details of the operation and restrictions on
> binding with optionals are incomplete. Maybe add more words to the
> intro status for this section re completeness.
>
>
> Sections 6-7: Placeholders
> Not reviewed
>
>
> 8 Choosing What to Query
>
> Definition: Target graph
>
> "The target graph of a query."
>
> Ok, this must be a sketch. Especially with the current discussion of
> graphs. Maybe expand a little, "... to which a query may be applied".
> I recall that we discussed these words and ended up pruning them.
>
> [[
> SELECT ...
> FROM <uri1>, <uri2>
> ]]
>
> Commas, die
>
> grammar: "Implementations [may] provide "
>
>
> 9 Querying the Origin of Statements
>
> The status here probably needs expanding to "under discussion and
> will change"
>
> "the following term."
> I guess "term" should be triple pattern or nested graph pattern?
> Those are the two choices I think.
>
>
> Note " As with OPTIONAL, a variable that is bound to NULL must not
> match another variable that is bound to NULL. "
>
> seems to be worthy of being in the body rather than parenthetical to
> the main text.
>
> Can you delete the red text? All that was notes from 2 FTFs in the
> past, we've discussed a lot more things since then and have an issues
> list to track things too.
>
>
> 10 Summary of Query Patterns
>
> Link to the definitions of all the terms here
>
> Suggest you use QP for query pattern rather than GP - confuses with
> graph pattern.
>
> I don't think it's possible to apply the term 'matches' to
> all the elements given here. match is only defined for triple
> patterns and graph patterns.
>
> Could just add a status note to this section that it is initial
> draft.
>
>
> 11 Query Forms
>
> + status note?
>
> "These result forms use the bindings in the query results to form
> result sets or RDF graphs."
>
> what's a result set? there are Query Results (set of bindings)
> and Query Solution.
> This is the first mention of result set.
> Is it not a set of solutions?
>
>
> spelling: "Returns either [an] RDF graph that ..."
>
>
> 11.1 Choosing which Variables to Return
> SELECT DISTINCT
>
> "The result set can be modified by adding the DISTINCT keyword which
> ensures that every set of variables for a query solution is different
> from the other sets of variables returned. Thought of as a table,
> each row is differen"
>
> "set of variables" should be Query Result; it's the variable names
> and values that matter (Bindings)
>
>
> 11.2 Constructing an Output Graph
>
> "If no pattern is supplied, instead "*" is used,"
>
> s/pattern/graph template/
>
> That might be better as
> "*" indicates an empty graph template is supplied.
>
> however that isn't quite right, as when an empty graph template is
> used, the variables are instead substituted into the query pattern.
> So maybe should be
> "*" indicates that the graph template is the query pattern.
> 2 paragraphs later, this is spelt out in more detail.
>
> "... each matching of the query pattern."
> => each solution?
>
> "The form CONSTRUCT * WHERE {query pattern} is shorthand for
> CONSTRUCT {pattern} WHERE {pattern}, that is, the query pattern is
> the same as the construct pattern.
>
> Consistency here and elsewhere in 11.2 - use of graph template and
> construct pattern for the same thing.
>
> WHERE {.. }s should be real examples and not using {}s
>
> Prefer re-ordering to:
> "... signifies the construct pattern[graph template] is the
> query pattern"
>
>
> 11.3 Descriptions of Resources
> placeholder text.
>
> syntax - n3 needs adding proper example namespace URIs
>
>
> 11.4 Asking "yes or no" questions
>
> Add a Query Result with either YES or NO suggested format
>
>
> 12 Testing Values
> placeholder text.
>
>
> 12.2 Extending Value Testing
> placeholder text.
>
>
>
> A. SPARQL Grammar
>
> Some of my previous comments in [1] still apply such as:
> * Die CommaOpt
> * Use FOO+ not FOO FOO?
>   for one or more
> * OPTIONAL keyword
> * A ::= B with only one use of A (all non-terminals) should be inlined
> * E/BNF used has no reference. Preference to XML's
>
> Additional:
>
> What does SOURCE * mean ?
>
> Add some comments to say why NCCAME, NCCHAR1 is done like this.
> Pattern Literal needs expanding too
>
> No idea what (~[">"," "])* means without consulting some EBNF
> documentation; where's that from? complement of set?
>
>
> B. References
>
> W3C style fixes needed - expanding to have URIs, latest versions,
> dates, organisations.
>
> Check they are cited in the document
>
>
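
The review asks for a subst() example and suggests separating subst(tp, B) on a single triple pattern from graphsubst(GP, B) on a graph pattern. A minimal Python sketch of that distinction might look like the following. It is only an illustration under stated assumptions, not anything taken from the draft: terms are plain strings, a variable is any string starting with "?", and "entailed by G" is simplified to plain membership in the list of triples G. The helper names is_var, is_ground, match_triple and match_graph are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumed encoding (not from the draft): every term is a string and
# a query variable is a string that starts with "?".
Term = str
TriplePattern = Tuple[Term, Term, Term]
Bindings = Dict[str, Term]  # variable -> value


def is_var(term: Term) -> bool:
    return term.startswith("?")


def is_ground(tp: TriplePattern) -> bool:
    """A triple pattern is ground when it contains no variables."""
    return not any(is_var(t) for t in tp)


def subst(tp: TriplePattern, b: Bindings) -> TriplePattern:
    """Substitute bound variables in ONE triple pattern; the result may not be ground."""
    s, p, o = (b.get(t, t) if is_var(t) else t for t in tp)
    return (s, p, o)


def graphsubst(gp: List[TriplePattern], b: Bindings) -> List[TriplePattern]:
    """Apply subst to every triple pattern tp in the graph pattern gp."""
    return [subst(tp, b) for tp in gp]


def match_triple(tp: TriplePattern, triple: TriplePattern,
                 b: Bindings) -> Optional[Bindings]:
    """Extend b so that subst(tp, b) equals triple, or return None."""
    new = dict(b)
    for pattern_term, value in zip(tp, triple):
        if is_var(pattern_term):
            # A variable must keep the same value wherever it occurs.
            if new.setdefault(pattern_term, value) != value:
                return None
        elif pattern_term != value:
            return None
    return new


def match_graph(gp: List[TriplePattern], graph: List[TriplePattern],
                b: Optional[Bindings] = None) -> List[Bindings]:
    """Return every set of bindings under which all tp in gp match the graph."""
    b = {} if b is None else b
    if not gp:
        return [b]
    results: List[Bindings] = []
    for triple in graph:
        extended = match_triple(gp[0], triple, b)
        if extended is not None:
            results.extend(match_graph(gp[1:], graph, extended))
    return results
```

With G = [("a", "knows", "b"), ("b", "knows", "c")] and GP = [("?x", "knows", "?y"), ("?y", "knows", "?z")], match_graph(GP, G) yields a single set of bindings, and graphsubst(GP, B) for that B is a list of ground triples that all occur in G. Real RDF entailment is richer than the membership test assumed here.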
Received on Monday, 4 October 2004 15:28:41 UTC