- From: Dave Beckett <dave.beckett@bristol.ac.uk>
- Date: Mon, 4 Oct 2004 13:43:00 +0100
- To: andy.seaborne@hp.com
- Cc: Howard Katz <howardk@fatdog.com>, Steve Harris <S.W.Harris@ecs.soton.ac.uk>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
On Fri, 01 Oct 2004 17:28:35 +0100, "Seaborne, Andy" <andy.seaborne@hp.com> wrote:

> Dave, Steve, Howard,
>
> The SPARQL language doc is ready for review in preparation for the telcon next
> Tuesday. Version v1.73 (or later) of:
>
> http://www.w3.org/2001/sw/DataAccess/rq23/
>
> The intention is to publish this rough-and-ready version, complete with editors
> notes and comments. This will enable early review, showing where we are going
> and enable feedback from the more dedicated part of the community. There is
> still much to do in the document but we hope that this public working draft will
> indicate the directions we are taking even if the detail is still to be done.

This is my initial set of comments, not complete but requested by Eric as I had made it available to him as I drafted it. I may/will change my opinion on some items after I have a readthrough when I get to the end in a few more hours.

Dave

----

General:

A thorough spellcheck is needed.

Label all examples with Numbers, titles and add anchors.

Add all example queries, data files as separate files with URIs, link to them. Add them to the test suite.

Add labels and anchors to all definitions.

Do not use underlining in the html style when it isn't a link.

In query results, some of the tables use ?x and some use bare x. Some results use both! Must Fix

Consistency in use of individuals, sets of individuals
examples:
  b in B used ok
  T defined as a set and used as a member of that set, also defined as tp.
  T in GP should be tp in GP. See also MUSTFIX below


Editorial comments - should fix

Title:
SPARQL title does not mention protocol despite the 'P' in the name. Later on the document suggests that protocol is a separate document.

Abstract
typo: "end users [missing words] to write"

ToC
missing 4.3
8 "Chosing What to Query" to match document capitals
12.2 ditto
Appendices labelled 1,2 actually A, B in doc
suggest removing see also, old material. It's not ToC.

1 Introduction

MUSTFIX: First sentence is wrong.
The abstract syntax for RDF is not a "graph of nodes and arcs, often expressed as triples". It is a set of triples called an RDF graph formally defined in RDF semantics. It can be and is often described as a graph of nodes and arcs but RDF is not nodes+arcs; that was an RDF core decision closely argued.

I prefer graph "created dynamically" to "partly calculated on demand"

(un-numbered section) Document Outline
@@variables bound@@, @@bindings@@ can be linked to forward references
"10 - Summary" doesn't match the style of the other paragraphs - no explanation

2 Making Simple Patterns
last sentence: preference for "[Simple] patterns can be ..."

[All graph pictures are unreadable when printed out, too dark. Please re-compose on a light background or with much greater contrast. Black on gray doesn't work.]

First example.
I suggest not using _:1 _:2 since it's not legal in N3, Turtle, N-Triples for blank node labels. I think a small edit can make the first example executable, testable.

I'd prefer full names for variables, for ease of readability especially by non-native English speakers. So 'address' not 'addr' and something else instead of 'addrm'

2.1
P2 URIref: expand to URI Reference for first use. Or use the correct definition RDF URI Reference and link to it.

grammar - "XML. Qname" - delete the "."
Link to QName in XML specs.

datatype URIRef not URI

Para "Because.."
here and later I see "URIs used" - check for consistency. I suggest s/URI/URIref/ throughout

N3/Turtle used without a reference, explanation.

Spellings "intpretted"

Para "Prefixes are..."
referring to an earlier query, but it doesn't say which of the three previous it means. Suggest "same query as the previous one"

2.2 Triple Examples
P1 grammar s/for for/for/
P2 "bnodes" introduced without explanation. Should be "blank node labels" [ref RDF docs] abbreviated to BNodes. Doesn't say which positions bnodes can be used in.

Definition RDF Term
This implies that query variables are in the RDF data model since they appear along with U, L and BN. I suggest moving to another block since V is not used till later. Maybe after/near Query Variable?

Definition Query Variable
This defines an individual; all the RDF Term definitions are sets. No letter is assigned for referring to it. Suggest "A query variable qv". OR define the set Q.

Defn. Triple Pattern
(spelling, grammar) "A triple pattern is [a] triple of 3 slots subject, predicate, object .."

MUSTFIX: "union Q" <- Q is never defined. Q presumably is a set of Query Variables, in which case it is NOT Q, but a set of qv, or define Q as a set of qv.

This also defines 'ground' but that is not pulled out. Suggest making it a separate 'Definition: Ground' block.

Definition Binding
suggest use B for variable, as they are used uppercase elsewhere too.

Suggest giving an example for the convention for writing down a binding such as (f, "value") or ?f="value" or the tabular form
  ---------
  |  ?f   |
  ---------
  |"value"|
  ---------

Suggest giving an example of a set of bindings such as {?f="value", ?g="value2"} or the tabular form given later.

Definition A substitution
suggest uppercase "Substitution"

Suggest not using B as a set of Bindings, but use SB or something to differ from lowercase 'b' as an individual binding.

So this is a mapping S(set of b)

How can a set of bindings define a substitution?
Suggest rewording "A substitution S(B) on a set of bindings B maps a triple pattern ..."

suggest ... "by the corresponding [variable] value"

Suggest putting a subst() example.

Definition Triple Pattern Matching

MUSTFIX:
I think there is a triple pattern/set of triple pattern issue here unless you are solely comparing a graph with one triple.

T was earlier defined as a set of triple patterns. So subst(T, b in B) is not a substitution of a triple pattern, but of a set of triple patterns (and a binding b in B).
Could re-use tp in T which was used in defining ground, and define subst(tp in T, b in B). Then edit to match such as 'Triple Pattern tp matches ...'

Use of entails, reference/link to RDF entailment.

rdfs: prefix is used in the second data, this was not defined as convention earlier. brql/sparql predefines rdf: but not rdfs:?

2.3 Graph patterns
P1 "There are bNodes" No, there is 1.

grammar: "not in the RDF graph [nor in] any query"

Para "The next query.." but there is no query following. Confused. Does that mean the query just given?

Also grammar:
"one or more triple patterns which must all match for the graph pattern to match."
- the 'all' and 'one or more' say different things. Is it all or 1? Maybe the definition following explains better, remove?

Definition: Graph pattern
MUSTFIX:
"A conjunctive Graph Pattern GP is a set of triple patterns T."
T was earlier defined as:
"let T be the set of triple patterns := A x A x A"
So GP=T ? Not quite what was meant. GP is set of tp, where tp is a Triple Pattern in T.

Maybe triple pattern & triple patterns are too hard to use and make nice sentences. Other suggestions: triple pattern set.

Defn: Graph Pattern - Conjunction
Defines "conjunctive Graph Pattern" not the title of the definition.
html - underlining doesn't match too

Defn: Graph pattern Matching
Hmm, confused by "same" in:
"For a graph pattern to match, each triple pattern must match with each query variable having the same value whereever it occurs."
suggestions:
"For a graph pattern GP to match, all triple patterns tp in GP must match with all query variables in all tp having the same value."

This actually defines "Graph Pattern GP matches", not "Graph Pattern Matching"

Using T in GP which is a (set of triple patterns). Probably should be tp in GP.

MUSTFIX:
[[
For all T in GP, subst(T, B) is a triple entailed by G.
subst(GP, B) is the graph pattern formed by subst(T, B) for all T in GP.
subst(GP, B) is a subgraph entailed by G if all triple patterns are grounded.
]]

This is reusing subst(t in TP, b in B) redefined over graphs. I suggest changing the name to graphsubst(GP, B) to distinguish it.

subst(T in TP, b in B) returns a triple pattern, may not be ground.

Suggestion:
  For all tp in GP, subst(tp, B) is a triple pattern entailed by G.
  graphsubst(GP, B) is the graph pattern formed by subst(tp, B) for all tp in GP.
  graphsubst(GP, B) is a subgraph entailed by G if all triple patterns are grounded.

2.4 Multiple Matches

"The results of query are all the ways a query can match the graph being queried. Each match is one solution to the query and there may be zero, one or multiple solutions to a query, depending on the data."

This uses "results", "solutions" and "matches", not in the same way as previously defined. I suggest using results only, and use match to mean graph matches, triple matches as used above:

"2.4 Multiple results

The results of query are all the ways a query can match the graph being queried. Each result is one solution to the query and there may be zero, one or multiple results to a query, depending on the data."

Aside: A query actually hasn't been defined yet. It's hinted that it is something to do with graph pattern, but it hasn't been said so far. i.e. no.

Or if sticking with "matching" make it clearer what the difference between a result and a solution is.

Example query has commas between variables. Die.

"When the query can match the data in more than one way, each possibility is returned as a solution to the query. In addition, we have more than one selected variable so each solution contains two bindings of variables to values."

so now there are results, query matches, solutions and possibilities :)

Query matching data hasn't been discussed. Graph patterns matching Graphs has been, could be reused. Could also refer to sets of bindings.

... and now Query Solution is given.

definition Query Solution:
"For conjucntion graph pattern GP, subst(GP, B), has no variables."
spelling: conjunction.
Also could add ".. and is a set of ground triple patterns" or possibly define a Ground Graph Pattern.

3 Constraining Values

(Here the query uses selected variables without a comma)

Definition: Value Constraint
"A value constraint is a boolean expression that can be applied to restrict graph pattern solutions."

For me that doesn't read as an expression that can refer to non-boolean things as parts of the expression but which has a boolean value.

Definition: Query Stage (partial definition).
"Graph Pattern (set of triple patterns) + set of Value Constraints.
 QS : GP x VR"

+ and x ? + doesn't mean addition here but...? You cannot join/merge a set of triple patterns and a set of value constraints.

spelling in comment: [[ operations [like] "source" ]]

I prefer Query Block.

4 Including Optional Values

....

Review to continue from here ...
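
To illustrate the subst()/graphsubst() split suggested under 2.2 and 2.3, here is a minimal sketch in Python. It assumes a triple pattern is a 3-tuple of strings and a query variable is any string starting with '?'. The function names other than subst and graphsubst, the representation, and the sample data are illustrative assumptions and are not taken from the draft.

```python
# Hypothetical illustration only: the representation and helper names
# are assumptions made for this sketch, not definitions from the draft.

def is_variable(term):
    """Treat any string starting with '?' as a query variable (assumption)."""
    return isinstance(term, str) and term.startswith("?")


def subst(tp, bindings):
    """Replace each bound variable in ONE triple pattern tp by its value.

    Unbound variables are left in place, so the result is a triple
    pattern that is not necessarily ground.
    """
    return tuple(bindings.get(term, term) if is_variable(term) else term
                 for term in tp)


def graphsubst(gp, bindings):
    """Apply subst() to every triple pattern tp in the graph pattern gp."""
    return {subst(tp, bindings) for tp in gp}


def is_ground(gp):
    """A graph pattern is ground when no triple pattern contains a variable."""
    return all(not is_variable(term) for tp in gp for term in tp)


# A set of bindings, written as a mapping from variable to value.
bindings = {"?x": "_:a", "?name": '"Alice"'}

# A graph pattern GP: a set of triple patterns tp.
gp = {
    ("?x", "foaf:name", "?name"),
    ("?x", "foaf:mbox", "?mbox"),
}

result = graphsubst(gp, bindings)
```

Here ?mbox has no binding, so result still contains a variable and is_ground(result) is false. That is the case the wording "if all triple patterns are grounded" has to cover.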
Received on Monday, 4 October 2004 12:45:07 UTC