- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 5 Oct 2004 02:42:14 -0400
- To: Dave Beckett <dave.beckett@bristol.ac.uk>
- Cc: Dan Connolly <connolly@w3.org>, Andy Seaborne <andy.seaborne@hp.com>, Howard Katz <howardk@fatdog.com>, Steve Harris <S.W.Harris@ecs.soton.ac.uk>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
- Message-ID: <20041005064214.GC20897@w3.org>
Thanks for the thorough read, Dave.
This branched thread is frustrating. I guess the burden is on me to
integrate my earlier responses, but I won't do it again if the thread
splits again.
On Mon, Oct 04, 2004 at 03:14:03PM +0100, Dave Beckett wrote:
>
> Reviewing
>
> http://www.w3.org/2001/sw/DataAccess/rq23/
>
> $Log: Overview.html,v $
> Revision 1.77 2004/10/03 13:06:28 eric
>
> Completed. I'll now take a look at 1.77->1.79 changes
>
>
> Dave
>
>
> Items that I think must be fixed before publication
> ---------------------------------------------------
>
> See also MUSTFIX in detailed notes below. Summarising:
>
> * First sentence in 1. Introduction is wrong. RDF is a set of triples.
addressed 112 lines below
> * Consistency in use of individuals, sets of individuals examples:
> b in B used ok however T defined as a set and used as a member of
> that set, also defined as tp. T in GP should be tp in GP.
>
> See comments on definitions of Triple Pattern, Triple Pattern
> Matching, Graph Pattern, Graph Pattern Matching
>
> * Initial Binding definition baffles me, I need more explanation.
Nack. I'm leaving the formal definition issues to Andy
> General Comments
> ----------------
>
> A thorough spellcheck is needed.
Will do with validation in the next step.
> Label all examples with Numbers, titles and add anchors.
> Add all example queries, data files as separate files with URIs, link
> to them. Add them to the test suite.
> Add labels and anchors to all definitions.
1.78:
I haven't labeled any yet, but I copied the exampleOuter /
exampleInner template from RDF Primer (and XQuery). There is a wad of
disabled style associated with this. Andy, you can enable it to check
it out -- it puts cool boxes around stuff. My macro grabbed the class
from the inner
<pre class="query|query todo?/>
so it will be easy to make the outer boxes the right color.
The RDF Primer sometimes has
<div class="exampleOuter exampleInner">
and sometimes has
<div class="exampleOuter">
<div class="c1">
<a id="example20" name="example20">Example 20: ...
using <code>rdf:ID</code></a>
</div>
<div class="exampleInner">
I haven't worked to figure out the criteria for what needs an
anchor and title.
> Do not use underlining in the html style when it isn't a link.
1.78:
I'm not sure that's such a problem, but I changed the name of the
style from "underline" to "definedTerm" and changed the style to
text-decoration : underline
> In query results, some of the tables use ?x and some use bare x.
> Some results use both!
1.79:
changed all the x
> Suggest global s/<tt>OPTIONAL</tt>/optional/ since the OPTIONAL
> keyword is never explained in the document and only appears in the
> grammar.
1.82
I added text showing the two alternatives.
> Detailed Comments
> ------------------
> These should be fixed but are not critical.
>
>
> Title: SPARQL title does not mention protocol despite the 'P' in the
> name.
>
> Later on the document suggests that protocol is a separate document.
1.81:
title now "SPARQL Query Language for RDF"
> Abstract
> typo: "end users [missing words] to write"
1.79:
now: "end users a way to write"
figured it shouldn't be off-puttingly formal
> ToC
> missing 4.3
1.79:
added
> 8 "Chosing What to Query" to match document capitals
1.79:
fixed
> 12.2 ditto
1.79:
fixed 12.1 and 12.2
> Appendices labelled 1,2 actually A, B in doc
1.79:
now A, B
> suggest removing see also, old material. It's not ToC.
>
>
> 1 Introduction
>
> MUSTFIX: First sentence is wrong.
>
> The abstract syntax for RDF is not a "graph of nodes and arcs, often
> expressed as triples". It is a set of triples called an RDF graph
> formally defined in RDF semantics. It can be and is often described
> as a graph of nodes and arcs but RDF is not nodes+arcs; that was an
> RDF core decision closely argued.
1.82:
[[http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-data-model
The underlying structure of any expression in RDF is a collection of
triples, each consisting of a subject, a predicate and an object. A
set of such triples is called an RDF graph (defined more formally in
section 6). This can be illustrated by a node and directed-arc
diagram, in which each triple is represented as a node-arc-node link
(hence the term "graph").
]]
An RDF graph is a set of statements, each consisting of a subject, an
object, and a relationship between them <a href="#ref12">[12]</a>.
The graph may be real (materialized), where there is a document that
is the serialization of the graph or an RDF database containing the
statements, it may be a graph that is partly calculated on demand
giving the inference closure, or it may be an RDF representation of a
legacy database.
Liberties I took for approachability: (Pat, can you check these?)
s/abstract syntax for RDF/RDF data/
s/collection of triples/set of triples/ -- is it really *not* a set?
s/triple/statement/ -- does it read better with "triple"? does it
reach the same audience?
I didn't tie an RDF graph to the more introductory notion of RDF
data. Any ideas out there?
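To make the set-versus-collection question concrete, here is a minimal Python sketch. It assumes a graph is modelled as a set of (subject, predicate, object) tuples; the helper name make_graph and the example.org data are made up for illustration and appear in no spec.

```python
# Sketch only: an RDF graph modelled as a set of (subject, predicate, object)
# tuples. The example.org URIs and the literal are invented illustration data.
FOAF = "http://xmlns.com/foaf/0.1/"


def make_graph(statements):
    """Build a graph from an iterable of 3-tuples.

    Because the result is a set, a statement that is repeated in the
    input appears only once in the graph.
    """
    return frozenset(tuple(s) for s in statements)


statements = [
    ("http://example.org/alice", FOAF + "name", '"Alice"'),
    ("http://example.org/alice", FOAF + "knows", "http://example.org/bob"),
    ("http://example.org/alice", FOAF + "name", '"Alice"'),  # repeated statement
]

graph = make_graph(statements)
collection_size = len(statements)  # size of the input collection
graph_size = len(graph)            # size of the resulting set
print(collection_size, graph_size)
```

Under this reading, "collection" and "set" differ only in whether a repeated statement counts twice.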
> preference to graph "created dynamically" than "partly calculated on demand"
Nack. I wanted to keep "partly" and "partly created dynamically" was
awkward.
> (un-numbered section) Document Outline
> @@variables bound@@, @@bindings@@ can be linked to forward references
1.82:
I think they were optional wordings. I wordsmithed it and removed the
'@@'s.
> "10 - Summary" doesn't match the style of the other paragraphs - no
> explanation
1.82:
fixed
> 2 Making Simple Patterns
> last sentence preference to "[Simple] patterns can be ..."
Nack. That would imply that complicated patterns cannot be combined.
> [All graph pictures are unreadable when printed out, too dark.
> Please re-compose on a light background or with much greater
> contrast. black on gray doesn't work.]
Punting. Image work is time consuming. Will do later.
> First example. I suggest not using _:1 _:2 since it's not legal in
> N3, Turtle, N-Triples for blank node labels. I think a small edit
> can make the first example executable, testable.
Punting. Dan raised that too, gave me the option of punting.
> I'd prefer full names for variables, for easy of readability
> especially by non-native english speakers. So 'address' not 'addr'
> and something else instead of 'addrm'
Punting. Requires revisiting images.
> 2.1
> P2
> URIref expand to URI Reference for first use. Or use the
> correct definition RDF URI Reference and link to it.
> grammar - "XML. Qname" - delete the "."
> Link to QName in XML sepcs.
> datatype URIRef not URI
1.82:
done
> Para "Because.."
> here and later I see "URIs used" - check for consistency. I suggest
> s/URI/URIref/ throughout
1.82:
fixed specific point.
I made the substitution in 2 other places: graph label, function name
> N3/Turtle used without a reference, explanation.
>
> Spellings "intpretted"
Nack. not found.
> Para "Prefixes are..."
> refering to an earlier query, but it doesn't say which of the three
> previous it means. Suggest "same query as the previous one"
1.82:
This query is equivalent to the previous one
and will therefore have the same results:
> 2.2 Triple Examples
>
> P1 grammar s/for for/for/
1.82:
fixed
> P2 "bnodes" introduced without explanation. Should
> be "blank node labels" [ref RDF docs] abbreviated to BNodes.
> Doesn't say which positions that bnodes can be used in.
1.82:
added
while there, I noticed an odd statement:
[[http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-blank-nodes
Given two blank nodes, it is possible to determine whether or not they
are the same.
]]
gosh, that's ambitious. Given two bNodes and a bunch of other
inferencing sources, it *may* be possible to determine that they are
the same. Unless I'm just confused...
> Definition RDF Term
>
> This implies that query variables are in the RDF data model since
> they are along with U, L and BN. I suggest moving to another
> block since V is not used till later. Maybe after/near Query Variable?
>
> Definition Query Variable
> This defines an individual, all the RDF Term definitions are sets.
> No letter is assigned to typically use it.
> Suggest "A query variable qv". OR define the set Q.
>
> Defn. Triple Pattern
> (spelling, grammar)
> "A triple pattern is [a] triple of 3 slots subject, predicate, object .."
>
> MUSTFIX: "union Q" <- Q is never defined. Q presumably is a set of
> Query Variables, in which case it is NOT Q, but a set of qv, or
> define Q as a set of qv.
>
> This also defines 'ground' but that is not pulled out. Suggest
> make it a separate 'Definition: Ground' block.
>
>
> Definition Binding
> suggest use B for variable, as they are used uppercase elswhere too.
> Suggest give an example for the convention for writing down a binding
> such as (f, "value") or ?f="value" or the tabular form
> ---------
> | ?f |
> ---------
> |"value"|
> ---------
>
> Suggest give an example of a set of bindings such as
> {?f="value", ?g="value2"} or the tabular form given later.
>
> Definition A substitution
> suggest uppercase "Substitution"
> Suggest not using B as a set of Bindings, but use SB or something
> to differ from lowercase 'b' as an individual binding.
> So this is a mapping S(set of b)
>
> How can a set of bindings define a substitution?
> Suggest rewording
> "A substitution S(B) on a set of bindings B maps a triple pattern ..."
> suggest ... "by the corresponding [variable] value"
>
> Suggest putting a subst() example.
>
>
> Definition Triple Pattern Matching
>
> MUSTFIX: I think there is a triple pattern/set of triple pattern
> issue here unless you are solely comparing a graph with one triple.
>
> T was earlier defined as a set of triple pattern. So subst(T, b in
> B) is not a substitution of a triple pattern, but of a set of
> triple patterns (and a binding b in B). Could re-use tp in T which
> was used in defining ground, and define subst(tp in T, b in B).
> Then edit to match such as 'Triple Pattern tp matches ...'
>
> Use of entails, reference/link to RDF entailment.
Nack. I'm leaving the formal definition issues to Andy
> rdfs: prefix is used in the second data, this was not defined as
> convention earlier. brql/sparql predefines rdf: but not rdfs:?
1.82:
I don't think SPARQL predefines either, and am opposed to it doing
so. For the purposes of simplicity in the document, I've written a
conventions section:
[[http://www.w3.org/2001/sw/DataAccess/rq23/#optionals
Document Conventions
Examples in this document may or may not include common namespace
declarations. When undeclared, the namespace rdf stands in place of
http://www.w3.org/1999/02/22-rdf-syntax-ns# and the namespace rdfs
stands in place of http://www.w3.org/2000/01/rdf-schema#.
]]
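A small Python sketch of how a reader might apply this convention when working through the examples. It assumes the rdf and rdfs fallbacks are purely a convention of the document, not something a SPARQL processor provides; the function expand_qname and its declared parameter are hypothetical names invented for this illustration.

```python
# Sketch only: expanding prefixed names in the document's examples.
# The two namespace URIs come from the conventions text; expand_qname and
# its "declared" parameter are hypothetical helpers for illustration.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

# Fallbacks used only when an example omits the declarations.
DEFAULT_PREFIXES = {"rdf": RDF_NS, "rdfs": RDFS_NS}


def expand_qname(qname, declared=None):
    """Expand 'prefix:local' to a full URI reference.

    Prefixes declared by the example take priority over the fallbacks.
    """
    prefixes = dict(DEFAULT_PREFIXES)
    prefixes.update(declared or {})
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError("undeclared prefix in %r" % qname)
    return prefixes[prefix] + local


type_uri = expand_qname("rdf:type")
label_uri = expand_qname("rdfs:label")
name_uri = expand_qname("foaf:name", {"foaf": "http://xmlns.com/foaf/0.1/"})
print(type_uri, label_uri, name_uri)
```

Any prefix other than rdf or rdfs still has to be declared by the example itself; the sketch raises KeyError for an undeclared one.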
> 2.3 Graph patterns
>
> P1 "There are bNodes" No, there is 1.
> grammar: "not in the RDF graph [nor in] any query"
1.82:
changed the number here and in the next sentence.
> Para "The next query.." but there is no query following. Confused.
> Does that mean the query just given
> Also grammar:
> "one or more triple patterns which must all match for the graph
> pattern to match."
> - the 'all' and 'one or more' say different things. Is it all or 1?
>
> Maybe the definition following explains better, remove?
1.82:
now:
[[
This query contains a conjunctive graph pattern. A conjunctive graph
pattern is two or more triple patterns, each of which must match for
the graph pattern to match.
]]
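The wording above can be sketched in Python. This is an illustration only, under loud assumptions: matching is plain subgraph matching rather than the entailment used in the formal definitions, a term is treated as a variable when it is a string starting with "?", and the names is_var, match_triple and match_pattern plus the sample data are all invented here.

```python
# Sketch only: matching a conjunctive graph pattern against a set of triples.
# Assumptions: simple subgraph matching (no entailment); any string starting
# with "?" is a query variable; all names and data are hypothetical.

def is_var(term):
    """Return True if the term is written as a query variable."""
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, binding):
    """Try to extend binding so that pattern matches triple.

    Returns the extended binding, or None if the match fails.
    """
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable keeps the same value wherever it occurs.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def match_pattern(patterns, graph, binding=None):
    """Return every binding under which each triple pattern matches."""
    binding = {} if binding is None else binding
    if not patterns:
        return [binding]
    first, rest = patterns[0], patterns[1:]
    solutions = []
    for triple in graph:
        extended = match_triple(first, triple, binding)
        if extended is not None:
            solutions.extend(match_pattern(rest, graph, extended))
    return solutions


graph = {
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
    ("_:b", "foaf:name", "Bob"),
}
pattern = [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")]
solutions = match_pattern(pattern, graph)
print(solutions)
```

In the sample data only the node with both a name and an mbox yields a solution, which is the "each of which must match" part of the wording.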
> Definition: Graph pattern
>
> MUSTFIX:
> "A conjunctive Graph Pattern GP is a set of triple patterns T."
>
> T was earlier defined as;
> "let T be the set of triple patterns := A x A x A"
>
> So GP=T ?
>
> Not quite what was meant. GP is set of tp, where
> tp is a Triple Pattern in T.
>
> Maybe triple pattern & triple patterns are too hard to use and make
> nice sentences. Other suggestions ; triple pattern set.
>
>
> Defn: Graph Pattern - Conjunction
>
> Defines "conjunctive Graph Pattern" not the title of the definition.
> html - underlining doesn't match too
>
>
> Defn: Graph pattern Matching
>
> Hmm, confused by "same" in:
> "For a graph pattern to match, each triple pattern must match with
> each query variable having the same value whereever it occurs."
>
> suggestions
>
> "For a graph pattern GP to match, all triple patterns tp in GP must
> match with all query variables in all tp having the same value."
>
> This actually defines "Graph Pattern GP matches", not
> "Graph Pattern Matching"
>
> Using T in GP which is a (set of triple patterns). Probably should
> be tp in GP.
>
> MUSTFIX:
> [[
> For all T in GP, subst(T, B) is a triple entailed by G.
> subst(GP, B) is the graph pattern formed by subst(T, B) for all T in GP.
> subst(GP, B) is a subgraph entailed by G if all triple patterns are grounded.
> ]]
>
> This is reusing subst(t in TP, b in B) redefined over graphs
> I suggest changing the name to graphsubst(GP, B) to distinguish it.
> subst(T in TP, b in B) returns a triple pattern, may not be ground.
>
> Suggestion:
> For all tp in GP, subst(tp, B) is a triple pattern entailed by G.
> graphsubst(GP, B) is the graph pattern formed by subst(tp, B) for
> all tp in GP.
> graphsubst(GP, B) is a subgraph entailed by G if all triple
> patterns are grounded.
Nack. I'm leaving the formal definition issues to Andy
==================== committing and taking a break ====================
> 2.4 Multiple Matches
>
> "The results of query are all the ways a query can match the graph
> being queried. Each match is one solution to the query and there
> may be zero, one or multiple solutions to a query, depending on the
> data."
>
> This uses "results", "solutions" and "matches", not in the same was
> as previously defined. I suggest using results only, and use match
> to mean graph matches, triple matches as used above:
>
> "2.4 Multiple results
>
> The results of query are all the ways a query can match the graph
> being queried. Each result is one solution to the query and there
> may be zero, one or multiple results to a query, depending on the
> data."
>
> Aside: A query actually hasn't been defined yet. It's hinted that it
> is something to do with graph pattern, but it hasn't been said so
> far. i.e. no.
>
> Or if sticking with "matching" make it clearer what the difference
> between a result and a solution is.
>
> Example query has commas between variables. Die.
>
> "When the query can match the data in more than one way, each
> possibility is returned as a solution to the query. In addition, we
> have more than one selected variable so each solution contains two
> bindings of variables to values."
>
> so now there are results, query matches, solutions and possibilities :)
> Query matching data hasn't been discussed. Graph patterns matching
> Graphs has been, could be reused. Could also refer to sets of bindings.
>
> ... and now Query Solution is given.
>
> definition Query Solution:
> "For conjucntion graph pattern GP, subst(GP, B), has no variables."
> spelling: conjunction.
> Also could add ".. and is a set of ground triple patterns" or possibly
> define a Ground Graph Pattern.
>
>
> 3 Constraining Values
>
> (Here the query uses selected variables without a comma)
>
>
> Definition: Value Constraint
> "A value constraint is a boolean expression that can be applied to
> restrict graph pattern solutions."
> For me that doesn't read as an expression that can refer to
> non-boolean things as parts of the expression but which has a boolean
> value.
>
>
> Definition: Query Stage (partial definition).
>
> "Graph Pattern (set of triple patterns) + set of Value
> Constraints. QS : GP x VR"
>
> + and x ? + doesn't mean addition here but...? You cannot
> join/merge a set of triple patterns and a set of value constraints.
>
> VR is not defined. Presumably means a set of value constraints.
> Later on VC seems to be used for that.
>
> spelling in comment: [[ operations [like] "source" ]]
>
> I prefer Query Block.
>
>
> 4 Including Optional Values
>
> grammar
> "The graph matching and value constraints [presented] so far ..."
>
> [here select vars have no commas]
>
> html/spelling "there is [an] mbox" - make mbox <tt> too, like in
> previous para
>
> "Failure to match does not ..."
> suggest
> "failure to match any of the triples in the optional block does not ..."
>
> spelling "optional block" not bock
>
>
> 4.2 Multiple Optional Blocks
>
> "Multiple OPTIONAL blocks "
> so far the OPTIONAL keyword has not been mentioned, and indeed it is
> not given in this section either. Suggest s/<tt>OPTIONAL</tt>/optional/
> in 4.2
>
> The constraints on variables seem to allow the same optional variable
> to be bound in different nested optional blocks, as long as they are
> not at the "given level of nesting" or "in the same containing block".
>
> Those two constraints seem to clash or at least constrain it in two
> ways of which I'm not sure is complete. Level of nesting presumably
> doesn't mean, anywhere inside 2 []s.
>
> How about these:
>
> Graph Pattern 1:
> ( ?q :a :a )
> [ ( ?q :b ?x) ]
> [ ( ?q :b ?y) ]
> [ ( ?q :b ?x) ] <- same level of nesting, same containing block FORBIDDEN
>
> Graph Pattern 2:
> ( ?q :a :a )
> [
> [ ( ?q :b ?x) ]
> [ ( ?q :b ?y) ]
> ]
> [ ( ?q :b ?x) ] <- different level of nesting, containing block, allowed?
>
>
>
> 4.3 Optional Matching
>
> Definition: Initial Binding
>
> "The result of a query stage,QS = (GP, VC), with an initial binding
> B, has Query Result where all the bindings in B are valid (the graph
> pattern and any value constraints in QS).
>
> B extended with addition bindings given by matching subst(GP, B)
> and constraining with VC."
>
> VC is used here, never defined. Presumably refers to Value Constraint
> However Query Stage was earlier (partially) defined as QS: GP x VR
>
> grammar: "has [a] Query Result", "B [is] extended with addition[al] .."
>
> MUSTFIX: More substantially; after several re-readings, I don't
> understand this definition. Can I ask for some more explanation
> please?
>
>
> Definition: Graph Pattern - Optional Match
>
> "An optional match of QS, with initial binding B, the match of QS
> with initial binding B if there exists at least one solution, and is
> B otherwise."
>
> grammar: "binding B, [is] the match..."
>
> That seems to define an optional match of a query stage, not of a
> graph pattern. Is the definition title correct?
>
>
> 5 Nested Patterns
>
> Nesting was already mentioned in 4.2.
>
> Definition: Graph Pattern - Nesting
>
> This definition I note, excludes nested VC - good!
>
> The example query uses ()s for nesting (you should mention it before the
> example what the extra ones are for (which is like lisp (like this)))
>
> "Since this definition makes a inner pattern just be a conjunctive
> element of the outher pattern, and because a graph patterns of
> triple patterns is also the conjunction, this is the same as:"
>
> spelling: outher=>outer
> grammar: "because [] graph patterns of [graph] patterns [are] also []
> conjunctions ..."
>
>
> "Optional blocks can be nested. The outer optional must match for any
> inner ones to apply. That is, the outer optional triple patterns is
> fixed for the purposes of any inner optional block."
>
> s/triple patterns/graph pattern/
> grammar: "optional [block]"
> Let me use that to read from:
> "Optional blocks can be nested. The outer optional block must match
> for any inner ones to apply. That is, the outer optional graph
> pattern is fixed for the purposes of any inner optional block."
>
> So it means, using nested optional patterns are essentially
> subqueries where the outer optional graph pattern is used as a
> must-match graph pattern and the inner optional blocks relative to
> that as optional graph patterns
>
> Query result has typo in gname result #3: "EveE should be "Eve"
>
> grammar: "... query only access[es] these ..."
>
> This example does hint at the usefulness of the nested patterns
> however I think the details of the operation and restrictions on
> binding with optionals are incomplete. Maybe add more words to the
> intro status for this section re completedness.
>
>
> Sections 6-7: Placeholders
> Not reviewed
>
>
> 8 Choosing What to Query
>
> Definition: Target graph
>
> "The target graph of a query."
>
> Ok, this must be a sketch. Especially with the current discussion of
> graphs. Maybe expand a little, "... to which a query may be applied".
> I recall that we discussed these words and ended up pruning them.
>
> [[
> SELECT ...
> FROM <uri1>, <uri2>
> ]]
>
> Commas, die
>
> grammar: "Implementations [may] provide "
>
>
> 9 Querying the Origin of Statements
>
> The status here probably needs expanding to "under discussion and will change"
>
> "the following term."
> I guess "term" should be triple pattern or nested graph pattern?
> Those are the two choices I think.
>
>
> Note " As with OPTIONAL, a variable that is bound to NULL must not
> match another variable that is bound to NULL. "
>
> seems to be worthy of being in the body rather than parenthetical to
> the main text.
>
> Can you delete the red text? All that was notes from 2 FTFs in the
> past, we've discussed a lot more things since then and have an issues
> list to track things too.
>
>
> 10 Summary of Query Patterns
>
> Link to the definitions of all the terms here
>
> Suggest you use QP for query pattern rather than GP - confuses with
> graph pattern.
>
> I don't think it's possible to apply the term 'matches' to
> all the elements given here. match is only defined for triple
> patterns and graph patterns.
>
> Could just add a status note to this section that it is initial
> draft.
>
>
> 11 Query Forms
>
> + status note?
>
> "These result forms use the bindings in the query results to form
> result sets or RDF graphs."
>
> what's a result set? there are Query Results (set of bindings)
> and Query Solution. This is the first mention of result set.
> Is it not a set of solutions?
>
>
> spelling: "Returns either [an] RDF graph that ..."
>
>
> 11.1 Choosing which Variables to Return
> SELECT DISTINCT
>
> "The result set can be modified by adding the DISTINCT keyword which
> ensures that every set of variables for a query solution is different
> from the other sets of variables returned. Thought of as a table,
> each row is differen"
>
> "set of variables" should be Query Result; it's the variable names
> and values that matter (Bindings)
>
>
> 11.2 Constructing an Output Graph
>
> "If no pattern is supplied, instead "*" is used,"
>
> s/pattern/graph template/
>
> That might be better as
> "*" indicates an empty graph template is supplied.
>
> however that isn't quite right, as when an empty graph template is
> used, the variables are instead substituted into the query pattern.
> So maybe should be
> "*" indicates that the graph template is the query pattern.
> 2 paragraphs later, this is spelt out in more detail.
>
> "... each matching of the query pattern."
> => each solution?
>
> "The form CONSTRUCT * WHERE {query pattern} is shorthand for
> CONSTRUCT {pattern} WHERE {pattern}, that is, the query pattern is
> the same as the construct pattern.
>
> Consistency here and elsewhere in 11.2 - use of graph template and
> construct pattern for the same thing.
>
> WHERE {.. }s should be real examples and not using {}s
>
> Prefer re-ordering to:
> "... signifies the construct pattern[graph template] is the query pattern"
>
>
> 11.3 Descriptions of Resources
> placeholder text.
>
> syntax - n3 needs adding proper example namespace URIs
>
>
> 11.4 Asking "yes or no" questions
>
> Add a Query Result with either YES or NO suggested format
>
>
> 12 Testing Values
> placeholder text.
>
>
> 12.2 Extending Value Testing
> placeholder text.
>
>
>
> A. SPARQL Grammar
>
> Some of my previous comments in [1] still apply such as:
> * Die CommaOpt
> * Use FOO+ not FOO FOO? for one or more
> * OPTIONAL keyword
> * A ::= B with only one use of A (all non-terminals) should be inlined
> * E/BNF used has no reference. Preference to XML's
>
> Additional:
>
> What does SOURCE * mean ?
>
> Add some comments to say why NCCAME, NCCHAR1 is done like this.
> Pattern Literal needs expanding too
>
> No idea what (~[">"," "])* means without consulting some EBNF
> documentation; where's that from? complement of set?
>
>
> B. References
>
> W3C style fixes needed - expanding to have URIs, latest versions,
> dates, organisations.
>
> Check they are cited in the document
>
--
-eric
office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
Shonan Fujisawa Campus, Keio University,
5322 Endo, Fujisawa, Kanagawa 252-8520
JAPAN
+1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell: +1.857.222.5741 (does not work in Asia)
(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Tuesday, 5 October 2004 06:42:15 UTC