Re: SPARQL / Language spec ready for review

Some issues addressed in v1.79.

On Mon, Oct 04, 2004 at 01:43:00PM +0100, Dave Beckett wrote:
> 
> On Fri, 01 Oct 2004 17:28:35 +0100, "Seaborne, Andy" <andy.seaborne@hp.com> wrote:
> 
> > Dave, Steve, Howard,
> > 
> > The SPARQL language doc is ready for review in preparation for the telcon next 
> > Tuesday.  Version v1.73 (or later) of:
> > 
> >      http://www.w3.org/2001/sw/DataAccess/rq23/
> > 
> > The intention is to publish this rough-and-ready version, complete with editors 
> > notes and comments.  This will enable early review, showing where we are going 
> > and enable feedback from the more dedicated part of the community.  There is 
> > still much to do in the document but we hope that this public working draft will 
> > indicate the directions we are taking even if the detail is still to be done.
> 
> This is my initial set of comments, not complete but requested by Eric
> as I had made it available to him as I drafted it.  I may/will change my
> opinion on some items after I have a readthrough when I get to the end
> in a few more hours.
> 
> Dave
> 
> ----
> 
> General:
> A thorough spellcheck is needed.
> 
> Label all examples with Numbers, titles and add anchors.
> Add all example queries, data files as separate files with URIs, link
> to them.  Add them to the test suite.
> Add labels and anchors to all definitions.

I haven't labeled any yet, but I copied the exampleOuter /
exampleInner template from RDF Primer (and XQuery). There is a wad of
disabled style associated with this. Andy, you can enable it to check
it out -- it puts cool boxes around stuff. My macro grabbed the class
from the inner
  <pre class="query|query todo?/>
so it will be easy to make the outer boxes the right color.

The RDF Primer sometimes has
        <div class="exampleOuter exampleInner">
and sometimes has
        <div class="exampleOuter">
          <div class="c1">
            <a id="example20" name="example20">Example 20: ...
            using <code>rdf:ID</code></a>
          </div>
          <div class="exampleInner">
I haven't worked to figure out the criteria for what needs an
anchor and title. 

> Do not use underlining in the html style when it isn't a link.

I'm not sure that's such a problem, but I changed the name of the
style from "underline" to "definedTerm" and changed the style to
text-decoration : underline

> In query results, some of the tables use ?x and some use bare x.
> Some results use both!

changing all the x (without regard to the actual variable name).

> 
> Must Fix
> 
> Consistency in use of individuals, sets of individuals
> examples:
>   b in B used ok
>   T defined as a set and used as a member of that set, also defined
>   as tp.  T in GP should be tp in GP.
> 
> See also MUSTFIX below
> 
> 
> Editorial comments - should fix
> 
> 
> Title: SPARQL title does not mention protocol despite the 'P' in the
> name.  

Best I could come up with was "SPARQL Query - A Language for Querying RDF".

> Later on the document suggests that protocol is a separate document.
> 
> 
> Abstract
> typo: "end users [missing words] to write"

end users a way to write

figured it shouldn't be off-puttingly formal

> ToC
> missing 4.3

added

> 8 "Chosing What to Query" to match document capitals

fixed

> 12.2 ditto

fixed 12.1 and 12.2

> Appendices labelled 1,2 actually A, B in doc

now A, B


=============== Going home before grocery store closes ================

cheers all

> suggest removing see also, old material.  It's not ToC.
> 
> 
> 1 Introduction
> 
> MUSTFIX: First sentence is wrong.
> 
> The abstract syntax for RDF is not a "graph of nodes and arcs, often
> expressed as triples".  It is a set of triples called an RDF graph
> formally defined in RDF semantics.  It can be and is often described
> as a graph of nodes and arcs but RDF is not nodes+arcs; that was an
> RDF core decision closely argued.
> 
> preference to graph "created dynamically" than "partly calculated on demand"
> 
> 
> (un-numbered section) Document Outline
> @@variables bound@@, @@bindings@@ can be linked to forward references
> 
> "10 - Summary" doesn't match the style of the other paragraphs - no
> explanation
> 
> 
> 2 Making Simple Patterns
> last sentence preference to "[Simple] patterns can be ..."
> 
> [All graph pictures are unreadable when printed out, too dark.
> Please re-compose on a light background or with much greater
> contrast. black on gray doesn't work.]
> 
> First example.  I suggest not using _:1 _:2 since it's not legal in
> N3, Turtle, N-Triples for blank node labels.  I think a small edit
> can make the first example executable, testable.
> 
> I'd prefer full names for variables, for ease of readability
> especially by non-native english speakers.  So 'address' not 'addr'
> and something else instead of 'addrm'
> 
> 
> 2.1
> P2
> URIref expand to URI Reference for first use. Or use the
> correct definition RDF URI Reference and link to it.
> grammar - "XML. Qname" - delete the "."
> Link to QName in XML specs.
> datatype URIRef not URI
> 
> Para "Because.."
> here and later I see "URIs used" - check for consistency.  I suggest
> s/URI/URIref/ throughout
> 
> N3/Turtle used without a reference, explanation.
> 
> Spellings "intpretted"
> 
> Para "Prefixes are..."
> referring to an earlier query, but it doesn't say which of the three
> previous it means.  Suggest "same query as the previous one"
> 
> 
> 2.2 Triple Examples
> 
> P1 grammar s/for for/for/
> 
> P2 "bnodes" introduced without explanation.  Should
> be "blank node labels" [ref RDF docs] abbreviated to BNodes.
> Doesn't say which positions that bnodes can be used in.
> 
> 
> Definition RDF Term
> 
> This implies that query variables are in the RDF data model since
> they are along with U, L and BN.  I suggest moving to another
> block since V is not used till later.  Maybe after/near Query Variable?
> 
> Definition Query Variable
> This defines an individual, all the RDF Term definitions are sets.
> No letter is assigned to typically use it.
> Suggest "A query variable qv".  OR define the set Q.
> 
> Defn. Triple Pattern
> (spelling, grammar)
> "A triple pattern is [a] triple of 3 slots subject, predicate, object .."
> 
> MUSTFIX: "union Q" <- Q is never defined.  Q presumably is a set of
>   Query Variables, in which case it is NOT Q, but a set of qv, or
>   define Q as a set of qv.
> 
> This also defines 'ground' but that is not pulled out.  Suggest
> make it a separate 'Definition: Ground' block.
> 
> 
> Definition Binding
> suggest use B for variable, as they are used uppercase elsewhere too.
> Suggest give an example for the convention for writing down a binding
> such as (f, "value") or ?f="value" or the tabular form
> ---------
> |  ?f   |
> ---------
> |"value"|
> ---------
> 
> Suggest give an example of a set of bindings such as
> {?f="value", ?g="value2"} or the tabular form given later.
> 
> Definition A substitution
> suggest uppercase "Substitution"
> Suggest not using B as a set of Bindings, but use SB or something
> to differ from lowercase 'b' as an individual binding.
> So this is a mapping S(set of b)
> 
> How can a set of bindings define a substitution?
> Suggest rewording
> "A substitution S(B) on a set of bindings B maps a triple pattern ..."
> suggest ... "by the corresponding [variable] value"
> 
> Suggest putting a subst() example.
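
For illustration only, here is a minimal sketch of what a subst() example
could look like, written in Python. The representation is an assumption
made for this sketch and is not taken from the draft: a triple pattern is
a 3-tuple of strings, a query variable is any string starting with "?",
and a set of bindings is a dict from variable to value.

```python
# Hypothetical representation (not from the draft): triple patterns are
# 3-tuples of strings, variables start with "?", bindings are a dict.

def is_variable(term):
    """A term is a query variable if it is written like ?name."""
    return isinstance(term, str) and term.startswith("?")

def subst(tp, bindings):
    """Replace each bound variable in triple pattern tp by its value.
    Unbound variables are left in place, so the result may not be ground."""
    return tuple(bindings.get(t, t) if is_variable(t) else t for t in tp)

def is_ground(tp):
    """A triple pattern is ground when it contains no variables."""
    return not any(is_variable(t) for t in tp)

tp = ("?x", "foaf:name", "?name")
B = {"?x": "_:a", "?name": '"Alice"'}

full = subst(tp, B)                  # every variable in tp is bound
partial = subst(tp, {"?x": "_:a"})   # ?name is left unbound
```

With all variables bound the result is a ground triple; with only ?x
bound the result is still a triple pattern, which is the distinction the
comment above is asking the definition to make explicit.
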
> 
> 
> Definition Triple Pattern Matching
> 
> MUSTFIX: I think there is a triple pattern/set of triple pattern
>   issue here unless you are solely comparing a graph with one triple.
> 
>   T was earlier defined as a set of triple pattern. So subst(T, b in
>   B) is not a substitution of a triple pattern, but of a set of
>   triple patterns (and a binding b in B).  Could re-use tp in T which
>   was used in defining ground, and define subst(tp in T, b in B).
>   Then edit to match such as 'Triple Pattern tp matches ...'
> 
> Use of entails, reference/link to RDF entailment.
> 
> rdfs: prefix is used in the second data, this was not defined as
> convention earlier.  brql/sparql predefines rdf: but not rdfs:?
> 
> 
> 2.3 Graph patterns
> 
> P1 "There are bNodes"  No, there is 1.
> grammar: "not in the RDF graph [nor in] any query"
> 
> Para "The next query.." but there is no query following.  Confused.
> Does that mean the query just given?
> Also grammar:
>   "one or more triple patterns which must all match for the graph
>   pattern to match."
> - the 'all' and 'one or more' say different things.  Is it all or 1?
> 
> Maybe the definition following explains better, remove?
> 
> 
> Definition: Graph pattern
> 
> MUSTFIX:
>  "A conjunctive Graph Pattern GP is a set of triple patterns T."
> 
>   T was earlier defined as;
>     "let T be the set of triple patterns := A x A x A"
> 
>   So GP=T ?
> 
>   Not quite what was meant.  GP is set of tp, where 
>   tp is a Triple Pattern in T.
> 
> Maybe triple pattern & triple patterns are too hard to use and make
> nice sentences.  Other suggestions ; triple pattern set.
> 
> 
> Defn: Graph Pattern - Conjunction
> 
> Defines "conjunctive Graph Pattern" not the title of the definition.
> html - underlining doesn't match too
> 
> 
> Defn: Graph pattern Matching
> 
> Hmm, confused by "same" in:
> "For a graph pattern to match, each triple pattern must match with
> each query variable having the same value whereever it occurs."
> 
> suggestions
> 
> "For a graph pattern GP to match, all triple patterns tp in GP must
>  match with all query variables in all tp having the same value."
> 
> This actually defines "Graph Pattern GP matches", not
> "Graph Pattern Matching"
> 
> Using T in GP which is a (set of triple patterns).  Probably should
> be tp in GP.
> 
> MUSTFIX:
>   [[ 
>   For all T in GP, subst(T, B) is a triple entailed by G.
>   subst(GP, B) is the graph pattern formed by subst(T, B) for all T in GP.
>   subst(GP, B) is a subgraph entailed by G if all triple patterns are grounded.
>   ]]
> 
>   This is reusing subst(t in TP, b in B) redefined over graphs
>   I suggest changing the name to graphsubst(GP, B) to distinguish it.
>   subst(T in TP, b in B) returns a triple pattern, may not be ground.
> 
>   Suggestion:
>     For all tp in GP, subst(tp, B) is a triple pattern entailed by G.
>     graphsubst(GP, B) is the graph pattern formed by subst(tp, B) for
>       all tp in GP. 
>     graphsubst(GP, B) is a subgraph entailed by G if all triple
>       patterns are grounded.
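
A rough Python sketch of the suggested wording, kept deliberately small.
Assumptions made for this sketch only: triple patterns are 3-tuples of
strings, variables start with "?", bindings are a dict, a graph is a set
of triples, and plain subgraph membership stands in for entailment. The
names graphsubst and gp_matches follow the renaming proposed above and
are not defined in the draft.

```python
def is_variable(term):
    return isinstance(term, str) and term.startswith("?")

def subst(tp, bindings):
    """Substitute bound variables in a single triple pattern tp."""
    return tuple(bindings.get(t, t) if is_variable(t) else t for t in tp)

def graphsubst(gp, bindings):
    """The graph pattern formed by subst(tp, bindings) for all tp in gp."""
    return {subst(tp, bindings) for tp in gp}

def gp_matches(gp, bindings, graph):
    """True if graphsubst(gp, bindings) is ground and is a subgraph of
    graph. Subgraph membership is an assumed stand-in for entailment."""
    result = graphsubst(gp, bindings)
    ground = all(not is_variable(t) for tp in result for t in tp)
    return ground and result <= graph

G = {
    ("_:a", "foaf:name", '"Alice"'),
    ("_:a", "foaf:mbox", "<mailto:alice@example.org>"),
}
GP = {("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")}
B = {"?x": "_:a", "?name": '"Alice"', "?mbox": "<mailto:alice@example.org>"}

matched = gp_matches(GP, B, G)                 # all tp in GP are grounded
unmatched = gp_matches(GP, {"?x": "_:a"}, G)   # ?name and ?mbox unbound
```

Keeping subst (one triple pattern) and graphsubst (a set of them) as
separate functions makes the individual/set distinction visible in the
types, which is the point of the MUSTFIX.
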
> 
> 
> 2.4 Multiple Matches
> 
>   "The results of query are all the ways a query can match the graph
>   being queried.  Each match is one solution to the query and there
>   may be zero, one or multiple solutions to a query, depending on the
>   data."
> 
> This uses "results", "solutions" and "matches", not in the same way
> as previously defined. I suggest using results only, and use match
> to mean graph matches, triple matches as used above:
> 
>   "2.4 Multiple results
> 
>   The results of query are all the ways a query can match the graph
>   being queried.  Each result is one solution to the query and there
>   may be zero, one or multiple results to a query, depending on the
>   data."
> 
> Aside: A query actually hasn't been defined yet.  It's hinted that it
> is something to do with graph pattern, but it hasn't been said so
> far. i.e. no.
> 
> Or if sticking with "matching" make it clearer what the difference
> between a result and a solution is.
> 
> Example query has commas between variables.  Die.
> 
>   "When the query can match the data in more than one way, each
>   possibility is returned as a solution to the query.  In addition, we
>   have more than one selected variable so each solution contains two
>   bindings of variables to values."
> 
> so now there are results, query matches, solutions and possibilities :)
> Query matching data hasn't been discussed.  Graph patterns matching
> Graphs has been, could be reused. Could also refer to sets of bindings.
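
To make "zero, one or multiple" concrete in terms of sets of bindings,
here is a small Python sketch that enumerates every set of bindings under
which a graph pattern matches a graph. It is a naive illustration, not
the draft's algorithm. Assumptions: triple patterns are 3-tuples of
strings, variables start with "?", blank node labels in the data are
treated as plain constants, and matching is exact triple membership
rather than entailment.

```python
def is_variable(term):
    return isinstance(term, str) and term.startswith("?")

def match_triple(tp, triple, bindings):
    """Extend bindings so that tp matches triple, or return None.
    A variable already bound must keep the same value wherever it occurs."""
    out = dict(bindings)
    for p, v in zip(tp, triple):
        if is_variable(p):
            if out.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return out

def solutions(gp, graph):
    """All sets of bindings under which every tp in gp matches graph."""
    results = [{}]
    for tp in gp:
        results = [b2 for b in results for triple in graph
                   if (b2 := match_triple(tp, triple, b)) is not None]
    return results

G = [
    ("_:a", "foaf:name", '"Alice"'),
    ("_:a", "foaf:mbox", "<mailto:alice@example.org>"),
    ("_:b", "foaf:name", '"Bob"'),
    ("_:b", "foaf:mbox", "<mailto:bob@example.org>"),
]
GP = [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")]

found = solutions(GP, G)                           # one per person
none = solutions([("?x", "foaf:nick", "?n")], G)   # nothing matches
```

Each element of the returned list is one set of bindings, so whichever
word the document settles on (result or solution), it can be defined as
exactly one such set.
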
> 
> ... and now Query Solution is given.
> 
> definition Query Solution:
>   "For conjucntion graph pattern GP, subst(GP, B), has no variables."
> spelling: conjunction. 
> Also could add ".. and is a set of ground triple patterns" or possibly
> define a Ground Graph Pattern.
> 
> 
> 3 Constraining Values
> 
> (Here the query uses selected variables without a comma)
> 
> 
> Definition: Value Constraint
>   "A value constraint is a boolean expression that can be applied to
>   restrict graph pattern solutions."
> For me that doesn't read as an expression that can refer to
> non-boolean things as parts of the expression but which has a boolean
> value.
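
A tiny Python sketch of the reading asked for here: the parts of the
expression are not boolean, but the expression as a whole evaluates to a
boolean and is used to keep or drop each solution. The variable names and
data are made up for the illustration.

```python
def value_constraint(b):
    # The operands (a bound number and a constant) are not boolean;
    # only the value of the whole comparison is.
    return b["?price"] < 30

candidates = [
    {"?title": "Book A", "?price": 42},
    {"?title": "Book B", "?price": 23},
]

# Restrict the graph pattern solutions to those satisfying the constraint.
kept = [b for b in candidates if value_constraint(b)]
```
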
> 
> 
> Definition: Query Stage (partial definition).
> 
>   "Graph Pattern (set of triple patterns) + set of Value
>   Constraints. QS : GP x VR"
> 
> + and x ? + doesn't mean addition here but...?  You cannot
> join/merge a set of triple patterns and a set of value constraints.
> 
> spelling in comment: [[ operations [like] "source"  ]]
> 
> I prefer Query Block.
> 
> 
> 4 Including Optional Values
> 
> .... Review to continue from here ...

-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741 (does not work in Asia)

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Monday, 4 October 2004 13:05:02 UTC