Re: feedback on "SPARQL Query Language for RDF", v1.139

Kevin Wilkinson wrote:
> andy,
>    thanks for the quick turn-around on my comments.
> attached are my comments on your changes.
> 
> kevin
> 
> 
> ------------------------------------------------------------------------
> 
> comments on your changes (now referring to version 1.142
> of the SPARQL draft).
> 
> "Seaborne, Andy" wrote:
> 
>>Kevin,
>>
>>Thank you for such a detailed set of comments, and thank you for marking up the text.
>>
>>Changes logged below.
>>
>>Kevin Wilkinson wrote:
>>
>>>attached are my comments on v1.139 of the SPARQL spec.
>>
> ...
> 
>>Changes in v1.141 until noted otherwise.
>>
>>
>>>    1 Introduction
>>>
>>>An RDF graph is a set of triples, each consisting of a /+subject+/, an
>>>+object+, and +/predicate/ that specifies+ a property relationship
>>>between them+,+ as defined in RDF Concepts and Abstract syntax
>>><http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Datatypes-intro>.
>>
>>A/ Can't see the change in subject and object.
> 
> 
> the suggested change was to italicize subject, predicate and object.
> your convention in other places in the document is to italicize
> terms when they are first introduced (or for emphasis). i
> thought it was appropriate to italicize subj, pred, obj. 

OK - I see now.

> 
> as for predicate vs. property, consistency is important. but,
> it might be good to imply here that predicate and property are
> synonyms because property crops up in other places, e.g.,
> rdf:Property, InverseFunctionalProperty, etc.
> 
> 
> 
>>>    2 Making Simple Queries
> 
> ...
> 
>>Leave this to Eric.
> 
> 
> re: graph1, graphPattern1, don't neglect to change the headers
> in the result table, i.e., referrer, reference, author are all
> wrong. i forgot to mention this in my previous message.
> 
> 
>>>      2.1 Writing a Simple Query
> 
> ...
> 
>>>The terms delimited by "<>" are URI References [13] <#ref13> (URIRefs);
>>>URIRefs can also be abbreviated with an XML QName-like form [14] <#ref14>;
>>>this is syntactic assistance and is translated to the full URIRef.
>>>-Other RDF terms-+The terms delimited by double quotes+ are literals
>>>which, following N-Triples syntax [7] <#ref7>, are a string and
>>>-optional language tag (introduced with '@') and datatype URIRef
>>>(introduced by '^^')-+optionally, either a language tag (indicated by
>>>'@') or a datatype URIRef (indicated by '^^')+.
>>
>>Changed to:
>>"""
>>The RDF terms delimited by double quotes ("") are literals which, following
>>N-Triples syntax [7], are a string, in quotes, an optional language tag,
>>introduced with '@', and optional datatype URIRef, introduced by '^^'.
>>"""
> 
> 
> but your phrasing admits the possibility of a literal having
> a language tag and a datatype. that's why i prefer my original
> wording, "...in quotes, optionally followed by either a
> language tag ... or a datatype ...".

s/and/or/

But I don't see the role of this document as defining things defined elsewhere.
Such text is illustrative.
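
To make the "either a language tag or a datatype, not both" reading concrete, here is a minimal Python sketch. It is illustrative only: the `Literal` class and its `ntriples` method are names invented for this example, it follows the literal forms as discussed in this thread, and it does no string escaping.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """Hypothetical model of a literal: a string plus at most one qualifier."""
    lexical: str
    lang: Optional[str] = None       # language tag, written after '@'
    datatype: Optional[str] = None   # datatype URIRef, written after '^^'

    def __post_init__(self):
        # The point under discussion: "or", not "and".
        if self.lang is not None and self.datatype is not None:
            raise ValueError("a literal takes a language tag or a datatype, not both")

    def ntriples(self) -> str:
        # Assumption: the lexical form needs no escaping in this sketch.
        if self.lang is not None:
            return f'"{self.lexical}"@{self.lang}'
        if self.datatype is not None:
            return f'"{self.lexical}"^^<{self.datatype}>'
        return f'"{self.lexical}"'


plain = Literal("chat")
tagged = Literal("chat", lang="fr")
typed = Literal("42", datatype="http://www.w3.org/2001/XMLSchema#integer")
```

Constructing `Literal("chat", lang="fr", datatype="...")` raises `ValueError`, which is the case the "and" wording would wrongly admit.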

> 
> 
> 
>>>      2.2 Triple Patterns
> 
> ...
> 
>>>-In SPARQL, a triple pattern is an RDF triple but with the addition that
>>>components can be a query variable instead.-
>>>
>>>+In SPARQL, a triple pattern is an RDF triple in which any component can
>>>be a query variable.+
>>
>>A triple pattern is not an RDF triple if it has different contents.  Text left
>>as is.
> 
> 
> but your phrasing also makes it sound like a triple pattern is an
> RDF triple.  how about "... triple pattern is +similar to+ an RDF
> triple but with the addition ..."

Done - but I didn't read the old text that way!

> 
> 
>>>*Definition:* Substitution
>>>
>>>A substitution S is a partial functional relation from variables to RDF
>>>terms or variables. We write S[v] for the RDF term that S pairs with the
>>>variable v and define S[v] to be v where there is no such pairing.
>>>
>>>
>>>*Definition:* Triple Pattern Matching
>>>
>>>For +substitution+ S and Triple Pattern T, S(T) is -the-+a+ triple
>>>pattern +formed+ by replacing any variable v in T with S[v]. (KW
>>>Comment: there may be more than one such triple pattern, correct?)
>>
>>No - a substitution is a function and is well-defined.  Applied to a triple
>>pattern there is only one triple pattern produced.
> 
> 
> i found the notation S(T) confusing since S is a function
> from variables; i don't know what S(T) is. a different
> function? can't a variable be bound to multiple terms? that's
> why i thought S(T) would produce multiple triples.

There are induced functions S:T->T, S:GP->GP which all naturally arise from the 
idea of substitution.  It's usual to use the same name for this (polymorphic) 
function.

As S is a function, one variable can only be bound to one term.  Otherwise it's 
not a function, just a relation.
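
A rough Python sketch of the same idea, for illustration only and not spec text. The names `Substitution`, `is_var` and `matches` are made up here, and terms are plain strings with variables marked by a leading "?". A dict holds at most one value per key, which is exactly the "function, not relation" point.

```python
def is_var(term):
    """Assumption for this sketch: variables are strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


class Substitution:
    """A partial function from variables to RDF terms."""

    def __init__(self, pairs):
        # A dict maps each variable to at most one term.
        self._pairs = dict(pairs)

    def __getitem__(self, v):
        # S[v] is the paired term, or v itself where there is no pairing.
        return self._pairs.get(v, v)

    def __call__(self, pattern):
        # Induced function on triple patterns: replace each variable v by S[v].
        # One input pattern gives exactly one output pattern.
        return tuple(self[t] if is_var(t) else t for t in pattern)


def matches(pattern, graph, s):
    """Triple pattern T matches graph G with substitution S if S(T) is a triple of G."""
    return s(pattern) in graph


S = Substitution({"?x": "ex:a", "?name": '"Johnny Lee Outlaw"'})
T = ("?x", "foaf:name", "?name")
G = {("ex:a", "foaf:name", '"Johnny Lee Outlaw"')}
```

Here `S(T)` is the single triple `("ex:a", "foaf:name", '"Johnny Lee Outlaw"')`, and `S["?y"]` is just `"?y"` because there is no pairing for it.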

> 
> 
>>>Triple Pattern T matches RDF graph G with substitution S, if S(T) is a
>>>triple of G.
>>>
>>>(KW Comment: the above definition (of Triple Pattern T match G) is a
>>>second definition of triple pattern matching. Previously, at the start
>>>of section 2.2, you say that a pattern matches all triples with
>>>"identical" RDF terms. Is it obvious that these two definitions are
>>>identical? Maybe prefix the first definition by saying it is an informal
>>>definition.)
>>
>>I hope the use of the boxes makes that informal/formal distinction.  Will
>>consider - there is also a comment outstanding from Yoshio about putting all
>>definitions before the narrative text.  Probably not possible at the very
>>start of the doc.
> 
> 
> i agree with yoshio. i prefer that definitions precede the examples.

Will try - but after publication.

> 
> 
>>>For example, the query:
>>>
>>>SELECT * WHERE ( ?x ?x ?v )
> 
> 
> perhaps "SELECT ?x, ?y, ?z" is better than "SELECT *"
> since the '*' form of Select is not defined until section 10.

Good point. Done.  No comma though!

> 
> 
> 
>>>      2.3 Graph Patterns
>>>
>>>-The keyword WHERE is followed by a /Graph Pattern/ which is made of one
>>>or more /Triple Patterns/. These Triple Patterns are "and"ed together.
>>>More formally, the Graph Pattern is the conjunction of the Triple
>>>Patterns. In each query solution, all the triple patterns must be
>>>satisfied with the same binding of variables to values.-
>>>
>>>+A /Graph Pattern/ is one or more /Triple Patterns /"and"ed together,
>>>i.e., a conjunction of Triple Patterns. In a match, all the triple
>>>patterns must be satisfied with the same binding of variables to values.+
>>
>>Trying to, informally, explain the syntax at this point, hence the keyword
>>WHERE.  Avoid solution though.
> 
> 
> i still prefer my wording since the objective of 2.3 is to explain
> graph patterns. graph patterns do not need to have "WHERE" in front
> of them. if you want to mention the WHERE word, i suggest doing that
> in section 2.1 where you introduce querying.
> 

Leave as is - the early sections can become full of definitions and no progress 
for the reader who is reading examples.  Will review in another draft.
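
For readers who prefer code to definitions, a minimal sketch of the conjunction reading: every triple pattern must be satisfied under the same binding. This is illustrative Python, not an algorithm from the draft. `match_triple` and `solutions` are hypothetical names, terms are plain strings, and the "_:a" strings merely stand in for the nodes of an in-memory graph.

```python
def is_var(term):
    """Assumption for this sketch: variables are strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable keeps the one term it is already bound to.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def solutions(graph_pattern, graph):
    """All bindings that satisfy every triple pattern at once."""
    bindings = [{}]
    for pattern in graph_pattern:
        bindings = [
            extended
            for b in bindings
            for triple in graph
            if (extended := match_triple(pattern, triple, b)) is not None
        ]
    return bindings


graph = {
    ("_:a", "foaf:name", '"Johnny Lee Outlaw"'),
    ("_:a", "foaf:mbox", "<mailto:jlow@example.com>"),
    ("_:b", "foaf:name", '"Peter Goodguy"'),
    ("_:b", "foaf:mbox", "<mailto:peter@example.org>"),
}
graph_pattern = [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")]
result = solutions(graph_pattern, graph)
```

With this data `result` holds two bindings, one per person. A name is never paired with the other person's mbox because `?x` must take the same value in both triple patterns.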

> 
> 
>>>Data:
>>>
>>>@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
>>>
>>>_:a  foaf:name   "Johnny Lee Outlaw" .
>>>_:a  foaf:mbox   <mailto:jlow@example.com> .
>>>
>>
>>"""
>>There is a bNode [12] in this dataset, identified by _:a. The label is only used
>>with the file for encoding purposes. The label information is not in the RDF
>>graph. No query will be able to identify that bNode by the label used in the
>>serialization.
>>"""
> 
> 
> i would suggest dropping the above paragraph. there is no need
> to introduce the concept of bnode labels at this point. it's
> distracting from the discussion of graph patterns. bnode
> labels/serialization are covered just fine in 2.5.

I think we need to explain _:a at this point.

Explaining the handling of bNodes in queries has been a constant issue and so 
I'd like to face it head-on.

> 
> 
> 
>>>*Definition:* Graph Pattern (Partial Definition) - Conjunction
>>>
>>
>>The definition of "matching" is being built up through the document.  Each
>>definition has a qualifier - here the definition is "Graph Pattern Matching".
> 
> 
> except that the definition of triple matching skirts the
> issue by defining match in terms of "contains". perhaps
> that's intentional. but, issues like plain literals being
> equivalent to xsd:string-typed literals and issues with
> bnodes are not mentioned.
> 
> 
> 
>>And what's more, on reflection, I don't think "simply entails" is necessary.
>>Subgraph would be clearer and *at this point* the definitions don't rely on
>>entailment.  The binding really does have the bNode as its value.  It's later,
>>on encoding results, that this is broken.  It must be the same bNode to match
>>again later.
> 
> 
> good. i agree that subgraph would be clearer.

Awaiting WG discussion - if the theorists point out anything that it misses we 
will stick with "entails" (and that needs a number of changes elsewhere).  But 
query is over the symbols of the graph, not individuals in the domain of discourse.

> 
> 
>>>      2.4 Multiple Matches
>>>
> 
> ...
> 
>>> _:a foaf:name  "Johnny Lee Outlaw" .
>>> _:a foaf:mbox  <mailto:jlow@example.com> .
>>>
>>> _:b foaf:name  "Peter Goodguy" .
>>> _:b foaf:mbox  <mailto:peter@example.org> .
>>>
>>>(KW Comment: I don't like the above example because it illustrates two
>>>concepts. First, it shows that a query may have multiple solutions.
>>>That's fine. But, it also illustrates the results can be a projection of
>>>the query variables. This raises additional questions, specifically, how
>>>are duplicates handled. I'd feel better if this example included
>>>variable 'x' in the result list.).
>>
>>Point taken.  However, (1) this isn't the first time projection has happened and
>>(2) avoiding bNodes in results is desirable for clarity.  The only option would
>>be to not use FOAF but then we wouldn't have data that is familiar to at least some
>>people.  Having synthetic data is rather dry.  In this example, there aren't
>>duplicates.
> 
> 
> would it be too bizarre to have URIRef's in place of the bnodes, e.g.,
>      ex:a  foaf:name  "Johnny Lee Outlaw" . ...

Yes - it would.  FOAF usually uses bNodes for people. I'd rather stick to being 
realistic.

> 
> 
> 
>>>*Definition:* Query Solution
>>>
>>>A Query Solution is a Pattern Solution where the pattern is the whole
>>>pattern of the query.
>>>
>>>*Definition:* Query Results
>>>
>>>The Query Results, for a given graph pattern GP on G, is written
>>>R(GP,G), and is the set of all query solutions such that GP matches G.
>>>
>>>R(GP, G) may be the empty set.
>>>
>>>
>>>      2.5 Blank Nodes
>>>
>>>
>>>        Blank Nodes and Queries
>>>
>>
>>"""
>>BNodes can't appear in a SPARQL query. There is no standard representation of
>>bNodes in RDF and the syntax of SPARQL queries does not allow them.
>>
>>They do take part in the pattern matching process.
> 
> 
> perhaps add "take part in the pattern matching process +by being
> bound to variables in triple patterns+".
> 
> 
>>"""
>>Committed version 1.141

Committed version 1.143
> 
> 
> 
> i also had feedback on sections 3-6 and 10. i assume
> you got that feedback in my original message and are
> still processing those changes. if not, please let
> me know and i can resend them.

Later ...

> 
> kevin

Received on Thursday, 2 December 2004 12:19:19 UTC