comments/questions on SPARQL document

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 5 Oct 2004 14:48:58 -0500
Message-Id: <p06001f19bd88ab0e8bca@[10.100.0.161]>
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
http://www.w3.org/2001/sw/DataAccess/rq23/#introduction
link from contents is broken.
'set of statements' / 'set of triples'
Is the list of three kinds of graph intended to be exhaustive? 
Suggest not, so say something like "A graph may be encoded in a 
variety of forms; for example, ..."
'based on  information' has extra NBspace
No reference for [protocol]
Outline:
'the SPARQL' (the?)
section 5 is (delete 'is')
Why are fonts used in section 9 and section 11 descriptions different?
Document Conventions: suggest delete first sentence.

http://www.w3.org/2001/sw/DataAccess/rq23/#basicpatterns
HTML anchor is misplaced inside the header
" Patterns are descriptions of graphs with" / "Patterns are like 
graphs but may have"
(reason: calling them descriptions suggests a metalanguage, and let's
not go there.)
"graph labels or relationships" / "nodes or predicates" (use a link to
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-data-model 
if needed to justify terminology)
In general, avoid the term 'graph label' throughout. The RDF core WG 
went through hell until we decided that RDF graphs are not labelled 
graphs in the mathematical sense. There are no labels in RDF: the 
nodes *are* the labels. (Hence they are unique.)
Graph labe...sorry, nodes and predicates, are not values as defined 
in [Concepts]: those are datatype values, see 
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Datatypes-intro
. So:
"The graph labels are values as defines in [..]" / "Nodes (as defined 
in [concepts])  may be"  or some such.
Also, why did you leave out blank nodes ?
(BTW, there isn't a single term for nodes-or-predicates in the RDF 
concepts terminology, I now realize. [Later: you provide one in 
section 2.2, so use it here.])

"The result of a query is a set of mappings from variables to values 
such that, for each mapping, assigning values to variables in the 
query produces a subgraph of the target graph."
/
"A <em>binding </em> is a mapping from the variables in a query to 
terms. A <em> result mapping </em> is a binding which, when applied 
to the variables in the query, produces a subgraph of the target 
graph; a <em> result </em> is a set of result mappings. If there are 
no result mappings, the result set is empty."

(BTW, does the result have to be the set of *all* possible result 
mappings or just include some of the possible mappings? If the former 
this should be stated.)
"we will get one solution with three variable bindings" /"we will get 
one result mapping which binds three variables".
(A single mapping may bind more than one variable)
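
A minimal sketch of the proposed binding / result mapping / result
definitions, in Python. The string encoding of terms, the helper names
(is_var, subst, result) and the sample data are all hypothetical, and
the sketch takes the "all possible result mappings" reading of the
question above purely for illustration.

```python
from itertools import product

def is_var(term):
    # Assumption: a query variable is any string starting with '?'.
    return term.startswith("?")

def subst(pattern, binding):
    # Apply a binding (dict from variables to terms) to a set of triple patterns.
    return {tuple(binding.get(t, t) for t in triple) for triple in pattern}

def result(pattern, graph):
    # Collect every binding whose application yields a subgraph of the graph.
    variables = sorted({t for triple in pattern for t in triple if is_var(t)})
    terms = sorted({t for triple in graph for t in triple})
    mappings = []
    for values in product(terms, repeat=len(variables)):
        binding = dict(zip(variables, values))
        if subst(pattern, binding) <= graph:
            mappings.append(binding)
    return mappings

# Hypothetical target graph and query pattern.
graph = {
    ("ex:book1", "dc:title", '"SPARQL Tutorial"'),
    ("ex:book1", "dc:creator", "ex:alice"),
}
pattern = {("?book", "dc:title", "?title"), ("?book", "dc:creator", "?who")}

mappings = result(pattern, graph)
print(len(mappings))  # 1: a single result mapping that binds three variables
print(mappings[0])
```

An empty list corresponds to the empty result set in the proposed wording.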

http://www.w3.org/2001/sw/DataAccess/rq23/#WritingSimpleQueries
HTML anchor is misplaced
"It is also possible to write integers and floating point doubles directly."
Really?? That is, without using a typed literal? Yuck, I suggest this
is a bad idea: RDF recidivism!! In any case, why FP doubles, for
goodness' sake?

"Variables are indicated by '?'; the '?' does not form part of the 
variables name."
This belongs in an earlier section.  Also, why does the '?' not form
part of the name? In fact, what does this idea of the 'name' of a
variable mean? Surely the variable *is* the name, right? Why not just
say, variables start with the character '?' ?

Delicate point in the example. The title here is an RDF plain 
literal, right? So does the result binding bind the variable to the 
literal, or to the character string? (Suppose the literal had a lang 
tag?)

intepretted/interpreted
therefor/therefore
Text flows oddly: prefixes, data, typed literals, prefixes again. 
Reorder the last paragraph?

http://www.w3.org/2001/sw/DataAccess/rq23/#TriplePatterns
HTML anchor is misplaced (They seem to all be like this, so I will 
not mention it again.)

Definitions should have been stated earlier, or forward links
provided. Also they can be tidied up. Right now they are a bit of a
mess and I think they have mistakes in, but am not sure. See below.

Blank nodes BN are some set disjoint from U and L (see [Concepts]) and we
can use the same trick, where the query variables V are a set disjoint
from U union L union BN. We don't need to define it. Or, we can
define it to be the set of all strings starting with '?', and remark
that this is disjoint from U union L union BN. The first is more of
an abstract-syntax way of doing things.
A triple is anything in (U union BN) x U x (U union BN union L), and
an RDF graph is a set of triples [Concepts].
OK, following this, a triple pattern is anything in
(U union BN union V) x (U union V) x (U union BN union L union V)
and a pattern is a set of triple patterns.
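
The two products above can be sketched as membership checks. This is
only an illustration: the string encoding that keeps U, BN, L and V
disjoint (a leading '?', '_:' or '"') and the names kind, is_triple
and is_triple_pattern are assumptions, not anything from the draft.

```python
def kind(term):
    # Hypothetical encoding of the four disjoint sets:
    # V (query variables), BN (blank nodes), L (literals), U (URI references).
    if term.startswith("?"):
        return "V"
    if term.startswith("_:"):
        return "BN"
    if term.startswith('"'):
        return "L"
    return "U"

# Allowed sets per position: subject, predicate, object.
TRIPLE = ({"U", "BN"}, {"U"}, {"U", "BN", "L"})
TRIPLE_PATTERN = ({"U", "BN", "V"}, {"U", "V"}, {"U", "BN", "L", "V"})

def is_member(t, shape):
    # True if each position of t falls in the set allowed for that position.
    return len(t) == 3 and all(kind(x) in allowed for x, allowed in zip(t, shape))

def is_triple(t):
    return is_member(t, TRIPLE)

def is_triple_pattern(t):
    return is_member(t, TRIPLE_PATTERN)

print(is_triple(("ex:a", "ex:p", "_:b")))             # True
print(is_triple(("?x", "ex:p", "ex:b")))              # False: variable in a triple
print(is_triple_pattern(("?x", "?p", '"lit"')))       # True
print(is_triple_pattern(('"lit"', "ex:p", "ex:b")))   # False: literal subject
```

Under this encoding every triple is also a triple pattern, since each
pattern position only adds V to the set allowed for triples.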

I'd suggest not using 'ground' to mean lack of variables as RDF 
already uses 'ground' to mean lack of bnodes. How about 
'variable-free' or 'complete' if we want a single word. But do we 
even need this concept? In the case of query variables, unlike 
bnodes, do we even want to consider an instance of one query still 
being a query with some variables still not bound? We surely do not 
want to consider instances which bind one query variable to another, 
right? (??)

Before suggesting rewordings, let me get the definitions straight. 
What it says is:
"Triple Pattern T matches graph G with binding set B if subst(T, B) 
is a ground triple and, as a triple, is entailed by G."

Where presumably 'ground' here means in the sense defined here, i.e.
not containing a query variable (and NOT in the sense defined in
RDF, meaning not containing a bnode). So under this assumption, then,
this would be an example:
G= { ex:a ex:p ex:b }
T= {?x ex:p ex:b}
with the result having ?x bound to a blank node:
?x//_:1
since G simply entails {_:1 ex:p ex:b}
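
A small sketch of this example, assuming simple entailment and the
reading of 'ground' as variable-free. A single triple is simply
entailed by G when some triple of G is an instance of it, that is,
obtainable by consistently replacing its blank nodes. The function
names and the string encoding of terms are hypothetical.

```python
def is_bnode(term):
    # Assumption: blank nodes are written with a leading '_:'.
    return term.startswith("_:")

def entails_triple(graph, triple):
    # Simple entailment of one triple: look for a triple of the graph that
    # is an instance of it under a consistent mapping of its blank nodes.
    for candidate in graph:
        mapping = {}
        for want, have in zip(triple, candidate):
            if is_bnode(want):
                if mapping.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            return True
    return False

def subst(triple, binding):
    # Replace bound variables in a triple pattern by their values.
    return tuple(binding.get(t, t) for t in triple)

G = {("ex:a", "ex:p", "ex:b")}
T = ("?x", "ex:p", "ex:b")

print(entails_triple(G, subst(T, {"?x": "_:1"})))   # True: the bnode binding matches
print(entails_triple(G, subst(T, {"?x": "ex:a"})))  # True
print(entails_triple(G, subst(T, {"?x": "ex:c"})))  # False
```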

BTW, if 'ground' here means as in RDF, i.e. having no bnodes, then
the first example in the document is wrong, since it binds variables
to bnodes and hence produces non-RDF-ground instances.

What exactly does 'entails' mean? The above example assumes simple 
entailment (the weakest interpretation.)  If entails means 
RDF-entails, then this would be an example:
G = { } (the empty graph)
T= {?x rdf:type rdf:Property}
with the result having ?x bound to rdf:type, since
rdf:type rdf:type rdf:Property
is RDF-valid all by itself, i.e. is RDF-entailed by anything.

Sorry I was out of the loop long enough to not know the answer, but 
is this really what is intended here?

----

The Graph pattern match definition in section 2.3 refers to the
triple definition, and so requires only that each triple of the
instance is entailed, rather than that the instance graph itself be
entailed. These are not the same thing. For example:

G = {ex:a ex:p ex:b
ex:b ex:p ex:c
ex:c ex:p ex:a}

T= {?x ex:p ?y
?y ex:p ?x}

then binding ?x and ?y to distinct blank nodes will give an instance

_:1 ex:p _:2
_:2 ex:p _:1

which is not itself entailed by G but every triple of which is 
entailed separately by G when considered in isolation. So is this a 
legal match? (suggestion: no. But it is according to the current 
definitions.)
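
The difference can be checked mechanically. The sketch below assumes
simple entailment and uses the interpolation lemma from RDF Semantics:
G simply entails E exactly when some mapping of E's blank nodes to
terms turns E into a subgraph of G. The brute-force search and the
name simply_entails are illustrative only.

```python
from itertools import product

def is_bnode(term):
    # Assumption: blank nodes are written with a leading '_:'.
    return term.startswith("_:")

def simply_entails(graph, instance):
    # Try every mapping from the instance's blank nodes to terms of the graph
    # and test whether the mapped instance is a subgraph of the graph.
    bnodes = sorted({t for triple in instance for t in triple if is_bnode(t)})
    terms = sorted({t for triple in graph for t in triple})
    for values in product(terms, repeat=len(bnodes)):
        m = dict(zip(bnodes, values))
        mapped = {tuple(m.get(t, t) for t in triple) for triple in instance}
        if mapped <= graph:
            return True
    return False

G = {
    ("ex:a", "ex:p", "ex:b"),
    ("ex:b", "ex:p", "ex:c"),
    ("ex:c", "ex:p", "ex:a"),
}
# Instance of T with ?x and ?y bound to distinct blank nodes.
instance = {("_:1", "ex:p", "_:2"), ("_:2", "ex:p", "_:1")}

per_triple = all(simply_entails(G, {t}) for t in instance)
whole_graph = simply_entails(G, instance)
print(per_triple)   # True: each triple is entailed in isolation
print(whole_graph)  # False: G has no pair of nodes linked in both directions
```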

------

I think I had better stop at this point and wait for some answers 
before proceeding.

Pat



-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 5 October 2004 19:50:12 GMT
