- From: Pat Hayes <phayes@ihmc.us>
- Date: Thu, 26 Jan 2006 16:50:59 -0600
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
<<After volunteering for this I noticed that Dan
had already responded to this message with an
[OK?], so this might now be redundant. But here
goes anyway.>>
Fred, greetings.
You make several points about blank nodes in
SPARQL queries, and we will respond to them in
sequence. Your first point:
>Blank nodes of the form _:a and [ ] do not add anything to the language.
>Everything that can be expressed with such blank nodes can be expressed
>with variables.
is correct. The language has a syntactic
redundancy. Some members of the working group
agree with your conclusion. We considered
prohibiting blank nodes in queries, but this
would impose an extra syntactic burden on someone
wishing to form query patterns by editing query
variables into RDF. We also considered not having
unselected variables and requiring what are now
unselected variables to be replaced by blank
nodes, but again this imposes a burden on users
while providing no extra utility. In neither case
did the conceptual simplification seem worth the
operational burden on users.
There is however a deeper reason for
distinguishing query blank nodes from query
variables, which addresses your next point:
>What is the difference semantically between
>_:a and ?a ?
Extending SPARQL to richer entailment modes can
make them semantically different. When simple
entailment is replaced by OWL entailment in the
SPARQL basic definitions, it is possible for an
existential to be OWL-entailed by a graph which
contains no token which would be a binder for a
query variable: OWL supports 'genuinely
existential' entailments. For one of many
possible examples, if the OWL graph asserts that :a is
in a restriction class of :p to :c with
cardinality one, this entails the assertion
:a :p _:x .
_:x rdf:type :c
but provides no term to bind the query variable ?x to in the query pattern
:a :p ?x .
?x rdf:type :c
so the query
SELECT ?y WHERE { ?y :p _:u . _:u rdf:type :c }
would succeed with ?y bound to :a, but the corresponding query
SELECT ?y WHERE { ?y :p ?u . ?u rdf:type :c }
might rationally be said to fail, in both cases
using OWL entailment. Admittedly, this case is
controversial. One could argue that even in the
second case, it would be sensible to require that
the query engine provide a blank node identifier
as an answer binding. But the working group felt
that it would be prudent to leave the option open
for future designers of OWL versions of SPARQL,
which motivates keeping the blank-node/variable
distinction in the syntax.
Your next point is best addressed by discussing blank node scopes.
> The only difference I can see is that _:a can not be
>placed in the SELECT list (and there does not appear to be any
>motivation for this). Thus if the user, in the course of writing a
>query, later decided he wants to receive the value of the blank node,
>he must rewrite the query with a variable in place of the blank node.
>The user might as well just write the query without blank nodes from
>the beginning.
There really is no such thing in SPARQL as the
'value' of a query blank node. Blank node
identifiers in queries are scoped to the query,
and indicate an existential assertion.
In the course of checking the simple entailment
relationship between the target graph and the
pattern instance such a blank node must be
'mapped' to some term in the target graph, to be
sure, but this mapping is distinct from the
variable-to-binding instance mapping: it does not
identify that term in any sense; rather, the
presence of the mapped term simply confirms the
truth of the existential claim made by the
presence of the blank node. This also gets to
your next point:
>In addition, the term "blank node" creates a false analogy with RDF.
>An RDF blank node is a node in a graph with no IRI. A SPARQL blank node
>is not a node at all, it is actually a variable that cannot be named in
>the SELECT list.
We disagree. It is exactly an RDF blank node, and
the analogy is not false. Do not think of a query
bnode as a 'blank variable': think instead of the
entire query basic graph pattern as an RDF graph
with some 'named holes' in it, the query
variables. The query answer is a vector of pieces
of RDF syntax which, when syntactically
substituted for the variables, produces (an
appropriate lexicalization of) an RDF graph
which is simply entailed by the target graph[*].
All of this is purely syntactic, but the
entailment relationship between this instance and
the target graph, that makes the answer a genuine
answer, is semantic. Blank nodes in the query
pattern are genuine RDF blank nodes in the
entailed instance, and the entailment
relationship holds between two RDF graphs.
Simple entailment is indeed so simple that it can
be defined in terms of a mapping from blank nodes
to RDF terms: A simply entails B just when B has
an RDF instance (gotten by mapping from blank
nodes to terms) which is a subgraph of A. So, to
check the required relationship between a target
graph A and a basic graph pattern C, we need an
instance mapping M on the variables in C and then
another N on the blank nodes in M(C) such that
N(M(C)) is a subgraph of A. In this simple case,
then, this is equivalent to asking for a single
mapping on variables and blank nodes which
produces an instance [N+M](C) which is a subgraph
of A, then ignoring part of it. But there is a
real conceptual distinction, which is reflected
in the definitions, between the two parts of this
composite mapping; and when simple entailment is
replaced by more advanced forms of entailment,
the distinction can become operationally
important.
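The two-stage check described above can be sketched in Python. This is
only an illustrative brute-force sketch under simple entailment, not a
description of how any real query engine works. Triples are modelled as
tuples of strings, the helper names (substitute, answers) are invented
for this example, and the target graph and the pattern are assumed not
to share blank node labels.

```python
from itertools import product

def is_var(term):
    return term.startswith("?")

def is_bnode(term):
    return term.startswith("_:")

def substitute(graph, mapping):
    """Apply a term mapping to every triple; unmapped terms stay as they are."""
    return {tuple(mapping.get(t, t) for t in triple) for triple in graph}

def answers(target, pattern):
    """Return every variable mapping M for which some blank node mapping N
    makes N(M(pattern)) a subgraph of target (hypothetical helper)."""
    terms = sorted({t for triple in target for t in triple})
    variables = sorted({t for tr in pattern for t in tr if is_var(t)})
    bnodes = sorted({t for tr in pattern for t in tr if is_bnode(t)})
    results = []
    for values in product(terms, repeat=len(variables)):
        m = dict(zip(variables, values))           # the instance mapping M
        instance = substitute(pattern, m)          # M(C)
        # N is only a witness for the existential claim; it is never reported.
        if any(substitute(instance, dict(zip(bnodes, w))) <= target
               for w in product(terms, repeat=len(bnodes))):
            results.append(m)
    return results

# Assumed toy target graph, using the names from the example earlier in this message.
target = {(":a", ":p", ":b"), (":b", "rdf:type", ":c")}
with_bnode = {("?y", ":p", "_:u"), ("_:u", "rdf:type", ":c")}
with_var = {("?y", ":p", "?u"), ("?u", "rdf:type", ":c")}

print(answers(target, with_bnode))
print(answers(target, with_var))
```

In this toy case both patterns match the same subgraph, but the answer
to the first contains a binding only for ?y, while the answer to the
second also reports ?u bound to :b. The N part of the composite mapping
is computed and then discarded, which is the "ignoring part of it"
mentioned above.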
Pat
[*] (In fact, it is simply entailed by a 'scoping
graph' which is graph-equivalent to the target
graph under a blank node substitution, but this
complication is just to allow blank nodes to be
scoped separately in the answer document.)
--
---------------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973 home
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32502 (850)291 0667 cell
phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
Received on Thursday, 26 January 2006 22:51:07 UTC