- From: Souripriya Das <souripriya.das@oracle.com>
- Date: Thu, 26 Jan 2006 19:55:29 -0500
- To: Pat Hayes <phayes@ihmc.us>
- CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>
- Message-ID: <43D96F81.6010306@oracle.com>
Pat,
This is an excellent response. I would like to point out two things, however.
Pat Hayes wrote:
>
> <<After volunteering for this I noticed that Dan had already responded
> to this message with an [OK?], so this might now be redundant. But
> here goes anyway.>>
>
> Fred, greetings.
>
> You make several points about blank nodes in SPARQL queries, and we
> will respond to them in sequence. Your first point:
>
>> Blank nodes of the form _:a and [ ] do not add anything to the language.
>> Everything that can be expressed with such blank nodes can be expressed
>> with variables.
>
>
> is correct. The language has a syntactic redundancy. Some members of
> the working group agree with your conclusion. We considered
> prohibiting blank nodes in queries, but this would impose an extra
> syntactic burden on someone wishing to form query patterns by editing
> query variables into RDF. We also considered not having unselected
> variables and requiring what are now unselected variables to be
> replaced by blank nodes, but again this imposes a burden on users
> while providing no extra utility. In neither case did the conceptual
> simplification seem worth the operational burden on users.
>
> There is however a deeper reason for distinguishing query blank nodes
> from query variables, which addresses your next point:
>
>> What is the difference semantically between
>> _:a and ?a ?
>
>
> Extending SPARQL to richer entailment modes can make them semantically
> different. When simple entailment is replaced by OWL entailment in the
> SPARQL basic definitions, it is possible for an existential to be
> OWL-entailed by a graph which contains no token which would be a
> binder for a query variable: OWL supports 'genuinely existential'
> entailments. For one of many possible examples, if the OWL asserts
> that :a is in a restriction class of :p to :c with cardinality one,
> this entails the assertion
>
> :a :p _:x .
> _:x rdf:type :c
>
> but provides no term to bind the query variable ?x to in the query
> pattern
>
> :a :p ?x .
> ?x rdf:type :c
>
> so the query
>
> SELECT ?y WHERE { ?y :p _:u . _:u rdf:type :c }
>
> would succeed with ?y bound to :a, but the corresponding query
>
> SELECT ?y WHERE { ?y :p ?u . ?u rdf:type :c }
>
> might rationally be said to fail; all when using OWL entailment.
> Admittedly, this case is controversial. One could argue that even in
> the second case, it would be sensible to require that the query engine
> provide a blank node identifier as an answer binding. But the working
> group felt that it would be prudent to leave the option open for
> future designers of OWL versions of SPARQL, which motivates keeping
> the blank-node/variable distinction in the syntax.
>
One could argue as follows: the entailed OWL graph (as shown above)
does include two triples that contain a blank node (represented via some
label, shown as _:x here). So, for the second query above, why shouldn't
one generate a solution that binds the query variable ?u to a
blank node (represented via some label, say _:x1)?
Are we 'failing' the second query in order to limit the values of the
variables in the solution to the scoping set of the original (i.e.,
non-entailed) graph?
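To make the question concrete, here is a minimal sketch in Python. It
assumes a toy representation that is not taken from any SPARQL
implementation: triples are tuples of strings, terms starting with "?"
are query variables, and terms starting with "_:" are blank nodes. The
match helper, the entailed graph (written out by hand, not computed by
an OWL reasoner) and the original_terms scoping set are all
hypothetical.

```python
def match(pattern, graph, binding=None):
    """Yield every binding of ?variables that maps all pattern triples into graph."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in graph:
        new = dict(binding)
        ok = True
        for p_term, g_term in zip(first, triple):
            if p_term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(p_term, g_term) != g_term:
                    ok = False
                    break
            elif p_term != g_term:
                ok = False
                break
        if ok:
            yield from match(rest, graph, new)


# Assumed scoping set of the original graph: it contains no blank node.
original_terms = {":a", ":p", ":c", "rdf:type"}

# Hand-written stand-in for the entailed triples; _:x is the entailed existential.
entailed = {(":a", ":p", "_:x"), ("_:x", "rdf:type", ":c")}

# The second query, with a variable where the first query has a blank node.
query = [("?y", ":p", "?u"), ("?u", "rdf:type", ":c")]

# Option 1: allow bindings to blank nodes of the entailed graph.
unrestricted = list(match(query, entailed))

# Option 2: only accept bindings drawn from the original scoping set.
restricted = [b for b in unrestricted
              if all(term in original_terms for term in b.values())]

print(unrestricted)
print(restricted)
```

In this sketch the unrestricted match finds one solution, with ?u bound
to the entailed blank node, while the restricted variant finds none.
That is exactly the choice I am asking about.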
> Your next point is best addressed by discussing blank node scopes.
>
>> The only difference I can see is that _:a can not be
>> placed in the SELECT list (and there does not appear to be any
>> motivation for this). Thus if the user, in the course of writing a
>> query, later decided he wants to receive the value of the blank node,
>> he must rewrite the query with a variable in place of the blank node.
>> The user might as well just write the query without blank nodes from
>> the beginning.
>
>
> There really is no such thing in SPARQL as the 'value' of a query
> blank node. Blank node identifiers in queries are scoped to the query,
> and indicate an existential assertion.
>
> In the course of checking the simple entailment relationship between
> the target graph and the pattern instance such a blank node must be
> 'mapped' to some term in the target graph, to be sure, but this
> mapping is distinct from the variable-to-binding instance mapping: it
> does not identify that term in any sense; rather, the presence of the
> mapped term simply confirms the truth of the existential claim made by
> the presence of the blank node. This also gets to your next point:
>
>> In addition, the term "blank node" creates a false analogy with RDF.
>> An RDF blank node is a node in a graph with no IRI. A SPARQL blank node
>> is not a node at all, it is actually a variable that cannot be named in
>> the SELECT list.
>
>
> We disagree. It is exactly an RDF blank node, and the analogy is not
> false. Do not think of a query bnode as a 'blank variable': think
> instead of the entire query basic graph pattern as an RDF graph with
> some 'named holes' in it, the query variables. The query answer is a
> vector of pieces of RDF syntax which, when syntactically substituted
> for the variables, produces (an appropriate lexicalization of) an RDF
> graph which is simply entailed by the target graph[*].
What if the pattern contains a blank node in the predicate position?
Then the entailed instance is not a valid RDF graph under the current
RDF restrictions, which say that predicates cannot be blank nodes. If we
are allowing this in SPARQL, maybe we should state so explicitly.
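As a small illustration of the concern, the following Python sketch
substitutes a binding into a pattern and then checks the resulting
instance for blank-node predicates. The tuple representation, the
instantiate and blank_predicates helpers, and the example terms are
assumptions made for this sketch only.

```python
def instantiate(pattern, binding):
    """Replace each ?variable in the pattern by its bound term."""
    return [tuple(binding.get(term, term) for term in triple) for triple in pattern]


def blank_predicates(triples):
    """Return the triples whose predicate is a blank node (labels start with '_:')."""
    return [t for t in triples if t[1].startswith("_:")]


# Hypothetical pattern with a blank node in the predicate position.
pattern = [(":a", "_:p", "?o")]
instance = instantiate(pattern, {"?o": ":b"})

offending = blank_predicates(instance)
print(offending)
```

The instance produced here contains the triple (":a", "_:p", ":b"),
which is flagged because its predicate is a blank node.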
> All of this is purely syntactic, but the entailment relationship
> between this instance and the target graph, that makes the answer a
> genuine answer, is semantic. Blank nodes in the query pattern are
> genuine RDF blank nodes in the entailed instance, and the entailment
> relationship holds between two RDF graphs.
>
> Simple entailment is indeed so simple that it can be defined in terms
> of a mapping from blank nodes to RDF terms: A simply entails B just
> when B has an RDF instance (gotten by mapping from blank nodes to
> terms) which is a subgraph of A. So, to check the required
> relationship between a target graph A and a basic graph pattern C, we
> need an instance mapping M on the variables in C and then another N on
> the blank nodes in M(C) such that N(M(C)) is a subgraph of A. In this
> simple case, then, this is equivalent to asking for a single mapping
> on variables and blank nodes which produces an instance [N+M](C) which
> is a subgraph of A, then ignoring part of it. But there is a real
> conceptual distinction, which is reflected in the definitions, between
> the two parts of this composite mapping; and when simple entailment is
> replaced by more advanced forms of entailment, the distinction can
> become operationally important.
>
> Pat
>
> [*] (In fact, it is simply entailed by a 'scoping graph' which is
> graph-equivalent to the target graph under a blank node substitution,
> but this complication is just to allow blank nodes to be scoped
> separately in the answer document.)
>
> Pat
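The two-stage mapping described above for simple entailment (an
instance mapping M on the variables, then a mapping N on the blank
nodes of M(C), such that N(M(C)) is a subgraph of A) can be sketched in
Python as follows. This is a brute-force illustration under stated
assumptions, not how a query engine would do it: triples are tuples of
strings, "?" marks variables, "_:" marks blank nodes, the target graph
is ground, and the helper names (substitute, mappings, answers) are
hypothetical.

```python
from itertools import product


def substitute(triples, mapping):
    """Apply a term mapping to every triple and return the resulting set."""
    return {tuple(mapping.get(t, t) for t in triple) for triple in triples}


def mappings(symbols, terms):
    """Enumerate every total mapping from symbols to terms."""
    symbols = sorted(symbols)
    for values in product(sorted(terms), repeat=len(symbols)):
        yield dict(zip(symbols, values))


def answers(target, pattern):
    """Return each variable mapping M for which some N makes N(M(C)) a subgraph of A."""
    terms = {t for triple in target for t in triple}
    variables = {t for triple in pattern for t in triple if t.startswith("?")}
    results = []
    for m in mappings(variables, terms):            # M: variables -> terms
        instance = substitute(pattern, m)           # M(C)
        bnodes = {t for triple in instance for t in triple if t.startswith("_:")}
        # N: blank nodes of M(C) -> terms. Only its existence matters;
        # the witness term is never reported in the answer.
        if any(substitute(instance, n) <= target for n in mappings(bnodes, terms)):
            results.append(m)
    return results


target = {(":a", ":p", ":b"), (":b", "rdf:type", ":c")}

with_bnode = answers(target, [("?y", ":p", "_:u"), ("_:u", "rdf:type", ":c")])
with_variable = answers(target, [("?y", ":p", "?u"), ("?u", "rdf:type", ":c")])

print(with_bnode)
print(with_variable)
```

Under simple entailment both patterns succeed on this target. The only
difference is that the blank-node version reports a binding for ?y
alone, whereas the variable version also reports the term matched
by ?u.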
Received on Friday, 27 January 2006 00:57:07 UTC