- From: Souripriya Das <souripriya.das@oracle.com>
- Date: Thu, 26 Jan 2006 19:55:29 -0500
- To: Pat Hayes <phayes@ihmc.us>
- CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>
- Message-ID: <43D96F81.6010306@oracle.com>
Pat,

This is an excellent response. I would like to point out two things, however.

Pat Hayes wrote:
>
> <<After volunteering for this I noticed that Dan had already responded
> to this message with an [OK?], so this might now be redundant. But
> here goes anyway.>>
>
> Fred, greetings.
>
> You make several points about blank nodes in SPARQL queries, and we
> will respond to them in sequence. Your first point:
>
>> Blank nodes of the form _:a and [ ] do not add anything to the language.
>> Everything that can be expressed with such blank nodes can be expressed
>> with variables.
>
> is correct. The language has a syntactic redundancy. Some members of
> the working group agree with your conclusion. We considered
> prohibiting blank nodes in queries, but this would impose an extra
> syntactic burden on someone wishing to form query patterns by editing
> query variables into RDF. We also considered not having unselected
> variables and requiring what are now unselected variables to be
> replaced by blank nodes, but again this imposes a burden on users
> while providing no extra utility. In neither case did the conceptual
> simplification seem worth the operational burden on users.
>
> There is however a deeper reason for distinguishing query blank nodes
> from query variables, which addresses your next point:
>
>> What is the difference semantically between
>> _:a and ?a ?
>
> Extending SPARQL to richer entailment modes can make them semantically
> different. When simple entailment is replaced by OWL entailment in the
> SPARQL basic definitions, it is possible for an existential to be
> OWL-entailed by a graph which contains no token which would be a
> binder for a query variable: OWL supports 'genuinely existential'
> entailments. For one of many possible examples, if the OWL asserts
> that :a is in a restriction class of :p to :c with cardinality one,
> this entails the assertion
>
>   :a :p _:x .
>   _:x rdf:type :c
>
> but provides no term to bind the query variable ?x to in the query
> pattern
>
>   :a :p ?x .
>   ?x rdf:type :c
>
> so the query
>
>   SELECT ?y WHERE { ?y :p _:u . _:u rdf:type :c }
>
> would succeed with ?y bound to :a, but the corresponding query
>
>   SELECT ?y WHERE { ?y :p ?u . ?u rdf:type :c }
>
> might rationally be said to fail; all when using OWL entailment.
> Admittedly, this case is controversial. One could argue that even in
> the second case, it would be sensible to require that the query engine
> provide a blank node identifier as an answer binding. But the working
> group felt that it would be prudent to leave the option open for
> future designers of OWL versions of SPARQL, which motivates keeping
> the blank-node/variable distinction in the syntax.
>

One could argue as follows: the entailed OWL graph (as shown above) does
include two triples that contain a blank node (represented via some
label, shown as _:x here). So, for the second query above, why shouldn't
one generate a solution that binds the query variable ?u to a blank node
(represented via some label, say _:x1)? Are we 'failing' the second
query in order to limit the values of the variables in the solution to
the scoping set of the original (i.e., non-entailed) graph?

> Your next point is best addressed by discussing blank node scopes.
>
>> The only difference I can see is that _:a can not be
>> placed in the SELECT list (and there does not appear to be any
>> motivation for this). Thus if the user, in the course of writing a
>> query, later decided he wants to receive the value of the blank node,
>> he must rewrite the query with a variable in place of the blank node.
>> The user might as well just write the query without blank nodes from
>> the beginning.
>
> There really is no such thing in SPARQL as the 'value' of a query
> blank node. Blank node identifiers in queries are scoped to the query,
> and indicate an existential assertion.
>
> In the course of checking the simple entailment relationship between
> the target graph and the pattern instance such a blank node must be
> 'mapped' to some term in the target graph, to be sure, but this
> mapping is distinct from the variable-to-binding instance mapping: it
> does not identify that term in any sense; rather, the presence of the
> mapped term simply confirms the truth of the existential claim made by
> the presence of the blank node. This also gets to your next point:
>
>> In addition, the term "blank node" creates a false analogy with RDF.
>> An RDF blank node is a node in a graph with no IRI. A SPARQL blank node
>> is not a node at all, it is actually a variable that cannot be named in
>> the SELECT list.
>
> We disagree. It is exactly an RDF blank node, and the analogy is not
> false. Do not think of a query bnode as a 'blank variable': think
> instead of the entire query basic graph pattern as an RDF graph with
> some 'named holes' in it, the query variables. The query answer is a
> vector of pieces of RDF syntax which, when syntactically substituted
> for the variables, produces (an appropriate lexicalization of) an RDF
> graph which is simply entailed by the target graph[*].

What if the pattern contains a blank node in the predicate position?
Then the entailed instance is not a valid RDF graph under the current
RDF restrictions, which say that predicates cannot be blank nodes. If we
are allowing this in SPARQL, maybe we should state it explicitly.

> All of this is purely syntactic, but the entailment relationship
> between this instance and the target graph, that makes the answer a
> genuine answer, is semantic. Blank nodes in the query pattern are
> genuine RDF blank nodes in the entailed instance, and the entailment
> relationship holds between two RDF graphs.
>
> Simple entailment is indeed so simple that it can be defined in terms
> of a mapping from blank nodes to RDF terms: A simply entails B just
> when B has an RDF instance (gotten by mapping from blank nodes to
> terms) which is a subgraph of A. So, to check the required
> relationship between a target graph A and a basic graph pattern C, we
> need an instance mapping M on the variables in C and then another N on
> the blank nodes in M(C) such that N(M(C)) is a subgraph of A. In this
> simple case, then, this is equivalent to asking for a single mapping
> on variables and blank nodes which produces an instance [N+M](C) which
> is a subgraph of A, then ignoring part of it. But there is a real
> conceptual distinction, which is reflected in the definitions, between
> the two parts of this composite mapping; and when simple entailment is
> replaced by more advanced forms of entailment, the distinction can
> become operationally important.
>
> Pat
>
> [*] (In fact, it is simply entailed by a 'scoping graph' which is
> graph-equivalent to the target graph under a blank node substitution,
> but this complication is just to allow blank nodes to be scoped
> separately in the answer document.)
>
> Pat
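
To make the two-mapping description above concrete, here is a minimal Python sketch of the check under simple entailment only: find a variable mapping M, then test whether some blank-node mapping N makes N(M(C)) a subgraph of A. It is an illustration, not anything from the SPARQL drafts. The function name solutions, the string encoding of terms ("?" for variables, "_:" for blank nodes), and the toy graph are all assumptions, and the target graph is assumed to contain no blank nodes, so the scoping-graph complication does not arise.

```python
from itertools import product


def is_var(term):
    return term.startswith("?")


def is_bnode(term):
    return term.startswith("_:")


def substitute(mapping, triples):
    """Apply a term mapping to every position of every triple."""
    return [tuple(mapping.get(t, t) for t in triple) for triple in triples]


def solutions(pattern, graph):
    """Yield each variable mapping M for which some blank-node mapping N
    makes N(M(pattern)) a subgraph of graph (brute force, toy sizes only)."""
    terms = sorted({t for triple in graph for t in triple})
    variables = sorted({t for tr in pattern for t in tr if is_var(t)})
    for values in product(terms, repeat=len(variables)):
        m = dict(zip(variables, values))
        m_c = substitute(m, pattern)  # the instance M(C)
        bnodes = sorted({t for tr in m_c for t in tr if is_bnode(t)})
        # N only has to exist; it is never reported in the answer.
        for bvalues in product(terms, repeat=len(bnodes)):
            n = dict(zip(bnodes, bvalues))
            if all(tr in graph for tr in substitute(n, m_c)):
                yield m
                break


# Hypothetical ground target graph A.
GRAPH = {(":a", ":p", ":b"), (":b", "rdf:type", ":c")}

# The two query patterns from the message, as lists of triples.
BNODE_PATTERN = [("?y", ":p", "_:u"), ("_:u", "rdf:type", ":c")]
VAR_PATTERN = [("?y", ":p", "?u"), ("?u", "rdf:type", ":c")]

print(list(solutions(BNODE_PATTERN, GRAPH)))  # [{'?y': ':a'}]
print(list(solutions(VAR_PATTERN, GRAPH)))    # [{'?u': ':b', '?y': ':a'}]
```

Under simple entailment both patterns match the same triples; the only visible difference is that the term mapped to _:u is not part of the answer. The sketch does not model OWL entailment, where, as argued above, an existential may be entailed with no term available to bind ?u.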
Received on Friday, 27 January 2006 00:57:07 UTC