
Re: Draft response to: Re: major technical: blank nodes

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 27 Jan 2006 12:39:08 -0600
Message-Id: <p0623091cc0001561dea3@[]>
To: Souripriya Das <souripriya.das@oracle.com>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>, "Seaborne, Andy" <andy.seaborne@hp.com>

>This is an excellent response

It's only a draft at present :-)

>. I would like to point out two things, however.
>Pat Hayes wrote:
>>>What is the difference semantically between
>>>_:a and ?a ?
>>Extending SPARQL to richer entailment modes can make them 
>>semantically different. When simple entailment is replaced by OWL 
>>entailment in the SPARQL basic definitions, it is possible for an 
>>existential to be OWL-entailed by a graph which contains no token 
>>which would be a binder for a query variable: OWL supports 
>>'genuinely existential' entailments. For one of many possible 
>>examples, if the OWL asserts that :a is in a restriction class of 
>>:p to :c with cardinality one, this entails the assertion
>>:a :p _:x .
>>_:x rdf:type :c
>>but provides no term to bind the query variable ?x to in the query pattern
>>:a :p ?x .
>>?x rdf:type :c
>>so the query
>>SELECT ?y WHERE { ?y :p _:u , _:u rdf:type :c }
>>would succeed with ?y bound to :a, but the corresponding query
>>SELECT ?y WHERE { ?y :p ?u , ?u rdf:type :c }
>>might rationally be said to fail; all when using OWL entailment. 
>>Admittedly, this case is controversial. One could argue that even 
>>in the second case, it would be sensible to require that the query 
>>engine provide a blank node identifier as an answer binding. But 
>>the working group felt that it would be prudent to leave the option 
>>open for future designers of OWL versions of SPARQL, which 
>>motivates keeping the blank-node/variable distinction in the syntax.
>One could argue as follows:  The entailed OWL graph (as shown above) 
>does include two triples that contain a blank-node (represented via 
>some label, shown as _:x here). So, for the second query above, why 
>shouldn't one generate a solution that substitutes the query 
>variable ?u to a blank-node (represented via some label, say _:x1)?

Well, indeed. I tend to agree with this - in fact, I believe that we 
should adopt as a general principle that if ASK succeeds with a blank 
node, then the corresponding SELECT ?x with the same pattern but with 
a variable should also succeed, possibly binding ?x to a blank node 
ID. But FUB, who have the local expertise for OWL-DL querying, 
disagree: and certainly, it would be rather daunting to require OWL 
answering engines to create an inferred graph with ALL the possible 
existentials in it. I think it's more a matter of preferred style than 
anything else: if one thinks of query variables as acting similarly 
to SQL, then it's natural to think of a variable binding to an ID 
actually in a dataset.
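
A minimal Python sketch may help make the ASK/SELECT principle concrete. 
It covers simple entailment only (no OWL reasoning), so it shows the case 
where the blank node is actually present in the target graph, not the 
'genuinely existential' case. Everything here is a hypothetical 
illustration: triples as tuples of strings, the "?" and "_:" prefixes as 
the only way placeholders are recognized, and the names match, select and 
ask.

```python
def is_var(term):
    return term.startswith("?")

def is_bnode(term):
    return term.startswith("_:")

def match(pattern, graph, mapping=None):
    """Yield every mapping of pattern placeholders (variables and query
    blank nodes) to graph terms that makes the pattern a subgraph."""
    mapping = mapping or {}
    if not pattern:
        yield dict(mapping)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in graph:
        candidate = dict(mapping)
        ok = True
        for p_term, g_term in zip(first, triple):
            if is_var(p_term) or is_bnode(p_term):
                # A placeholder must map to the same term everywhere.
                if candidate.setdefault(p_term, g_term) != g_term:
                    ok = False
                    break
            elif p_term != g_term:
                ok = False
                break
        if ok:
            yield from match(rest, graph, candidate)

def select(variables, pattern, graph):
    """Report only the named variables; blank-node mappings are dropped."""
    solutions = []
    for m in match(pattern, graph):
        row = {v: m[v] for v in variables}
        if row not in solutions:
            solutions.append(row)
    return solutions

def ask(pattern, graph):
    return next(match(pattern, graph), None) is not None

# Assumed target graph: the blank node _:x is explicitly present.
graph = [
    (":a", ":p", "_:x"),
    ("_:x", "rdf:type", ":c"),
    (":b", ":p", ":d"),
]
with_bnode = [("?y", ":p", "_:u"), ("_:u", "rdf:type", ":c")]
with_var = [("?y", ":p", "?u"), ("?u", "rdf:type", ":c")]

print(ask(with_bnode, graph))
print(select(["?y"], with_bnode, graph))
print(select(["?y", "?u"], with_var, graph))
```

In this sketch both patterns are matched in exactly the same way; the 
variable version differs only in that ?u appears in the reported row, 
bound to the blank node identifier found in the graph.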

>Are we 'failing' the second query to limit the values for the 
>variables in the solution to the scoping set of original (i.e., 
>non-entailed) graph?

We would be if we did, but that is why the fully general definition 
doesn't have that restriction in it. It only restricts to a 'scoping 
set B' which isn't further specified in general, only for basic 
SPARQL. This allows B to have some extra stock of bnodeIDs when 
required for things like the OWL case.
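
One way to picture the role of the scoping set is as a filter on 
candidate solutions: a solution survives only if every term it binds is 
a member of B. The Python sketch below is an assumption-laden 
illustration, not the definition text; the function name 
restrict_to_scoping_set, the sample terms, and the extra identifier 
_:x1 are all hypothetical.

```python
def restrict_to_scoping_set(solutions, scoping_set):
    """Keep only solutions whose bound terms all belong to the scoping set."""
    return [
        s for s in solutions
        if all(term in scoping_set for term in s.values())
    ]

# Hypothetical candidate solutions for ?u, one of which uses a blank node
# identifier that does not occur in the original (non-entailed) graph.
candidates = [{"?u": ":c1"}, {"?u": "_:x1"}]

# Basic case (assumed): B is just the terms of the original graph.
basic_b = {":a", ":p", ":c", ":c1", "rdf:type"}

# Richer entailment (assumed): B carries an extra stock of bnode IDs.
extended_b = basic_b | {"_:x1"}

print(restrict_to_scoping_set(candidates, basic_b))
print(restrict_to_scoping_set(candidates, extended_b))
```

With the basic set the blank-node binding is filtered out; with the 
extended set it is kept, which is the freedom the general definition 
leaves open.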

>>Your next point is best addressed by discussing blank node scopes.
>>>  The only difference I can see is that _:a can not be
>>>placed in the SELECT list (and there does not appear to be any
>>>motivation for this).  Thus if the user, in the course of writing a
>>>query, later decided he wants to receive the value of the blank node,
>>>he must rewrite the query with a variable in place of the blank node.
>>>The user might as well just write the query without blank nodes from
>>>the beginning.
>>There really is no such thing in SPARQL as the 'value' of a query 
>>blank node. Blank node identifiers in queries are scoped to the 
>>query, and indicate an existential assertion.
>>In the course of checking the simple entailment relationship 
>>between the target graph and the pattern instance such a blank node 
>>must be 'mapped' to some term in the target graph, to be sure, but 
>>this mapping is distinct from the variable-to-binding instance 
>>mapping: it does not identify that term in any sense; rather, the 
>>presence of the mapped term simply confirms the truth of the 
>>existential claim made by the presence of the blank node. This also 
>>gets to your next point:
>>>In addition, the term "blank node" creates a false analogy with RDF.
>>>An RDF blank node is a node in a graph with no IRI.  A SPARQL blank node
>>>is not a node at all, it is actually a variable that cannot be named in
>>>the SELECT list.
>>We disagree. It is exactly an RDF blank node, and the analogy is 
>>not false. Do not think of a query bnode as a 'blank variable': 
>>think instead of the entire query basic graph pattern as an RDF 
>>graph with some 'named holes' in it, the query variables. The query 
>>answer is a vector of pieces of RDF syntax which, when 
>>syntactically substituted for the variables, produces (an 
>>appropriate lexicalization of) an RDF graph which is simply 
>>entailed by the target graph[*].
>What if the pattern contains a blank-node in the predicate position? 
>Then the entailed instance is not a valid RDF graph according to 
>current restrictions in RDF which says predicates cannot be 
>blank-nodes. If we are allowing this in SPARQL, maybe we should 
>state this explicitly.

Yes, I think we should. Such a query cannot succeed at present; the 
freedom is there only to allow for future RDF loosenings, like 
allowing a literal in the subject position.

This was only added recently (at my suggestion), and I see now that 
it could be confusing; and unlike the literal-subject case, there 
isn't any prior W3C discussion we can refer to. Hmm, maybe we should 
quietly remove that bit of extra syntactic freedom, after all.
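
The restriction under discussion can be sketched as a well-formedness 
check on an instantiated pattern, treated as a set of triples. This is 
only an illustration under stated assumptions: terms are plain strings, 
a leading '"' marks a literal, a leading "_:" marks a blank node, and 
the function name rdf_violations is hypothetical.

```python
def is_literal(term):
    return term.startswith('"')

def is_bnode(term):
    return term.startswith("_:")

def rdf_violations(triples):
    """List the triples that break the current RDF position restrictions:
    no literal subjects, and no blank nodes or literals as predicates."""
    problems = []
    for s, p, o in triples:
        if is_literal(s):
            problems.append(((s, p, o), "literal subject"))
        if is_bnode(p):
            problems.append(((s, p, o), "blank node predicate"))
        elif is_literal(p):
            problems.append(((s, p, o), "literal predicate"))
    return problems

# A pattern instance with a blank node in the predicate position.
print(rdf_violations([(":a", "_:q", ":b")]))
# A blank node in the object position is ordinary RDF.
print(rdf_violations([(":a", ":p", "_:x")]))
```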

IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 27 January 2006 18:39:22 UTC
