Re: subgraph/entailment from Bijan Parsia on 2005-09-07 (public-rdf-dawg@w3.org from July to September 2005)

From: Bijan Parsia <bparsia@isr.umd.edu>
Date: Wed, 7 Sep 2005 10:08:21 -0400
To: Dan Connolly <connolly@w3.org>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <0a6d6da40dd26bf9073abf79d0d35e23@isr.umd.edu>

On Sep 7, 2005, at 9:43 AM, Dan Connolly wrote:

> On Wed, 2005-09-07 at 09:27 -0400, Bijan Parsia wrote:
>> On Sep 7, 2005, at 9:19 AM, Dan Connolly wrote:
[snip]
>> This is a contradiction (I think we have a terminology conflict). The
>> contribution of more expressive logics cannot be asserted triples. By
>> "asserted" I meant, "asserted in the original document/dataset" not
>> "asserted 'by inference'".
>
> Er... "original" is a distinction that's not visible to SPARQL QL.

I understand the intent here, but that's not well specified (to me at
least) in the current document, including how this is supposed to
extend to more expressive languages (obviously I get the basic notion,
but I think we'll save a lot of grief if we're clearer on that point).

> In SPARQL QL, you start with some RDF dataset (i.e. a bunch of
> graphs). How you got there is your business, but we expect
> one of the popular ways is by grabbing data off the web and
> computing, say, the RDFS closure of it.

Do you find Enrico's point that equivalent graphs can return different
sets of hits interesting or worth considering? Does the WG have any
advice for dealing with RDF-entailment-based closure (which is always
infinite, so that obvious, useful, and widely used queries like
?p rdf:type rdf:Property will return infinitely many results)? Actually,
I think if SPARQL as written can't properly or practically handle RDF
entailment then it is broken. Similarly for RDFS entailment. It should
AT LEAST get those right, or it should make clear that it can only
(practically speaking) handle "base graphs" until it is clear how to
handle these other situations.
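
To make the "equivalent graphs, different hits" worry concrete, here is a
minimal sketch in Python. It is an illustration of the general phenomenon,
not necessarily Enrico's own example. The matcher, the ex: names, and the
blank node labels are all hypothetical. The two graphs below entail each
other under simple entailment (map _:c to _:b in one direction; the other
direction is a subgraph), yet plain subgraph matching counts the hits
differently.

```python
# Hypothetical toy matcher: triples are tuples of strings; terms starting
# with "?" are query variables, terms starting with "_:" are blank nodes.

def match(pattern, graph):
    """Return one variable binding per triple in graph that fits pattern."""
    hits = []
    for triple in graph:
        binding = {}
        for p_term, g_term in zip(pattern, triple):
            if p_term.startswith("?"):
                # A variable must bind consistently within one triple.
                if binding.setdefault(p_term, g_term) != g_term:
                    break
            elif p_term != g_term:
                break
        else:
            hits.append(binding)
    return hits

# Two graphs that are equivalent under simple entailment.
lean = {("ex:a", "ex:p", "_:b")}
redundant = {("ex:a", "ex:p", "_:b"), ("ex:a", "ex:p", "_:c")}

lean_hits = match(("ex:a", "ex:p", "?y"), lean)
redundant_hits = match(("ex:a", "ex:p", "?y"), redundant)
print(len(lean_hits), len(redundant_hits))
```

The same query yields one binding against the first graph and two against
the second, even though neither graph says more than the other.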

I guess you could let chips fall and say, "hey, rdf and rdfs entailment 
per se are hard to deal with...suck it up".

Sorry to have not banged on this bit sooner, but I hadn't really 
noticed it before.

> The specification of SPARQL QL and our test harness and such
> start there, and tell you, given a query, what the results are.

Which tests deal with graphs under RDF semantics?
[snip]
>> If the way around this is to do some sort of closure and then "dump"
>> the data (roughly) and reload it..well....now we're requiring
>> extrasilly gyrations to kill clarity and avoid some important details.
>
> You don't have to dump it anywhere; you don't even have to
> pre-compute the whole thing; you can do the query backward-chaining
> style, if you like.

I meant conceptually. In any case, I think one throwaway line in the
first not clearly normative paragraph of the document doesn't adequately
explain this design. So I'm back to wanting a clearly specified design.
One criterion I have on that design is that it doesn't preclude
extension to OWL. I've argued that Pat's approach *can* so extend,
though there may be some hairy bits and it is v. non-standard and
confusing (thus really needs some serious attention in the document). A
second criterion is that it's clear what interoperable (same answers)
implementations do for graphs closed under RDF entailment (and
preferably, RDFS entailment; getting one should make the other clear).
Another criterion is that it be practical for realistic use cases. The
second and third have some tension. Some notion of minimal or
non-redundant or non-silly results would be helpful.

My test case is:

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?p where {?p rdf:type rdf:Property}

against an empty dataset. (Against an arbitrary dataset, I would expect 
to get all the properties mentioned in that dataset ++ the ones 
stemming from the axiomatic triples; I might prefer only the inferred 
ones without the axiomatic triples).

Under RDF semantics, the answer should always include rdf:type (from
the axiomatic triple rdf:type rdf:type rdf:Property). The answer set
should also be infinite. This doesn't seem to be the most useful
situation although it's the most "naively correct" under the current
approach.
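
A rough sketch of why the answer set never ends, in Python. It assumes
only the RDF axiomatic triples that type a term as rdf:Property (the fixed
vocabulary plus the container membership properties rdf:_1, rdf:_2, ...).
The function name and the list name are hypothetical. An implementation
could only ever return a finite prefix of this stream.

```python
from itertools import count, islice

# Fixed RDF vocabulary typed as rdf:Property by the axiomatic triples.
FIXED_PROPERTIES = [
    "rdf:type", "rdf:subject", "rdf:predicate", "rdf:object",
    "rdf:first", "rdf:rest", "rdf:value",
]

def property_answers():
    """Lazily yield bindings for ?p against an empty dataset."""
    for name in FIXED_PROPERTIES:
        yield name
    # Container membership properties: one axiomatic triple per n >= 1,
    # so this loop never terminates.
    for n in count(1):
        yield f"rdf:_{n}"

# Only a finite slice can ever be materialized.
first_ten = list(islice(property_answers(), 10))
print(first_ten)
```

Any cutoff (ten here) is arbitrary, which is the interoperability problem:
two implementations that truncate differently give different answers to
the same query.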

Cheers,
Bijan.
Received on Wednesday, 7 September 2005 14:08:27 UTC