Re: [TF-ENT] Querying datasets with default plus named graphs

On 7 Oct 2009, at 13:46, Seaborne, Andy wrote:

>
>
>> -----Original Message-----
>> From: public-rdf-dawg-request@w3.org [mailto:public-rdf-dawg-request@w3.org]
>> On Behalf Of Birte Glimm
>> Sent: 07 October 2009 12:53
>> To: SPARQL Working Group
>> Subject: [TF-ENT] Querying datasets with default plus named graphs
>>
>> Hi all,
>> I skimmed the minutes of yesterday's telecon and I updated the
>> entailment doc to include the newly generated issues. I would like to
>> start collecting opinions for the issue of querying data sets that
>> have more than the default graph and whether inferences work on all
>> graphs in the datasets or are local to their particular graph. Here is
>> an example that Steve originally created:
>> We have a data set with the two named graphs http://example.org/a.rdf
>> and http://example.org/b.rdf (empty default graph).
>> http://example.org/a.rdf:
>>  :p rdfs:domain :A .
>> http://example.org/b.rdf:
>>  :x :p :y .
>
> Is anyone advocating this should be covered?
>
>>
>> The question is, what bindings ?g should take if we query:
>>  SELECT ?g WHERE { GRAPH ?g { :x a ?type .  } }
>>
>> If we assume that entailments always work over all graphs in the DS,
>> then ?type can be mapped to :A,

I can't see how that would be an entailment, as these graphs are  
separate.
FWIW, as announced in the last TC, I added an issue (ISSUE-43) to our  
tracker on that.

>> but this entailment depends on both
>> graphs. Taking either one out means the entailment no longer holds, so
>> ?g must be both a.rdf and b.rdf and possibly the default graph since
>> there is no FROM clause in the query and we in fact query the default
>> graph.
>>
>> Just to check that I get this right: If we take the same data set and
>> issue the query
>>  SELECT ?o WHERE { :x :p ?o . }
>> I would get no answer under simple entailment because the default
>> graph is empty.
>
> Not quite - there is no dataset description so it will be whatever  
> the processor provides as the dataset (i.e. it's set externally -  
> common case).

I think Birte asks what should be the semantics, with that special  
data set given.

>> We have a data set with the two named graphs http://example.org/a.rdf
>> and http://example.org/b.rdf (empty default graph).
>> http://example.org/a.rdf:
>>  :p rdfs:domain :A .
>> http://example.org/b.rdf:
>>  :x :p :y .



>
>> If I ask
>>  SELECT ?o FROM NAMED <http://example.org/b.rdf> WHERE { :x :p ?o . }
>> I would get { (o, y) }, right?
>
> There is a dataset description, it does not mention the default  
> graph, so it is empty. So { :x :p ?o . } is on the empty graph and  
> does not match.
>
> { GRAPH <http://example.org/b.rdf> {:x :p ?o . } }
>
> returns { (?o, y) }
>
>> If I ask
>>  SELECT ?o FROM <http://example.org/b.rdf> WHERE { :x :p ?o . }
>> I would get { (o, y) } again, but this time I implicitly created a
>> default graph that contains all triples from b.rdf, right?
>
> Yes - although I'd say 'explicit' because you used FROM.
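
A minimal Python sketch of the dataset construction discussed above, assuming a hypothetical in-memory store. The names STORE, build_dataset and match_object are illustrative only and not part of any SPARQL implementation. FROM is modelled as merging graphs into the default graph, FROM NAMED as adding named graphs only, and plain triple matching stands in for simple entailment.

```python
# Hypothetical store: graph IRI -> set of (subject, predicate, object) triples.
STORE = {
    "http://example.org/a.rdf": {(":p", "rdfs:domain", ":A")},
    "http://example.org/b.rdf": {(":x", ":p", ":y")},
}

B = "http://example.org/b.rdf"


def build_dataset(from_iris=(), from_named_iris=()):
    """Build (default_graph, named_graphs) from a query's dataset description."""
    default = set()
    for iri in from_iris:
        # No blank nodes in this example, so set union is enough for the merge.
        default |= STORE[iri]
    named = {iri: set(STORE[iri]) for iri in from_named_iris}
    return default, named


def match_object(graph, s, p):
    """Match the pattern { s p ?o } against a single graph, no inference."""
    return sorted(o for (s2, p2, o) in graph if s2 == s and p2 == p)


# SELECT ?o FROM NAMED <b.rdf> WHERE { :x :p ?o . }
default, named = build_dataset(from_named_iris=[B])
without_graph_keyword = match_object(default, ":x", ":p")

# SELECT ?o FROM NAMED <b.rdf> WHERE { GRAPH <b.rdf> { :x :p ?o . } }
with_graph_keyword = match_object(named[B], ":x", ":p")

# SELECT ?o FROM <b.rdf> WHERE { :x :p ?o . }
default_from, _ = build_dataset(from_iris=[B])
with_from_clause = match_object(default_from, ":x", ":p")

print(without_graph_keyword, with_graph_keyword, with_from_clause)
```

In this sketch the stored graphs are never modified: the default graph built for the FROM query exists only for that one query.
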
>
>> I guess
>> this default graph would be temporary, right? And if I query again
>> without the FROM clause, I would again get no results, right?
>>
>> Ok, assuming I understand that right, I would much prefer to keep
>> entailments local to the graph.
>
> +1

+1 to keep entailments local to the separate graphs in the DS
(<chairhatoff> although I personally consider it a drawback that you
can't refer to ontologies from named graphs)
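
To make the two readings concrete, here is a rough Python sketch of Steve's example. It is an illustration under stated assumptions, not a proposal for the spec: the helper names (domain_closure, types_of) are made up, and only a single pass of the rdfs:domain rule is applied rather than full RDFS entailment.

```python
A = "http://example.org/a.rdf"
B = "http://example.org/b.rdf"

# Steve's example: two named graphs, empty default graph.
NAMED_GRAPHS = {
    A: {(":p", "rdfs:domain", ":A")},
    B: {(":x", ":p", ":y")},
}


def domain_closure(graph):
    """Add (s rdf:type C) for each (s p o) whose p has rdfs:domain C in the same graph."""
    domains = [(s, o) for (s, p, o) in graph if p == "rdfs:domain"]
    inferred = {
        (s, "rdf:type", cls)
        for (s, p, o) in graph
        for (prop, cls) in domains
        if p == prop
    }
    return graph | inferred


def types_of(graph, subject):
    """Answers for { subject a ?type } over one graph under the domain rule."""
    closed = domain_closure(graph)
    return sorted(o for (s, p, o) in closed if s == subject and p == "rdf:type")


# Reading 1: entailment is local to each named graph.
local_answers = {iri: types_of(g, ":x") for iri, g in NAMED_GRAPHS.items()}

# Reading 2: the user explicitly merges both graphs (FROM <a.rdf> FROM <b.rdf>).
merged_answers = types_of(NAMED_GRAPHS[A] | NAMED_GRAPHS[B], ":x")

print(local_answers)
print(merged_answers)
```

With local entailment neither graph on its own yields a type for :x, so GRAPH ?g { :x a ?type } has no solutions; only the explicit merge produces :A.
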

>
> And I believe this follows from "12.6 Extending SPARQL Basic Graph  
> Matching" which does not mention datasets.
>
> ----
>
> Mixed entailment regimes in one query do happen already.  I don't  
> see any sensible way to specify entailment across graphs and have a  
> mix.

agreed. My question in one of the earlier mails was actually whether  
there was any... I didn't see that either, that would be only possible  
with extending the notion of dataset, and I didn't so far sense  
support for this idea (would be a separate feature not on our list)....

>
> This is not to say that matching a BGP under entailment can't take  
> into account information not in the graph (presumably, rules  
> entailment do this anyway - the rules are not in the graph).  We  
> don't necessarily need to make the T-Box visible, do we?

... well, wouldn't the T-Box be a graph like any other? e.g. say
<http://xmlns.com/foaf/spec/20071002.rdf>
or, if not, what would be the mechanism to refer to which ontologies/T-Boxes
are taken into account?
Could that be part of the service description? e.g. "I am a SPARQL
endpoint doing OWL entailment using the FOAF ontology".

> Then "GRAPH <b.rdf> { :x a ?type .  }" works if <b.rdf> is set up in  
> some way (not part of the spec) to use the vocabulary in <a.rdf>.   
> The fact the information used for matching <b.rdf> happens to also  
> be accessible via <a.rdf> is neither here nor there.
>

If we'd leave that as part of SD, then it could go into the  
extensibility of SD possibly.


Axel


>> I think this goes well with SPARQL 1.0
>> because it says in Sec 8.1
>> (http://www.w3.org/TR/rdf-sparql-query/#exampleDatasets) below Example
>> 1: In this example, the default graph contains the names of the
>> publishers of two named graphs. The triples in the named graphs are
>> not visible in the default graph in this example.
>>
>> Let me also argue from an OWL viewpoint (because I am an OWL person):
>> I would see the IRIs in a FROM (NAMED) clause as ontology IRIs. An
>> ontology contains everything it needs and might use imports to include
>> resources that it does not physically contain. I have to load those
>> imported resources anyway as part of the graph. As I understand it, an
>> implementor can now choose to have several ontologies loaded more or
>> less permanently as (named) graphs/ontologies (which means one can do
>> all preprocessing to them, check them for consistency, and possibly
>> classify them (build the sub-/superclass hierarchy), so that most
>> queries can be answered quickly). If I decide to have the pizza
>> ontology (often used for Protege tutorials) and SNOMED (a large medical
>> ontology) loaded as named graphs, then I do not want pizzas to have
>> any effect on my medical ontology and I do want entailments to be
>> local to the ontology. If users want to merge two ontologies on the
>> fly for querying, they can ask
>> SELECT ?x FROM <IRI_1> FROM <IRI_2> WHERE { some_BGP }
>> which would (according to Sec 8.2 of the SPARQL spec) result in the
>> query being evaluated over a default graph that contains the RDF merge
>> of the triples from IRI_1 and IRI_2.
>>
>> This would also allow for removing (named) graphs without having to do
>> something like belief revision to find out what inferences are no
>> longer valid after the delete or having to reload and redo all
>> inferences for the remaining graphs.
>>
>> What would that mean for Steve's example? It has an empty answer, but
>> we no longer have to assign a.rdf, b.rdf, and the default graph all
>> at the same time to ?g.
>>
>> If there are no major objections, I can go and add a section about
>> data sets to the entailment doc similar to Sec 8 in the SPARQL doc,
>> which outlines how one can query a merge of resources and that
>> normally entailments are local to the graph. If you have objections, I
>> would be happy about suggestions for different ways of doing it.
>
> If it helps for clarity, then fine but it seems redundant to me once  
> 12.6 is referenced.
>
> 	Andy
> 	
>>
>> Cheers,
>> Birte

-- 
Dr. Axel Polleres
Digital Enterprise Research Institute, National University of Ireland,  
Galway
email: axel.polleres@deri.org  url: http://www.polleres.net/

Received on Wednesday, 7 October 2009 22:55:13 UTC