
RE: [TF-ENT] Querying datasets with default plus named graphs

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Wed, 7 Oct 2009 12:46:56 +0000
To: Birte Glimm <birte.glimm@comlab.ox.ac.uk>, SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <B6CF1054FDC8B845BF93A6645D19BEA3693EDB329B@GVW1118EXC.americas.hpqcorp.net>


> -----Original Message-----
> From: public-rdf-dawg-request@w3.org [mailto:public-rdf-dawg-request@w3.org]
> On Behalf Of Birte Glimm
> Sent: 07 October 2009 12:53
> To: SPARQL Working Group
> Subject: [TF-ENT] Querying datasets with default plus named graphs
> 
> Hi all,
> I skimmed the minutes of yesterday's telecon and I updated the
> entailment doc to include the newly generated issues. I would like to
> start collecting opinions for the issue of querying data sets that
> have more than the default graph and whether inferences work on all
> graphs in the datasets or are local to their particular graph. Here is
> an example that Steve originally created:
> We have a data set with the two named graphs http://example.org/a.rdf
> and http://example.org/b.rdf (empty default graph).
> http://example.org/a.rdf:
>   :p rdfs:domain :A .
> http://example.org/b.rdf:
>   :x :p :y .

Is anyone advocating this should be covered?

> 
> The question is, what bindings ?g should take if we query:
>   SELECT ?g WHERE { GRAPH ?g { :x a ?type .  } }
> 
> If we assume that entailments always work over all graphs in the DS,
> then ?type can be mapped to :A, but this entailment depends on both
> graphs. Taking either one out means the entailment no longer holds, so
> ?g must be both a.rdf and b.rdf and possibly the default graph, since
> there is no FROM clause in the query and we in fact query the default
> graph.
> 
> Just to check that I get this right: If we take the same data set and
> issue the query
>   SELECT ?o WHERE { :x :p ?o . }
> I would get no answer under simple entailment because the default
> graph is empty.

Not quite - there is no dataset description so it will be whatever the processor provides as the dataset (i.e. it's set externally - common case).
 
> If I ask
>   SELECT ?o FROM NAMED <http://example.org/b.rdf> WHERE { :x :p ?o . }
> I would get { (o, y) }, right?

There is a dataset description, it does not mention the default graph, so it is empty. So { :x :p ?o . } is on the empty graph and does not match.

{ GRAPH <http://example.org/b.rdf> {:x :p ?o . } }

returns { (?o, y) }

> If I ask
>   SELECT ?o FROM <http://example.org/b.rdf> WHERE { :x :p ?o . }
> I would get { (o, y) } again, but this time I implicitly created a
> default graph that contains all triples from b.rdf, right? 

Yes - although I'd say 'explicit' because you used FROM.
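
For illustration only, here is a toy Python model of how the dataset description shapes the answers above. It is a sketch, not a SPARQL engine: STORE, build_dataset and match are made-up names, the store is assumed to hold just the two example graphs, and the "merge" is a plain set union (adequate here only because the example has no blank nodes).

```python
# Toy model of SPARQL dataset construction. All names are hypothetical.
# STORE stands in for whatever graphs the processor can fetch by IRI.
STORE = {
    "http://example.org/a.rdf": {(":p", "rdfs:domain", ":A")},
    "http://example.org/b.rdf": {(":x", ":p", ":y")},
}

def build_dataset(from_iris=(), from_named_iris=()):
    """Default graph = union of the FROM graphs; named graphs = FROM NAMED graphs."""
    default = set()
    for iri in from_iris:
        default |= STORE[iri]
    named = {iri: set(STORE[iri]) for iri in from_named_iris}
    return default, named

def match(graph, s, p):
    """Objects ?o matching the pattern { s p ?o } in a single graph."""
    return sorted(o for (s2, p2, o) in graph if (s2, p2) == (s, p))

B = "http://example.org/b.rdf"

# SELECT ?o FROM NAMED <b.rdf> WHERE { :x :p ?o . }
# The description names no default graph, so the default graph is empty.
default, named = build_dataset(from_named_iris=[B])
from_named_default = match(default, ":x", ":p")
# ... WHERE { GRAPH <b.rdf> { :x :p ?o . } } looks inside the named graph.
from_named_graph = match(named[B], ":x", ":p")

# SELECT ?o FROM <b.rdf> WHERE { :x :p ?o . }
# FROM puts the triples of b.rdf into the default graph for this query only.
default, named = build_dataset(from_iris=[B])
from_default = match(default, ":x", ":p")

print(from_named_default, from_named_graph, from_default)
```

The dataset is rebuilt per query in this sketch, which is why the default graph produced by FROM does not persist into a later query.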

> I guess
> this default graph would be temporary, right, and if I query again
> without the FROM clause, I would again get no results, right?
> 
> Ok, assuming I understand that right, I would much prefer to keep
> entailments local to the graph.

+1

And I believe this follows from "12.6 Extending SPARQL Basic Graph Matching" which does not mention datasets.

----

Mixed entailment regimes in one query do happen already.  I don't see any sensible way to specify entailment across graphs and have a mix.

This is not to say that matching a BGP under entailment can't take into account information not in the graph (presumably, rules entailment does this anyway - the rules are not in the graph).  We don't necessarily need to make the T-Box visible, do we?  Then "GRAPH <b.rdf> { :x a ?type .  }" works if <b.rdf> is set up in some way (not part of the spec) to use the vocabulary in <a.rdf>.  The fact that the information used for matching <b.rdf> happens to also be accessible via <a.rdf> is neither here nor there.
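
A small Python sketch of the difference, using Steve's two graphs. It applies only the rdfs:domain rule (rdfs2), not full RDFS entailment, and the function name and the schema parameter are assumptions made for illustration; the spec does not define how a graph gets "set up" with background vocabulary.

```python
def rdfs2(data, schema=frozenset()):
    """Apply the rdfs:domain rule once to the triples in `data`.

    `schema` is optional background vocabulary used for matching only;
    its triples are not added to the result, so the T-Box stays invisible.
    """
    domains = {(s, o) for (s, p, o) in set(data) | set(schema)
               if p == "rdfs:domain"}
    inferred = {(s, "rdf:type", cls)
                for (s, p, o) in data
                for (prop, cls) in domains
                if p == prop}
    return set(data) | inferred

A_RDF = {(":p", "rdfs:domain", ":A")}
B_RDF = {(":x", ":p", ":y")}
TYPE_TRIPLE = (":x", "rdf:type", ":A")

# Entailment local to each graph: neither graph alone yields ":x a :A".
local_a = rdfs2(A_RDF)
local_b = rdfs2(B_RDF)

# Entailment over the merge of both graphs does yield it.
merged = rdfs2(A_RDF | B_RDF)

# b.rdf configured (outside the spec) to use the vocabulary of a.rdf.
with_background = rdfs2(B_RDF, schema=A_RDF)

# SELECT ?g WHERE { GRAPH ?g { :x a ?type . } } with local entailment.
graphs = {"http://example.org/a.rdf": A_RDF, "http://example.org/b.rdf": B_RDF}
local_answers = [g for g, triples in graphs.items()
                 if any(s == ":x" and p == "rdf:type"
                        for (s, p, o) in rdfs2(triples))]

print(TYPE_TRIPLE in local_a, TYPE_TRIPLE in local_b,
      TYPE_TRIPLE in merged, TYPE_TRIPLE in with_background, local_answers)
```

In the last configuration the inferred type shows up when matching <b.rdf>, yet the rdfs:domain triple itself is still only visible through <a.rdf>.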

> I think this goes well with SPARQL 1.0
> because it says in Sec 8.1
> (http://www.w3.org/TR/rdf-sparql-query/#exampleDatasets) below Example
> 1: In this example, the default graph contains the names of the
> publishers of two named graphs. The triples in the named graphs are
> not visible in the default graph in this example.
> 
> Let me also argue from an OWL viewpoint (because I am an OWL person):
> I would see the IRIs in a FROM (NAMED) clause as ontology IRIs. An
> ontology contains everything it needs and might use imports to include
> resources that it does not physically contain. I have to load those
> imported resources anyway as part of the graph. As I understand it, an
> implementor can now choose to have several ontologies loaded more or
> less permanently as (named) graphs/ontologies (which means one can do
> all preprocessing to them, check them for consistency, and possibly
> classify them (build the sub-/superclass hierarchy), so that most
> queries can be answered quickly). If I decide to have the pizza
> ontology (often used for Protege tutorials) and SNOMED (large medical
> ontology) loaded as named graphs, then I do not want pizzas to have
> any effect on my medical ontology and I do want entailments to be
> local to the ontology. If users want to merge two ontologies on the
> fly for querying, they can ask
> SELECT ?x FROM IRI_1, IRI_2 WHERE { some_BGP }
> which would (according to Sec 8.2 of the SPARQL spec) result in the
> query being evaluated over a default graph that contains the RDF merge
> of tuples from IRI_1 and IRI_2.
> 
> This would also allow for removing (named) graphs without having to do
> something like belief revision to find out what inferences are no
> longer valid after the delete or having to reload and redo all
> inferences for the remaining graphs.
> 
> What would that mean for Steve's example? It has an empty answer, but
> we no longer have to assign a.rdf, b.rdf, and the default graph all
> at the same time to ?g.
> 
> If there are no major objections, I can go and add a section about
> data sets to the entailment doc similar to Sec 8 in the SPARQL doc,
> which outlines how one can query a merge of resources and that
> normally entailments are local to the graph. If you have objections, I
> would be happy about suggestions for different ways of doing it.

If it helps for clarity, then fine but it seems redundant to me once 12.6 is referenced.

	Andy
	
> 
> Cheers,
> Birte
Received on Wednesday, 7 October 2009 12:47:52 GMT
