Re: Querying "all graphs" from Chimezie Ogbuji on 2009-03-30 (public-rdf-dawg@w3.org from January to March 2009)

From: Chimezie Ogbuji <ogbujic@ccf.org>
Date: Mon, 30 Mar 2009 19:23:30 -0400
To: "Seaborne, Andy" <andy.seaborne@hp.com>, "Lee Feigenbaum" <lee@thefigtrees.net>
cc: "Kjetil Kjernsmo" <Kjetil.Kjernsmo@computas.com>, "'RDF Data Access Working Group'" <public-rdf-dawg@w3.org>
Message-ID: <C5F6CCB2.905F%ogbujic@ccf.org>
I've created two feature pages on the wiki:

- http://www.w3.org/2009/sparql/wiki/Feature:CompositeDatasets
- http://www.w3.org/2009/sparql/wiki/Feature:AllNamedGraphs

CompositeDatasets is meant to be a comprehensive solution for composing
named, arbitrary subsets from all named graphs in a dataset, while
AllNamedGraphs is supposed to be 'low hanging fruit' (i.e., the feature
whose need seems to have more consensus and which corresponds to Lee's #1).

Comments below

On 3/30/09 12:40 PM, "Seaborne, Andy" <andy.seaborne@hp.com> wrote:
>> That would be great - reading below, I think we're still not
>> communicating well with each other, so rather than keep at it, I'd like
>> to see this feature request articulated so I can understand it better.

Hopefully, the wiki pages above do this request some justice.

>> For now, I see 3 potential distinct features here:
>> 1/ A way for users to explicitly write a query that defines the default
>> graph component of the RDF dataset as comprising "the RDF merge of all
>> graphs that the SPARQL engine knows about". This is what's been referred
>> to at times in the past as "FROM *", and is the one that I am
>> sympathetic to but have trouble imagining how it would be specified.

This would be Feature:AllNamedGraphs.
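
As a rough illustration of what "the RDF merge of all graphs that the SPARQL
engine knows about" amounts to, here is a minimal Python sketch. It is a toy
model, not any engine's API: the dataset is assumed to be a plain dict from
graph IRI to a set of triples, the function name merge_all_named_graphs and
the example.org IRIs are hypothetical, and blank nodes are assumed absent so
that the RDF merge reduces to a simple set union.

```python
from typing import Dict, Set, Tuple

# Toy model (assumption): a triple is a 3-tuple of strings and a graph is a set of triples.
Triple = Tuple[str, str, str]
Graph = Set[Triple]


def merge_all_named_graphs(named: Dict[str, Graph]) -> Graph:
    """Return the union of every named graph in the dataset.

    Assumes no blank nodes, so no renaming is needed and the RDF merge
    is the same as a set union of the triples.
    """
    merged: Graph = set()
    for graph in named.values():
        merged |= graph
    return merged


# Hypothetical dataset with two named graphs that share one triple.
named_graphs: Dict[str, Graph] = {
    "http://example.org/g1": {("ex:a", "ex:knows", "ex:b")},
    "http://example.org/g2": {
        ("ex:a", "ex:knows", "ex:b"),
        ("ex:b", "ex:knows", "ex:c"),
    },
}

# The default graph a "FROM *"-style query would see in this toy model.
default_graph = merge_all_named_graphs(named_graphs)
print(len(default_graph))
```

The shared triple appears only once in the merged default graph, which is the
behaviour one would expect from a merge rather than a concatenation.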
 
>> 2/ A way for users to explicitly specify that the named graphs component
>> of the RDF dataset should be merged and used as the default graph of the
>> RDF dataset. I don't really understand what this would gain, since to
>> get to this point you'd already have needed to somehow specify (e.g. via
>> FROM NAMED) the relevant graphs that should be named graphs in the RDF
>> dataset, and you can use that same mechanism (e.g. FROM) to stick those
>> graphs' contents into the default graph part of the RDF dataset.

This is not what I had in mind and I'm not sure what this would gain either.

>> 3/ A way for users to refer to RDF datasets by name. I wrote about how
>> we deal with this in Open Anzo (via "named datasets") here:
>> http://www.thefigtrees.net/lee/blog/2009/03/named_graphs_in_open_anzo.html
>> I'm pretty happy with this approach but don't personally think it's ripe
>> for standardization.

I believe what was suggested in Feature:CompositeDatasets is very similar
except it provides a query-scoped mechanism for "named aggregate graphs".
So it is a bit more granular in the sense that the dataset can be composed
using IRIs that are associated with the RDF merge of certain named graphs in
the original dataset.
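
The "named aggregate graphs" idea can be sketched in the same toy style. This
is only an illustration under stated assumptions, not the syntax or semantics
proposed on the wiki page: the function compose_dataset, the shape of the
aggregates mapping, and all IRIs are hypothetical, and blank nodes are again
assumed absent so that a merge is a set union.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]


def compose_dataset(
    named: Dict[str, Graph], aggregates: Dict[str, List[str]]
) -> Dict[str, Graph]:
    """Build a query-scoped dataset view.

    Each aggregate IRI is bound to the union of the named graphs listed
    for it. The original dataset is left unchanged.
    """
    view: Dict[str, Graph] = {}
    for aggregate_iri, member_iris in aggregates.items():
        merged: Graph = set()
        for member_iri in member_iris:
            merged |= named[member_iri]  # KeyError if the graph is unknown
        view[aggregate_iri] = merged
    return view


# Hypothetical dataset of three named graphs.
dataset: Dict[str, Graph] = {
    "http://example.org/g1": {("ex:a", "ex:p", "ex:b")},
    "http://example.org/g2": {("ex:b", "ex:p", "ex:c")},
    "http://example.org/g3": {("ex:c", "ex:p", "ex:d")},
}

# One aggregate IRI standing in for the merge of g1 and g2 only.
composed = compose_dataset(
    dataset,
    {"http://example.org/agg/g1-g2": [
        "http://example.org/g1",
        "http://example.org/g2",
    ]},
)
print(sorted(composed["http://example.org/agg/g1-g2"]))
```

The point of the sketch is that the composition is supplied along with the
query, so the aggregate IRI means the same thing on any service that accepts
the same instructions.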

> A couple of Jena stores provide the feature of providing the RDF merge of all
> the named graphs in the dataset.  This can be accessed via a URI (there's no
> reason why a graph can't be in the dataset in different ways under different
> names) or the engine can be told to make default graph the RDF merge of named
> graphs.  This is a property of the SPARQL service being offered.

However, without a ratified way for users to tell the service how to
'compose' these URIs, implementations will differ from system to
system.

> I'm not sure this is the best way to do it - it is a way to do it without
> needing to change SPARQL.

I agree that if we allow the service to completely handle the way graphs are
associated with URIs (through some well-specified mechanism), we can rely on
the use of 'URIs as proxies for aggregate graphs' alone without changing the
query language.  However, without specifying how services can be instructed
(by the user) to do this, we risk an interoperability nightmare (if there is
much need for this in the wild).

----------------------
Chimezie (chee-meh) Thomas-Ogbuji (oh-bu-gee)
Heart and Vascular Institute (Clinical Investigations)
Cleveland Clinic (ogbujic@ccf.org)
Ph.D. Student Case Western Reserve University
(chimezie.thomas-ogbuji@case.edu)


Received on Monday, 30 March 2009 23:24:24 UTC