RE: Querying only the default graph from the data store

From: Polleres, Axel <axel.polleres@siemens.com>
Date: Wed, 5 Sep 2012 21:14:55 +0200
To: "barry.bishop@ontotext.com" <barry.bishop@ontotext.com>
CC: "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>
Message-ID: <9DA51FFE5E84464082D7A089342DEEE801463B708196@ATVIES9917WMSX.ww300.siemens.net>

Thanks Barry,

Since you confirm that the response addresses your comment, please consider this reply informal (chair-hat off).

> I feel this is a shame, as two different implementations can
> produce different output from the simplest of queries, e.g.
> SELECT * { ?s ?p ?o }

I personally find this quite normal: different endpoints
respond differently to such a query since they refer to different default datasets, i.e.
naturally when I query dbpedia.org I query a different dataset than data.semanticweb.org, etc.

Notably, I'd also like to point you to another document within the SPARQL 1.1 specification,
i.e. the service-description document at
http://www.w3.org/TR/sparql11-service-description/
which provides means to describe which graphs compose the default
dataset of a particular service endpoint.
In particular, the property
 http://www.w3.org/TR/sparql11-service-description/#sd-defaultDataset
is intended to provide a description of the default dataset that an endpoint uses.
Note also that the service description vocabulary is extensible: what we specify now is only a core, and other vocabularies (e.g. VoID) can be used to extend it.
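
As a rough sketch of how a client might use this: assuming an endpoint's service description has been retrieved as RDF and loaded into some local store (an assumption, since not every endpoint publishes one), a query along the following lines could list the graphs that make up its default dataset. The variable names are purely illustrative; the sd: terms are those of the service description vocabulary.

```sparql
PREFIX sd: <http://www.w3.org/ns/sparql-service-description#>

# Runs over the RDF of a service description, not over the endpoint's data.
SELECT ?service ?defaultGraph ?namedGraphName
WHERE {
  ?service a sd:Service ;
           sd:defaultDataset ?dataset .
  # The default graph of the default dataset, if it is described.
  OPTIONAL { ?dataset sd:defaultGraph ?defaultGraph }
  # The names (IRIs) of the named graphs in the default dataset, if any are listed.
  OPTIONAL { ?dataset sd:namedGraph [ sd:name ?namedGraphName ] }
}
```

If the description lists no graphs, the OPTIONAL parts simply stay unbound, so such a query would only tell you what the endpoint chose to describe.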

As for the rest of your response, we seem to agree that what you're aiming at
is rather a new feature than something this working group can address within its current
charter and resources.

Best regards,
Axel

> -----Original Message-----
> From: Barry Bishop [mailto:barry.bishop@ontotext.com]
> Sent: Mittwoch, 05. September 2012 19:49
> To: Polleres, Axel
> Cc: public-rdf-dawg-comments@w3.org
> Subject: Re: Querying only the default graph from the data store
>
> Hello Axel,
>
> Thanks for taking the time to reply. I realise this thread is
> somewhat out of place given the status/progress of the WG.
>
> Your reply does address my initial post. It does not resolve
> it, but this is perhaps not the time. However, for the
> purpose of clarity I will make further comments inline:
>
> On 05/09/12 04:11, Polleres, Axel wrote:
> > Hi Barry,
> >
> > This is in response to
> >
> > http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2012Aug/0011.html
> >
> >> The working draft does not specify how the RDF dataset is constructed
> >> when no FROM and FROM NAMED clauses are present in the SPARQL query.
> >>
> >> Implementations are therefore able to construct the dataset
> >> differently, e.g.
> >> a. dataset default graph contains only the data store's default graph
> >> b. dataset default graph contains the RDF merge of all graphs in the
> >> data store
> > It is correct that how the concrete default dataset of a
> > SPARQL endpoint is constructed is left open to
> > implementations. Since different endpoints and
> > implementations support different behaviours in this regard
> > (e.g. in some implementations the default graph of the
> > default dataset is the union of all named graphs whereas in
> > others this is not the case), the working group does not feel
> > that there is a unique standard behavior to be advocated this
> > time around.
>
> I feel this is a shame, as two different implementations can
> produce different output from the simplest of queries, e.g.
> SELECT * { ?s ?p ?o }
>
> However, this is a separate issue.
>
> >
> >> As soon as a single FROM or FROM NAMED clause is used then the data
> >> store's default graph is excluded from the query's dataset.
> >>
> >> Which means that there is no portable way to define a SPARQL query so
> >> that it executes only against the default graph in the data store -
> >> or even against a combination of the default graph and one or more
> >> named graphs.
> > Please note that a) querying the default graph in the
> > datastore is the standard behavior when no explicit FROM or
> > FROM NAMED clauses are given. b) the combination of querying
> > named graphs and the default graph of the endpoint's default
> > dataset is supported via GRAPH graph patterns.
>
> a) This is rather inconsistent. Above you say that the
> construction of the default RDF dataset (when no FROM/FROM
> NAMED clauses are given) is not defined, but here you say
> constructing it using the default graph only is the 'standard
> behaviour'. One of the motivations for this post is that
> there are good reasons not to have only the default graph in
> the 'default dataset', e.g. you wouldn't be able to do this
> to find out the graph names when presented with an unknown endpoint:
>
> SELECT DISTINCT ?g WHERE { GRAPH ?g {?s ?p ?o } }
>
> Anyway, the point here is that there is no *portable* way to
> query just the default graph.
>
> b) yes, but you can't query the RDF merge of the default
> graph and a named graph in the same way as with two named
> graphs, e.g. FROM ex:g1 FROM ex:g2. Instead one would need to
> use a triple and graph pattern union, which for complex
> queries becomes cumbersome. Put another way, any combination
> of named graphs can be merged and explored with query triple
> patterns, but this can't be done with any combination of
> named graphs and the default graph.
>
>
> >
> > See also examples below.
> >
> >> This is a problem that often confuses users of RDF data stores and is
> >> likely to lead to implementations that provide their own specific
> >> means to achieve this, e.g.
> >> http://www.openrdf.org/issues/browse/SES-850
> >>
> >> Inspired by the update language's use of the 'DEFAULT' keyword for
> >> graph manipulation, I suggest an extension to the query language that
> >> allows "FROM DEFAULT" to be used, e.g.
> >>
> >> SELECT *
> >> FROM DEFAULT
> >> WHERE { ..... }
> >>
> >> => dataset contains a default graph made up of the data store's
> >> default graph only
> > Please note that this is the standard behaviour when no FROM clause is
> > given, i.e. this corresponds to
> >
> > SELECT *
> > WHERE { ..... }       <--- (no use of GRAPH keyword)
>
> I don't think this is "standard behaviour", rather it is
> common behaviour. It can not be standard when the
> construction of the dataset is implementation dependent when
> no FROM clause is given.
>
> >
> >> This construct can be used with any number of FROM <uri> or FROM NAMED
> >> <uri> clauses, e.g.
> >>
> >> SELECT *
> >> FROM DEFAULT
> >> FROM <http://example.com#g1>
> >> WHERE { ..... }
> >>
> >> => dataset contains a default graph made up of the data store's default
> >> graph merged with the contents of the data store's g1 graph
> >>
> >> This would be a fairly trivial change for existing sparql processor
> >> implementations, but would provide a big improvement in
> >> functionality/flexibility by allowing a data store's default graph to be
> >> used/queried/merged in the same way as any of its named graphs.
> > Note that similar to the example above, you can query the
> > default graph and named graphs within the default dataset in
> > a data store side by side by using GRAPH graph patterns, i.e.
> >
> >   SELECT *
> >   WHERE
> >   {
> >     .....                              <-- (no use of GRAPH) matches the default graph
> >     GRAPH <http://ex.com#g1> { .... }  <-- matches named graph g1 (assuming g1 is a named graph in the default dataset)
> >   }
>
> Consider an application that needs to execute queries over various
> subsets of a database's contents, where the subsets are defined using
> various combinations of named graphs. It would certainly be useful to
> have standard queries which only required the appropriate "FROM g1 FROM
> g2 etc" prepended. This is easy to do, unless one of the graphs is the
> default graph.
>
> >
> > Finally, note that it is not possible in SPARQL1.1 to
> > construct a *new* dataset composed of *parts* of the default
> > dataset of an endpoint plus possible external graphs; such a
> > feature is currently not foreseen in the features addressed in
> > this round of SPARQL, but had been suggested before [1].
> >
> > The features being worked on in this round of
> > standardization have been decided in a voting process at the
> > beginning of the WG and are documented in the following
> > document: http://www.w3.org/TR/sparql-features/
> >
> > Additionally, a list of work items and features postponed
> > to a future working group is being collected by the group in
> > a dedicated wiki page [2] which also contains the features
> > discussed in the beginning of the WG which have not been
> > considered for this round [3].
>
> Yes, I will be more timely next time and will endeavour to progress this
> topic in the proper way. My apologies for the 'noise'.
>
> Regards,
> barry
>
> >
> > Among this list, the feature "Composite Datasets" [1] might
> > partially capture what you have in mind and a future WG might
> > possibly work out the details of such feature.
> >
> > We'd kindly ask you to confirm by a reply to this list that
> > this addresses your comment.
> >
> > Axel Polleres, on behalf of the SPARQL WG
> >
> > 1. http://www.w3.org/2009/sparql/wiki/Feature:CompositeDatasets
> > 2. http://www.w3.org/2009/sparql/wiki/Future_Work_Items
> > 3. http://www.w3.org/2009/sparql/wiki/Category:Features
>
>
Received on Wednesday, 5 September 2012 19:15:27 GMT