- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Sun, 09 Sep 2012 12:08:02 +0100
- To: public-sparql-dev@w3.org
On 07/09/12 15:37, Lee Feigenbaum wrote:
> [moved to public-sparql-dev]
>
> I have a related question -- do all quad stores / named graph stores
> include a default graph? If the store that you develop or use does have
> a default graph, does that graph also have a name (URI)?

TDB has a real, stored default graph, or it can operate in
union-named-graph mode.

The storage default graph can be accessed by a URI name, but:

1/ The name is the same in all stores so it is not a well-formed
2/ It does not show up in GRAPH ?g {}

The union of all named graphs also has a name, so a query accessing the
usual default graph as normal can also use GRAPH to get the union.

But: the most common use case is storing and querying a single graph;
named graphs just confuse the issue a lot of the time :-)

Andy

>
> Answering for Anzo: Anzo does not have a default graph. All graphs are
> named with URIs.
>
> Lee
>
> On 9/7/2012 7:44 AM, Barry Bishop wrote:
>> Hello Axel,
>>
>> On 05/09/12 21:14, Polleres, Axel wrote:
>>> Thanks Barry,
>>>
>>> Since you confirm that the response addresses your comment, please
>>> consider this reply informal (chair-hat off).
>>>
>>>> I feel this is a shame, as two different implementations can
>>>> produce different output from the simplest of queries, e.g.
>>>> SELECT * { ?s ?p ?o }
>>> I personally find this quite normal... different endpoints
>>> respond differently to such a query since they refer to different
>>> default datasets, i.e.
>>> Naturally when I query dbpedia.org I query a different dataset than
>>> data.semanticweb.org, etc.
>>
>> Well, dbpedia.org and data.semanticweb.org sparql endpoints make
>> different data available, so I suppose you would naturally get
>> different results to the same query. However, this is not what I was
>> getting at. In fact, I'm not sure I have managed to get my point
>> across at all.
>> Perhaps another hypothetical example:
>>
>> Suppose you run a development team that builds an application that
>> interacts with some public sparql endpoint, say http://xyz.org/sparql
>> - then one day xyz.org start to have scalability problems and decide
>> to upgrade their RDF database to some expensive new thing. Both old
>> and new RDF databases are fully compliant with W3C, but after they
>> upgrade your application is completely broken only because the two
>> database implementations construct their RDF dataset differently when
>> no FROM clauses are given. I am sure you wouldn't find it so natural
>> in this case.
>>
>> There are some workarounds as you say, but not in all cases. When you
>> are using someone else's database and don't get to decide how they
>> partition their data into separate graphs, then you can be completely
>> stuck. As fabulous as the query language is (and I do think it is a
>> tremendous achievement), this ambiguity over constructing a dataset
>> when there are no FROMs is a bit of a hole.
>>
>>>
>>> Notably, I'd like to also point you to another document within
>>> the SPARQL1.1 specification,
>>> i.e. the service-description document at
>>> http://www.w3.org/TR/sparql11-service-description/
>>> which provides means to describe which graphs compose the default
>>> dataset of a particular service endpoint.
>>> Particularly, the property
>>> http://www.w3.org/TR/sparql11-service-description/#sd-defaultDataset
>>> is intended to provide a description of the default dataset that an
>>> endpoint uses.
>>> Note also that the service description vocabulary is extensible, and
>>> what we specify now is only a core, but other vocabulary can be used
>>> to extend this (e.g. VoID)
>>
>> All well and good, if this feature is actually provided by an
>> endpoint. However, it requires quite a lot of programming for a client
>> to work all this out and re-write queries accordingly. And actually,
>> it still doesn't help - e.g.
>> if the endpoint you want to use
>> constructs the dataset as an RDF merge of all graphs (when no FROM
>> clauses are given [I need to find an abbreviation for this]) and you
>> only want to query the default graph, then you just can't do it. There
>> is no way to tell such an endpoint that you only want the default
>> graph using the query language.
>>
>> The problem is basically that the default graph is special - because
>> it doesn't have an identifier it can not be used in the same way as
>> named graphs....
>>
>> ... in the query language. However, in the update language the
>> appropriate syntax has already been created and would be the perfect
>> complement to the query language, e.g. if I can do this:
>>
>> CLEAR DEFAULT
>>
>> why can't I do this:
>>
>> SELECT *
>> FROM DEFAULT
>> {...}
>>
>> and specify absolutely unambiguously that I want my query to execute
>> *only* over the default graph in the database. No matter how an
>> implementation constructs its dataset when no FROM clauses are given,
>> this syntax should always work in the expected way.
>>
>> Since I am rambling on, the related keywords from the update language
>> would also be very useful, e.g. one can clear all graphs like this:
>>
>> CLEAR ALL
>>
>> so why not be able to do this:
>>
>> SELECT *
>> FROM ALL
>> {...}
>>
>> This would help in the opposite case, when an implementation
>> constructs the dataset using only the default graph (when no FROM
>> clauses are given). In this situation, it is not possible to query for
>> the graph names (using select distinct ?g {graph ?g {?s ?p ?o}}), so
>> the above would say: "please merge all graphs for input to my query,
>> even though I don't know what their names are and have no way of
>> finding out (using the query language)".
>>
>> These things might not seem important, but they are life and death to
>> application programmers.
>> Right now, to build an application that needs
>> to interact with a sparql endpoint that is only known at runtime is
>> fraught with difficulties. Not the least of which is that if your
>> application is required to query data only from the default graph,
>> then there is no way to write a query that is guaranteed to do this on
>> all (W3C compliant) sparql endpoints.
>>
>> Which I still feel is a bit of a shame.
>>
>> barry
>>
>>
>>>
>>> As for the rest of your response, we seem to agree that what you're
>>> aiming at
>>> is rather a new feature than something this working group can address
>>> within its current
>>> charter and resources.
>>>
>>> Best regards,
>>> Axel
>>>
>>>> -----Original Message-----
>>>> From: Barry Bishop [mailto:barry.bishop@ontotext.com]
>>>> Sent: Wednesday, 05 September 2012 19:49
>>>> To: Polleres, Axel
>>>> Cc: public-rdf-dawg-comments@w3.org
>>>> Subject: Re: Querying only the default graph from the data store
>>>>
>>>> Hello Axel,
>>>>
>>>> Thanks for taking the time to reply. I realise this thread is
>>>> somewhat out of place given the status/progress of the WG.
>>>>
>>>> Your reply does address my initial post. It does not resolve
>>>> it, but this is perhaps not the time. However, for the
>>>> purpose of clarity I will make further comments inline:
>>>>
>>>> On 05/09/12 04:11, Polleres, Axel wrote:
>>>>> Hi Barry,
>>>>>
>>>>> This is in response to
>>>>> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2012Aug/0011.html
>>>>>
>>>>>> The working draft does not specify how the RDF dataset is constructed
>>>>>> when no FROM and FROM NAMED clauses are present in the SPARQL query.
>>>>>> Implementations are therefore able to construct the dataset
>>>>>> differently, e.g.
>>>>>> a. dataset default graph contains only the data store's default graph
>>>>>> b.
>>>>>> dataset default graph contains the RDF merge of all graphs in the
>>>>>> data store
>>>>> It is correct that how the concrete default dataset of a SPARQL
>>>>> endpoint is constructed is left open to implementations. Since
>>>>> different endpoints and implementations support different behaviours
>>>>> in this regard (e.g. in some implementations the default graph of the
>>>>> default dataset is the union of all named graphs whereas in others
>>>>> this is not the case), the working group does not feel that there is
>>>>> a unique standard behavior to be advocated this time around.
>>>>
>>>> I feel this is a shame, as two different implementations can
>>>> produce different output from the simplest of queries, e.g.
>>>> SELECT * { ?s ?p ?o }
>>>>
>>>> However, this is a separate issue.
>>>>
>>>>>> As soon as a single FROM or FROM NAMED clause is used then the data
>>>>>> store's default graph is excluded from the query's dataset.
>>>>>>
>>>>>> Which means that there is no portable way to define a SPARQL query so
>>>>>> that it executes only against the default graph in the data store -
>>>>>> or even against a combination of the default graph and one or more
>>>>>> named graphs.
>>>>> Please note that a) querying the default graph in the datastore is
>>>>> the standard behavior when no explicit FROM or FROM NAMED clauses are
>>>>> given. b) the combination of querying named graphs and the default
>>>>> graph of the endpoint's default dataset is supported via GRAPH graph
>>>>> patterns.
>>>>
>>>> a) This is rather inconsistent. Above you say that the
>>>> construction of the default RDF dataset (when no FROM/FROM
>>>> NAMED clauses are given) is not defined, but here you say
>>>> constructing it using the default graph only is the 'standard
>>>> behaviour'. One of the motivations for this post is that
>>>> there are good reasons not to have only the default graph in
>>>> the 'default dataset', e.g.
>>>> you wouldn't be able to do this
>>>> to find out the graph names when presented with an unknown endpoint:
>>>>
>>>> SELECT DISTINCT ?g WHERE { GRAPH ?g {?s ?p ?o } }
>>>>
>>>> Anyway, the point here is that there is no *portable* way to
>>>> query just the default graph.
>>>>
>>>> b) yes, but you can't query the RDF merge of the default
>>>> graph and a named graph in the same way as with two named
>>>> graphs, e.g. FROM ex:g1 FROM ex:g2. Instead one would need to
>>>> use a triple and graph pattern union, which for complex
>>>> queries becomes cumbersome. Put another way, any combination
>>>> of named graphs can be merged and explored with query triple
>>>> patterns, but this can't be done with any combination of
>>>> named graphs and the default graph.
>>>>
>>>>
>>>>> See also examples below.
>>>>>
>>>>>> This is a problem that often confuses users of RDF data stores and is
>>>>>> likely to lead to implementations that provide their own specific
>>>>>> means to achieve this, e.g.
>>>>>> http://www.openrdf.org/issues/browse/SES-850
>>>>>>
>>>>>> Inspired by the update language's use of the 'DEFAULT' keyword for
>>>>>> graph manipulation, I suggest an extension to the query language that
>>>>>> allows "FROM DEFAULT" to be used, e.g.
>>>>>>
>>>>>> SELECT *
>>>>>> FROM DEFAULT
>>>>>> WHERE { ..... }
>>>>>>
>>>>>> => dataset contains a default graph made up of the data store's
>>>>>> default graph only
>>>>> Please note that this is the standard behaviour when no FROM clause is
>>>>> given, i.e. this corresponds to
>>>>>
>>>>> SELECT *
>>>>> WHERE { ..... } <--- (no use of GRAPH keyword)
>>>> I don't think this is "standard behaviour", rather it is
>>>> common behaviour. It can not be standard when the
>>>> construction of the dataset is implementation dependent when
>>>> no FROM clause is given.
>>>>
>>>>>> This construct can be used with any number of FROM <uri> or FROM NAMED
>>>>>> <uri> clauses, e.g.
>>>>>>
>>>>>> SELECT *
>>>>>> FROM DEFAULT
>>>>>> FROM <http://example.com#g1>
>>>>>> WHERE { ..... }
>>>>>>
>>>>>> => dataset contains a default graph made up of the data store's default
>>>>>> graph merged with the contents of the data store's g1 graph
>>>>>> This would be a fairly trivial change for existing sparql processor
>>>>>> implementations, but would provide a big improvement in
>>>>>> functionality/flexibility by allowing a data store's default graph to be
>>>>>> used/queried/merged in the same way as any of its named graphs.
>>>>> Note that similar to the example above, you can query the default
>>>>> graph and named graphs within the default dataset in a data store
>>>>> side by side by using GRAPH graph patterns, i.e.
>>>>> SELECT *
>>>>> WHERE
>>>>> {
>>>>>   .....   <-- (no use of GRAPH) matches the default graph
>>>>>   GRAPH <http://ex.com#g1> { .... }   <-- matches named graph g1
>>>>>       (assuming g1 is a named graph in the default dataset)
>>>>> }
>>>> Consider an application that needs to execute queries over various
>>>> subsets of a database's contents, where the subsets are defined using
>>>> various combinations of named graphs. It would certainly be useful to
>>>> have standard queries which only required the appropriate
>>>> "FROM g1 FROM g2 etc" prepended. This is easy to do, unless one of the
>>>> graphs is the default graph.
>>>>
>>>>> Finally, note that it is not possible in SPARQL1.1 to construct a
>>>>> *new* dataset composed of *parts* of the default dataset of an
>>>>> endpoint plus possible external graphs; such a feature is currently
>>>>> not foreseen among the features addressed in this round of SPARQL,
>>>>> but had been suggested before [1].
>>>>> The features being worked on in this round of standardization have
>>>>> been decided in a voting process at the beginning of the WG and are
>>>>> documented in the following document:
>>>>> http://www.w3.org/TR/sparql-features/
>>>>> Additionally, a list of work items and features postponed to a
>>>>> future working group are being collected by the group in a dedicated
>>>>> wiki page [2] which also contains the features discussed in the
>>>>> beginning of the WG which have not been considered for this round [3].
>>>>
>>>> Yes, I will be more timely next time and will endeavour to
>>>> progress this topic in the proper way. My apologies for the 'noise'.
>>>>
>>>> Regards,
>>>> barry
>>>>
>>>>> Among this list, the feature "Composite Datasets" [1] might
>>>>> partially capture what you have in mind and a future WG might
>>>>> possibly work out the details of such a feature.
>>>>> We'd kindly ask you to confirm by a reply to this list that this
>>>>> addresses your comment.
>>>>> Axel Polleres, on behalf of the SPARQL WG
>>>>>
>>>>> 1. http://www.w3.org/2009/sparql/wiki/Feature:CompositeDatasets
>>>>> 2. http://www.w3.org/2009/sparql/wiki/Future_Work_Items
>>>>> 3. http://www.w3.org/2009/sparql/wiki/Category:Features
>>>>
>>
>>
>>
>
>
Received on Sunday, 9 September 2012 11:08:33 UTC
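
The portability problem discussed in this thread can be illustrated with a toy in-memory quad store. This is only a sketch under stated assumptions: the store contents, the `ex:` terms, and every function name below are hypothetical and are not the API of TDB, Anzo, or any other store. It models three ways an implementation might build the query's default graph when no FROM clause is given (storage default graph only, RDF merge of all graphs, union of named graphs only) and shows that `SELECT * { ?s ?p ?o }` sees a different number of triples under each.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
# Assumption: None keys the unnamed storage default graph; other keys are graph names.
Store = Dict[Optional[str], Set[Triple]]

# Hypothetical store contents, invented for illustration only.
STORE: Store = {
    None: {("ex:a", "ex:p", "ex:b")},
    "http://example.com#g1": {("ex:c", "ex:p", "ex:d")},
    "http://example.com#g2": {("ex:e", "ex:p", "ex:f")},
}


def default_graph_only(store: Store) -> Set[Triple]:
    """Policy (a): the dataset default graph is the storage default graph."""
    return set(store.get(None, set()))


def merge_of_all_graphs(store: Store) -> Set[Triple]:
    """Policy (b): the dataset default graph is the merge of every graph."""
    return set().union(*store.values())


def union_of_named_graphs(store: Store) -> Set[Triple]:
    """Union-named-graph style: named graphs only, storage default excluded."""
    return set().union(*(g for name, g in store.items() if name is not None))


def graph_names(store: Store) -> List[str]:
    """Models SELECT DISTINCT ?g { GRAPH ?g { ?s ?p ?o } }: named graphs only."""
    return sorted(name for name in store if name is not None)


def select_all(default_graph: Set[Triple]) -> List[Triple]:
    """Models SELECT * { ?s ?p ?o } evaluated over the dataset default graph."""
    return sorted(default_graph)


if __name__ == "__main__":
    for policy in (default_graph_only, merge_of_all_graphs, union_of_named_graphs):
        print(policy.__name__, len(select_all(policy(STORE))))
    print(graph_names(STORE))
```

The same query function is applied unchanged in all three cases; only the dataset construction differs, which is the behaviour the query text alone cannot pin down.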