Re: Fw: No way to specify an RDF dataset of all the known named graphs from Seaborne, Andy on 2007-04-05 (public-rdf-dawg@w3.org from April to June 2007)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 05 Apr 2007 11:45:23 +0100
To: Lee Feigenbaum <feigenbl@us.ibm.com>
CC: public-rdf-dawg@w3.org
Message-ID: <4614D343.6020601@hp.com>

Lee Feigenbaum wrote:
> I'd like to know if anyone is motivated by Chimezie's comment suggesting 
> that a FROM * and a FROM NAMED * be added to SPARQL to "provide an 
> unambiguous way to specify a dataset which corresponds to all the known 
> named graphs." 
> 
> I'm wary of adding this for a couple of reasons:
> 
> 1/ I can't imagine how such a construct would be defined such that it was 
> any different from the implementation-defined state which currently exists 
> when FROM and FROM NAMED are omitted. (And, therefore, the construct 
> doesn't seem to add anything new or newly interoperable to the 
> specification.)
> 
> 2/ Existing implementations solve this problem within the current bounds 
> of SPARQL (see the IRC chat log cited for two examples)
> 
> 
> If you have a strong feeling one way or the other, please let it be known 
> so that I can gauge whether the group has consensus (and either reply to 
> Chimezie or slot this item on our teleconference agenda for next week).
> 
> Lee

I agree with Lee on both points (1) and (2) above.

I don't think it is a simple matter of changing FROM NAMED to include *.

Using * does not give consistency.  Indeed, of the ways to address this, such 
an indeterminate construct seems to be the wrong approach.  The next issue 
will be "what does FROM NAMED * resolve to?" (i.e.
SELECT DISTINCT ?g { GRAPH ?g {?s ?p ?o } })

The same query sent to different places will give different answers - that's 
a feature of the service asked and the dataset offered.  The idea of a "known 
universe" only exists as a fixed concept in some situations.  The number of 
graphs has to be fixed in time, else * means different things at different 
times without indication.
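
To make this concrete, here is a minimal Python sketch. It is purely 
illustrative: the two "services", their graph names and the helper function 
are invented, and nothing here is defined by SPARQL. It shows how a 
hypothetical FROM NAMED * would expand to whatever graphs a given service 
happens to hold at the moment it is asked:

```python
# Hypothetical illustration: each "service" is a dict mapping a graph
# name to a set of triples. Services and graph names are made up.

def expand_from_named_star(service):
    """Return the graph names a hypothetical FROM NAMED * would resolve to.

    Mirrors SELECT DISTINCT ?g { GRAPH ?g { ?s ?p ?o } }: only graphs
    holding at least one triple are reported.
    """
    return sorted(name for name, triples in service.items() if triples)


service_a = {
    "http://example.org/g1": {("s1", "p", "o1")},
    "http://example.org/g2": {("s2", "p", "o2")},
}
service_b = {
    "http://example.org/g3": {("s3", "p", "o3")},
}

# Same "query", different services: different datasets.
expansion_a = expand_from_named_star(service_a)
expansion_b = expand_from_named_star(service_b)

# Same service, later in time: a graph has been added, so * now
# means something else, with no indication in the query text.
service_a["http://example.org/g4"] = {("s4", "p", "o4")}
expansion_a_later = expand_from_named_star(service_a)
```

The query text never changes, yet the three expansions differ, which is the 
indeterminacy described above.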

For a web language, the "known universe" is not a meaningful concept.  The 
fact that the application query has to name the named graphs is important. 
Otherwise, in the extreme, "FROM NAMED *" means all reachable URLs.

What we don't have is a way of naming datasets - FROM/FROM NAMED provide a 
partial description of one.  SPARQL is more graph centric than dataset centric 
and the implications of a theory of datasets go way beyond DAWG.

I don't completely understand the XML Query example, because there the 
collection is just the documents in the store - the difference is merely 
"implicit all" (SPARQL) versus "explicit, indeterminate all" (using *).

Note: The protocol (HTTP) is open to new parameters - so add domain-specific 
indicators there.  After all, we have had a serious attempt at removing 
FROM/FROM NAMED from the query language altogether, but the needs of APIs and 
scripting put them back in again.
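
A rough sketch of the protocol route, in Python. The endpoint URL, the graph 
names and the extension parameter "x-all-named-graphs" are all hypothetical; 
only "query" and "named-graph-uri" are parameters of the SPARQL protocol 
itself. It shows a dataset described outside the query string, with a 
service-specific indicator alongside:

```python
from urllib.parse import urlencode, parse_qs, urlsplit

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint


def build_request_url(query, named_graphs=(), extra_params=None):
    """Build a GET URL for a SPARQL protocol request.

    named_graphs become repeated named-graph-uri parameters;
    extra_params carries any service-specific (non-standard) parameters.
    """
    params = [("query", query)]
    params += [("named-graph-uri", g) for g in named_graphs]
    params += list((extra_params or {}).items())
    return ENDPOINT + "?" + urlencode(params)


query = "SELECT DISTINCT ?g { GRAPH ?g { ?s ?p ?o } }"

# Dataset named explicitly through standard protocol parameters.
explicit_url = build_request_url(
    query,
    named_graphs=["http://example.org/g1", "http://example.org/g2"],
)

# Made-up, service-specific indicator for "all graphs this service holds".
extended_url = build_request_url(
    query,
    extra_params={"x-all-named-graphs": "true"},
)

# Decode the query strings again to inspect what would be sent.
explicit_params = parse_qs(urlsplit(explicit_url).query)
extended_params = parse_qs(urlsplit(extended_url).query)
```

A service that does not recognise the extension parameter would have to 
ignore or reject it, so this only helps between parties that have agreed on 
it.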

	Andy

> 
> 
> ----- Forwarded by Lee Feigenbaum/Cambridge/IBM on 04/05/2007 03:16 AM 
> -----
> 
> "Chimezie Ogbuji" <ogbujic@ccf.org> 
> Sent by: public-rdf-dawg-comments-request@w3.org
> 04/04/2007 05:05 PM
> Please respond to
> ogbujic@ccf.org
> 
> 
> To
> public-rdf-dawg-comments@w3.org
> cc
> 
> Subject
> No way to specify an RDF dataset of all the known named graphs
> 
> This was discussed in #swig
> (http://chatlogs.planetrdf.com/swig/2007-04-03.html#T20-38-01)
> 
> SPARQL currently does not provide an unambiguous way to specify a
> dataset which corresponds to all the known named graphs.  The only way
> this can be done is to leave out FROM <..> and FROM NAMED <..>
> directives in the prolog (and from the protocol, for SPARQL services).
> The corresponding dataset in this case depends on the host application -
> not very consistent. The only other alternative is to explicitly
> enumerate the known universe in the prolog:
> 
> FROM NAMED G1
> FROM NAMED G2
> ...
> FROM NAMED GN
> 
> This is not practical for a dynamic dataset.
> 
> There is plenty of value in querying against the known universe
> consistently especially for applications which make use of a dataset as
> a named graph partition that can grow indefinitely.  Consider XPath
> 2.0 / XQuery 1.0 which supports querying a collection of XML documents
> without having to explicitly enumerate all the XML documents in the
> collection.
> 
> This is a very useful 'database-wide' query pattern which is well
> supported in document-management languages but not supported in SPARQL
> without assuming the implementation will consistently supply the dataset
> corresponding to all the known named graphs in persistence in the
> absence of any dataset directives in the prolog or at the protocol
> level.
> 
> Other than OWA or CWA issues, I don't see why an explicit syntax for
> binding to such a dataset is not supported by SPARQL to provide a
> consistent way for applications to dispatch these kinds of queries.
> Such a syntax was suggested in the above conversation:
> 
> FROM NAMED *
> 
> [1] http://www.w3.org/TR/xquery-semantics/#sec_fn_doc_collection
> 

-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Thursday, 5 April 2007 10:45:34 UTC