Re: FROM and FROM NAMED: To fetch or not to fetch? from David Booth on 2012-06-22 (public-rdf-dawg-comments@w3.org from June 2012)

From: David Booth <david@dbooth.org>
Date: Fri, 22 Jun 2012 12:24:00 -0400
To: Gregory Williams <greg@evilfunhouse.com>, Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-dawg-comments <public-rdf-dawg-comments@w3.org>
Message-ID: <1340382240.6980.19038.camel@dbooth-laptop>

Hi Greg & Andy,

Thanks for your comments.  It looks like Andy forgot to copy me, but I
found his email in the archives:
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2012Jun/0005.html

This is intended as a comment on the spec.  The comment boils down to
two points: 

1. I think it was a mistake to allow this ambiguity in the spec, because
fetching and querying the graph store are very different operations.  I
recognize that sd:DereferencesURIs can be used to find out how the
SPARQL server behaves, but this would mean writing the query very
differently for servers that behave differently.

- The LOAD operation already provides a means of fetching, so (now that
we have SPARQL Update) a second means of loading by use of FROM or FROM
NAMED is redundant.  

- If an implementation always tries to fetch, then this renders FROM
unusable when the user does not want to fetch.  This is unfortunate,
because FROM is a very convenient way to temporarily (for the current
query) specify that the query should use the merge of the graphs
specified in the FROM clauses. 
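
To make the difference concrete, here is a minimal sketch of the two
interpretations of FROM.  It is a toy model, not any server's actual
implementation: GRAPH_STORE, WEB, default_graph and the example.org URIs
are all hypothetical names invented for illustration, a graph is just a
set of tuples, and a URI that the source does not know is assumed to
contribute an empty graph.

```python
# Toy model only: a "graph" is a set of (subject, predicate, object) tuples.
# GRAPH_STORE and WEB are hypothetical stand-ins for a server's graph store
# and for the documents obtained by dereferencing a URI.
GRAPH_STORE = {
    "http://example.org/g1": {("ex:a", "ex:p", "ex:stored")},
}
WEB = {
    "http://example.org/g1": {("ex:a", "ex:p", "ex:fetched")},
    "http://example.org/g2": {("ex:b", "ex:p", "ex:fetched")},
}


def default_graph(from_uris, dereference):
    """Build the default graph for FROM <u1> FROM <u2> ... as a merge.

    dereference=True  models a server that fetches each URI.
    dereference=False models a server that only looks in its graph store.
    """
    source = WEB if dereference else GRAPH_STORE
    merged = set()
    for uri in from_uris:
        # Assumption of this sketch: an unknown URI yields an empty graph.
        # There are no blank nodes here, so plain set union is a valid merge.
        merged |= source.get(uri, set())
    return merged


uris = ["http://example.org/g1", "http://example.org/g2"]
fetched = default_graph(uris, dereference=True)
stored = default_graph(uris, dereference=False)
print(sorted(fetched))
print(sorted(stored))
```

The same two FROM clauses produce different default graphs under the two
behaviors, so the same query can return different results on different
servers.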

Regarding Andy's comment:

> The dataset description, FROM/FROM NAMED, is a declarative
> description of the dataset to be used for the query.  It does 
> not dictate how that dataset came about; that is an 
> implementation aspect.

That may be the way the spec currently views it, but I think that's a
mistake, because the whole point of standardization is to enable a user
to give the same query to two different implementations and get back
functionally equivalent results.  Leaving it ambiguous sounds like the
WG was unable to agree on which behavior should be considered correct,
and therefore allowed both, which is not good for users.

I guess another option would be to define a FETCH keyword that tells the
server to dereference the URI.

I also want to acknowledge that I think the situation has changed since
SPARQL 1.0, when SPARQL 1.1 Update had not yet been standardized.  Before
the LOAD operation had been standardized, I think the rationale for
fetching was stronger.  Furthermore, SPARQL 1.1 Update now defines the
notion of a graph store, whereas SPARQL 1.0 did not have such a notion.

Bottom line: If the working group feels strongly that the train has
already left the station on this one then I am willing to gracefully
accept defeat.  But I do think it is important enough to bring it up for
more discussion.


2. If this ambiguity is to remain in the spec, then it would be helpful
to explicitly acknowledge it and mention sd:DereferencesURIs, so that
readers are not left puzzled about which interpretation is correct.  For
example, even a non-normative comment like this would help:
[[
This specification is intentionally ambiguous about whether the SPARQL
server should fetch data by dereferencing the given graph URI or it
should merely use the graph URI as the name of a graph that is expected
to already exist in the graph store.  The Service Description property
sd:DereferencesURIs can be used to determine which behavior a SPARQL
server supports.  Furthermore, some SPARQL servers allow this behavior
to be controlled by a configuration option.
]]
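
As a rough illustration of how a client might act on such a note, the
sketch below checks a service description for the sd:DereferencesURIs
feature.  It assumes the server can return its service description as
N-Triples when the endpoint URL is dereferenced, which not every server
will do; the function names and sample data are hypothetical, and the
line-splitting check is a stand-in for a real RDF parser.

```python
import urllib.request

# Namespace of the SPARQL 1.1 Service Description vocabulary.
SD = "http://www.w3.org/ns/sparql-service-description#"


def advertises_dereferencing(ntriples_text):
    """Return True if some triple states  ?s sd:feature sd:DereferencesURIs .

    Naive whitespace split, adequate only for simple IRI-only triples;
    a real client should use a proper RDF parser.
    """
    wanted_p = f"<{SD}feature>"
    wanted_o = f"<{SD}DereferencesURIs>"
    for line in ntriples_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[1:] == [wanted_p, wanted_o, "."]:
            return True
    return False


def fetch_service_description(endpoint_url):
    """GET the endpoint URL with no query, asking for N-Triples.

    Assumption: the server honors this Accept header.  Not called below.
    """
    req = urllib.request.Request(
        endpoint_url, headers={"Accept": "application/n-triples"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


# Hypothetical sample service description, used instead of a live request.
sample = (
    "_:svc <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
    f"<{SD}Service> .\n"
    f"_:svc <{SD}feature> <{SD}DereferencesURIs> .\n"
)
print(advertises_dereferencing(sample))
```

Note that a server which does not advertise the feature may still
dereference URIs, so a negative result is only a hint.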


Thanks very much,
David


On Mon, 2012-06-18 at 22:00 -0400, Gregory Williams wrote:
> On Jun 2, 2012, at 11:42 PM, David Booth wrote:
> 
> > When a SPARQL query is issued to a server using the FROM <U> (or FROM
> > NAMED <U>) syntax, is the server supposed to fetch that graph from U or
> > is it supposed to only look among its existing named graphs for U?
> ...
> > These sections certainly make it sound like the server is supposed to
> > fetch from the URI (or perhaps use a cached version if it is fresh).
> > But I just tried this with three different SPARQL servers, and only one
> > fetches from the URI.  The others only look among their existing named
> > graphs.
> > 
> > There is a big difference between fetching and not fetching.  As a query
> > writer I need to know which behavior is correct.  
> 
> Hi David,
> 
> (This is a personal, and not a working-group-official, response.)
> 
> Beyond what Andy said in his response (regarding dataset construction
> being implementation specific), the Service Description vocabulary may
> provide some help in this situation. The sd:DereferencesURIs feature
> defined in the SD document[1] can be used to indicate that the
> implementation will dereference URIs used in the query when
> constructing the dataset.
> 
> Based in part on your comment, the working group is currently looking
> at the wording of that feature with regard to Update operations, but
> its definition/use should be considered stable regarding Query
> operations (as it was included based on the way existing
> implementations work).
> 
> thanks,
> .greg
> 
> [1] http://www.w3.org/TR/sparql11-service-description/#sd-dereferencesuris
> 
> 
> 

-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Friday, 22 June 2012 16:24:46 UTC