Re: FROM and FROM NAMED: To fetch or not to fetch? from Paul Gearon on 2012-06-22 (public-rdf-dawg-comments@w3.org from June 2012)

From: Paul Gearon <gearon@ieee.org>
Date: Fri, 22 Jun 2012 14:24:50 -0400
To: David Booth <david@dbooth.org>
Cc: Gregory Williams <greg@evilfunhouse.com>, public-rdf-dawg-comments <public-rdf-dawg-comments@w3.org>
Message-ID: <CAGZNPFn_N7L75=zZ8kq-+Y1yHwsWmQ4y2aa3Tu1K03xFwOYGbg@mail.gmail.com>

On Fri, Jun 22, 2012 at 1:43 PM, David Booth <david@dbooth.org> wrote:
> Hi Greg,
>
> On Fri, 2012-06-22 at 13:12 -0400, Gregory Williams wrote:
>> David,
>>
>> I suspect most of this will be addressed in a formal response, but I
>> wanted to briefly comment on this:

Similarly, I just wanted to comment...

>> On Jun 22, 2012, at 12:24 PM, David Booth wrote:
>>
>> > - The LOAD operation already provides a means of fetching, so (now that
>> > we have SPARQL Update) a second means of loading by use of FROM or FROM
>> > NAMED is redundant.
>>
>> I don't think LOAD makes a dereferencing FROM redundant at all. For
>> example, an implementation can do dereferencing to provide a
>> general-purpose query service (e.g. sparql.org) without providing (or
>> implementing) Update or any sort of persistent graph store. Even if
>> such a system *did* implement Update and have a persistent graph
>> store, though, a general purpose query service would be very
>> cumbersome for users as it would require two separate protocol
>> requests: one to load the data, and one to query the data (and assumes
>> nobody has dropped, cleared, or updated your data in-between the load
>> and the query).
>
> Yes, I guess that is a different use case, for which fetching behavior
> is quite handy.  But I still think the issue comes down to the fact that
> fetching is very different from not fetching, and the two behaviors
> should not share the same ambiguous syntactic directive.

I'm with Greg as well. In particular, a LOAD involves storing and
potentially indexing the data locally, which I may not want. Besides
using resources on the server, there is also the case of dynamically
generated RDF that may change over time. Also, many services provide
public query access but admin-only update access, meaning that I
can't explicitly LOAD the data even if I want to.

> For example, I could imagine that use case being handled by a FETCH
> keyword rather than a FROM keyword.

I quite like this proposal.

Right now Mulgara looks at FROM (or GRAPH, or USING, etc.) and checks
if the URI names a graph that is stored locally. If so, then that's
the graph used. If not, then it fetches the data dynamically. While
users have been happy with this, there are cases where it doesn't work
well. For instance, it is difficult to fetch a graph over HTTP if
there is a graph stored with the same URI (you could always use the IP
address, but that's messy and too much work for the user). It also
leads to different behavior depending on the current state of the
system (e.g. I thought I was fetching from a dynamic source, but
someone loaded it locally and now I'm silently getting the local
static data). The other issue is that this hybrid behavior isn't
described by service descriptions.
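
To make the state dependence concrete, here is a rough Python sketch of
that local-first lookup. It is not Mulgara's actual code: resolve_graph,
fake_fetch, the dict-based store, and the example URI are all made up
for illustration.

```python
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]


def resolve_graph(uri: str,
                  local_store: Dict[str, Graph],
                  fetch: Callable[[str], Graph]) -> Graph:
    """Hybrid lookup: use a locally stored graph if one exists, else fetch."""
    if uri in local_store:
        return local_store[uri]
    return fetch(uri)


def fake_fetch(uri: str) -> Graph:
    # Stand-in for an HTTP GET plus RDF parse (hypothetical helper).
    return {(uri, "ex:status", "live")}


store: Dict[str, Graph] = {}
uri = "http://example.org/data"

# Nothing is stored under this URI yet, so the data is fetched.
before = resolve_graph(uri, store, fake_fetch)

# Someone else loads a static copy under the same URI.
store[uri] = {(uri, "ex:status", "stale")}

# The identical call now silently returns the stored copy.
after = resolve_graph(uri, store, fake_fetch)
```

The same call with the same URI gives different answers depending on
what happens to be in the store, and nothing in the query says which
one you got.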

FETCH would allow Mulgara to support both approaches while making it
explicit what is happening.
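
As a rough illustration only: FETCH is just a proposed keyword, and I'm
assuming here that FROM would then mean "stored graph only" and FETCH
would mean "always dereference". The function and variable names are
hypothetical.

```python
from typing import Callable, Dict


def resolve_explicit(keyword: str,
                     graph_uri: str,
                     stored: Dict[str, str],
                     dereference: Callable[[str], str]) -> str:
    """Pick the data source from the keyword alone, never from store state."""
    if keyword == "FROM":
        # Assumption: FROM consults only the store; a miss is an error.
        if graph_uri not in stored:
            raise LookupError(f"no stored graph named <{graph_uri}>")
        return stored[graph_uri]
    if keyword == "FETCH":
        # Hypothetical keyword: always dereference, even if a copy is stored.
        return dereference(graph_uri)
    raise ValueError(f"unsupported keyword: {keyword}")


def fake_dereference(graph_uri: str) -> str:
    # Stand-in for an HTTP GET; a real system would parse RDF here.
    return f"live data from {graph_uri}"


graph_uri = "http://example.org/data"
stored_graphs = {graph_uri: "stored snapshot"}

from_result = resolve_explicit("FROM", graph_uri, stored_graphs, fake_dereference)
fetch_result = resolve_explicit("FETCH", graph_uri, stored_graphs, fake_dereference)
```

With the keyword carrying the intent, a stored graph with the same URI
no longer changes which source a query reads from.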

Unfortunately, it's very late in the SPARQL 1.1 process, so we may
have missed the boat. If so, then it would be worth getting it onto
the SPARQL 1.2 agenda.

Regards,
Paul Gearon

Received on Friday, 22 June 2012 18:25:20 UTC