Re: [Fwd: FROM keyword unnecessary?] from Kendall Clark on 2004-09-30 (public-rdf-dawg@w3.org from July to September 2004)

From: Kendall Clark <kendall@monkeyfist.com>
Date: Thu, 30 Sep 2004 09:40:46 -0400
To: "Seaborne, Andy" <andy.seaborne@hp.com>
Cc: kendall@monkeyfist.com, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <20040930134046.GG25264@monkeyfist.com>

On Thu, Sep 30, 2004 at 02:10:31PM +0100, Seaborne, Andy wrote:
> 
> Kendall,
> 
> Excellent discussion of the issues.  I found this very helpful in 
> explaining the design space.

Thanks, Andy.

Some replies below:

> I came up with 3 use models:
> 
> 1/ Protocol (HTTP/SOAP)
> 2/ Local query from an application program
> 3/ Queries as scripts in files
> 
> You cover 1/ below.  I'd note that URLs encode the query so there is a 
> single thing that can be passed around with query and target but %-encoded 
> URLs really are opaque and uneditable.

Yes, that's correct. I've mostly been thinking about (1), not so much
about (2) and (3).

I'll also point out that encoding queries in a URL is one way to pass
them around. I think there are some other ways that are interesting
and that we should take account of. I'll have more to say about them
in a few days.

I did some experiments to see what our URLs would look like and you
are totally correct: they are hideously ugly and opaque and
un-human-friendly in extremis. Python:

>>> import cgi
>>> print cgi.parse_qs(U)["q"][0]
PREFIX rdf: <http://foo.bar.com/#baz>
 SELECT ?a, ?b
 WHERE [(?a,  rdf:type, foo)]

>>> print U
q=PREFIX+rdf%3A+%3Chttp%3A%2F%2Ffoo.bar.com%2F%23baz%3E%0A+SELECT+%3Fa%2C+%3Fb%0A+WHERE+%5B%28%3Fa%2C++rdf%3Atype%2C+foo%29%5D&from=http%3A%2F%2Ffoo.bar+http%3A%2F%2Fbaz.bap%2FNumb%2Bnuts+http%3A%2F%2Fnono%23blither%3Bblap

Scary. :>
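
For anyone repeating the experiment with a current Python 3, where
cgi.parse_qs has been superseded by urllib.parse.parse_qs, a minimal
sketch of the round trip might look like the following. The variable
names are illustrative, and packing the target graphs into a single
space-separated "from" parameter simply mirrors the URL shown above; it
is an assumption, not a settled part of any protocol.

```python
from urllib.parse import parse_qs, urlencode

# The query and target graphs from the experiment above.
query = (
    "PREFIX rdf: <http://foo.bar.com/#baz>\n"
    " SELECT ?a, ?b\n"
    " WHERE [(?a,  rdf:type, foo)]"
)
graphs = [
    "http://foo.bar",
    "http://baz.bap/Numb+nuts",
    "http://nono#blither;blap",
]

# Assumption: the targets travel as one space-separated "from" parameter.
encoded = urlencode([("q", query), ("from", " ".join(graphs))])

# Decoding recovers the readable query and the list of targets.
decoded = parse_qs(encoded)
recovered_query = decoded["q"][0]
recovered_graphs = decoded["from"][0].split()

if __name__ == "__main__":
    print(encoded)
    print(recovered_query)
    print(recovered_graphs)
```

The encoded form is what gets passed around; only the decoded form is
something a person could reasonably read or edit.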

> For 2/, it is Phil's argument that FROM is not needed, which is possible.

Yes, for (2), I think selection of the query target can be pushed into
an app-specific API.

> 3/ is an argument for FROM because it makes a query self contained.
> Sometimes the script will include the FROM, sometimes it won't (i.e. be 
> reusable on different targets).  Having a query+FROM in a file is 
> convenient for maintenance, but it's just the convenience of the pairing.

I haven't thought at all about this use case.

> > 2. a query against more than one model, routed to a query processor
> >    service. In effect such a message says, "hey, query service, apply
> >    this query to these models", except that the model selection isn't
> >    included *in* the query proper.
> 
> I'm not sure we can be agnostic because of services that offer to query one 
> of several graphs.  I think we have to decide "service-centric" or 
> "graph-centric" for consistency in talking and writing about SPARQL.

Hmm, I disagree. I have sketched out an abstract protocol with 6
orthogonal methods: we can specify and require all 6 or any mixture of
the 6 with the rest being optional or not specified at all. Of these
6, I'm sure that 5 of them can be concretely realized as being applied
either to a graph URI or to a query processor service URI. I think the
6th can be, too, but I need to think more about it.

In the concrete, HTTP, part of the protocol, I've sketched out HTTP
operations for 5 of the 6 methods, and for each method, I've got an
HTTP operation for the method against a service URI and an operation
for it against a graph URI.

So, in the strictest sense, we don't have to choose, IMO. And given
that I intend to formalize all of this using WSDL 2.0, we really will
have a clean way of talking about *abstract* protocol operations in a
way completely independent of how they are concretely implemented.

(At least, that's my goal, and I'm about 50% of the way there. Don't
mean to be coy, but I'd like to have a decent, coherent draft to show
folks before saying anything else. Promise I won't tease like this
again! :>)

Not having to make this choice has certain benefits, including not
constraining the URIspace of implementations. I think that's a huge
win. Standardizing on a graph-centric or service-centric view is just
too much of a constraint on existing and future implementations,
especially if we don't have to choose.

> > 2. The really degenerate case, IMO, is a query against multiple
> >    models which is routed to a model URI. This, in effect, treats a
> >    URI identifying a model resource as if it were a URI identifying a
> >    service. This case really offends me. :>
> 
> Offends me too.

Good; glad we agree about that. :>

> >The simplest semantic for queries where some models are identified in
>a FROM clause and other models are identified in the protocol is an
> >additive semantic; that is, you gather all the models identified and
> >treat them all as query targets.
> 
> Protocol-overriding-query is also a possibility.

Yes. The possibilities seem to be:

(1) additive -- all graphs identified in the protocol or in the query
    are targets of/for the query

(2) query overrides protocol

(3) protocol overrides query
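
To make the difference between the three concrete, here is a rough
sketch of how a query processor might pick its targets under each
policy. Everything in it is hypothetical: the Policy names, the
resolve_targets function, and in particular the assumption that an
"override" only takes effect when the overriding side actually names at
least one graph.

```python
from enum import Enum


class Policy(Enum):
    ADDITIVE = "additive"
    QUERY_OVERRIDES = "query overrides protocol"
    PROTOCOL_OVERRIDES = "protocol overrides query"


def resolve_targets(protocol_graphs, from_graphs, policy):
    """Return the graph URIs a query should run against."""
    protocol_graphs = list(protocol_graphs)
    from_graphs = list(from_graphs)
    if policy is Policy.ADDITIVE:
        # Union of both sources, order preserved, duplicates dropped.
        return list(dict.fromkeys(protocol_graphs + from_graphs))
    if policy is Policy.QUERY_OVERRIDES:
        # Assumption: fall back to the protocol when FROM names nothing.
        return from_graphs or protocol_graphs
    if policy is Policy.PROTOCOL_OVERRIDES:
        # Assumption: fall back to FROM when the protocol names nothing.
        return protocol_graphs or from_graphs
    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    # Hypothetical graph URIs, for illustration only.
    in_protocol = ["http://example.org/g1", "http://example.org/g2"]
    in_query = ["http://example.org/g2", "http://example.org/g3"]
    for policy in Policy:
        print(policy.value, resolve_targets(in_protocol, in_query, policy))
```

Only the additive policy can end up querying graphs that neither side
named on its own as the full target set, which is part of why it is the
simplest to explain to users.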

> >6. Multi model queries against a model:
> >
> >   GET /model?<query> + identification of other models
> >
> >   Illegal. I would disallow this one because it, in effect, is a
> >   confused case of (5) and because I think it doesn't make sense.
> 
> I agree - this should be illegal, but then I think we have to decide on
> service-centric vs model-centric.

Hmm, those seem to be totally orthogonal issues, IMO.

> A possibility is not having FROM in protocol queries (or, better, ignoring 
> them) but keeping it for local use if the local query processor wishes to
> provide the facility.

Yes, that's possible. But might it confuse users?

> Implementation experience:
> 
> - Joseki ignores FROM.  Multiple FROM's don't make sense.  I'd assumed
>   that aggregations are important enough to have their own URI.

I agree in general that aggregations are that important; but this
ignores ad hoc queries against the 3 new FOAF graphs my bluetooth
phone *just* discovered, which I want to query as an ad hoc
aggregation. Edd Dumbill has code for discovering FOAF over Bluetooth,
so this is totally a right-now use case. It also ignores ad hoc
aggregations over 3rd party graphs I don't really control; say, the
only thing I can do is GET them. We have a use case in the UC&R for
exactly this case.
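
As a rough illustration of what an ad hoc aggregation amounts to, the
sketch below models each graph as a plain set of (subject, predicate,
object) triples, merges whatever graphs were just fetched, and runs one
pattern over the result. The aggregate and match helpers and the sample
data are hypothetical, and the sketch ignores blank-node renaming,
which a real RDF merge would have to handle.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


def aggregate(*graphs):
    """Merge graphs, each modelled as a set of (s, p, o) triples."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


if __name__ == "__main__":
    # Hypothetical stand-ins for three graphs that were just discovered.
    g1 = {("http://example.org/a#me", FOAF_NAME, "Alice")}
    g2 = {("http://example.org/b#me", FOAF_NAME, "Bob")}
    g3 = {("http://example.org/c#me", FOAF_NAME, "Carol")}
    everyone = aggregate(g1, g2, g3)
    for s, _, o in sorted(match(everyone, p=FOAF_NAME)):
        print(s, o)
```

Nothing here needs the aggregation to have a URI of its own; it exists
only for the lifetime of the query.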

Cheers, Kendall
Received on Thursday, 30 September 2004 13:43:14 UTC