Re: Fwd: Protocol extensions for federated querying

Paul Gearon wrote:
> Hi everyone,
> 
> Accidentally sent this to the sparql-dev list. Forwarded to the
> correct place now with apologies....
> 
> This meets the commitment I made for ACTION-124.
> 
> So far, all the comments I've seen on federated queries have been
> about the suggested query syntax. To date I'm in agreement with what
> I've seen proposed.
> 
> I am also interested in extending the protocol to support federation a
> little better. At the moment, all queries are done as a simple request
> via a GET or a POST. In the case of POST, the endpoint alone is
> provided in the URL, and the query appears in the body.
> 
> I'd like to see a form of POST that includes a SPARQL variable binding
> result in the body (a la http://www.w3.org/TR/rdf-sparql-XMLres/). In
> this way the receiving query engine can work with prebindings that are
> provided to it, allowing it to reduce the result that is to be
> streamed back to the calling engine.
> 
> To give an example, I'll reference the two datasets found in 8.3 of
> the SPARQL Query Language document:
>  http://www.w3.org/TR/rdf-sparql-query/#queryDataset
> 
> If we make the presumption that the named graph
> http://example.org/foaf/aliceFoaf can be found at
> http://sparql.org/sparql/, then I might want to issue the following
> query to get the names of people whose nicknames are in the bobFoaf
> graph:
> 
> SELECT ?nick ?name
> FROM <http://example.org/foaf/bobFoaf>
> WHERE {
>  ?p1 foaf:nick ?nick .
>  ?p1 foaf:mbox ?mbox
>  SERVICE <http://sparql.org/sparql/> {
>    SELECT ?mbox ?name
>    FROM <http://example.org/foaf/aliceFoaf>
>    WHERE { ?p2 foaf:mbox ?mbox . ?p2 foaf:name ?name }
>  }
> }
> 
> 
> The part of the query in the SERVICE block would usually return the following:
> <?xml version="1.0"?>
> <sparql xmlns="http://www.w3.org/2005/sparql-results#">
>  <head>
>    <variable name="mbox"/>
>    <variable name="name"/>
>  </head>
>  <results>
>    <result>
>      <binding name="mbox"><uri>mailto:alice@work.example</uri></binding>
>      <binding name="name"><literal>Alice</literal></binding>
>    </result>
>    <result>
>      <binding name="mbox"><uri>mailto:bob@work.example</uri></binding>
>      <binding name="name"><literal>Bob</literal></binding>
>    </result>
>  </results>
> </sparql>
> 
> Note that this is information for both Bob and Alice. This can then be
> joined to the remainder of the query, which reduces the results to
> just Bob.
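
As a minimal sketch of the join described above, assuming Python with only the standard library: the SERVICE result quoted in the message is parsed and joined on ?mbox with one local solution. The local row (nick "Robert") and the helper names `parse_results` and `join` are illustrative assumptions, not part of the proposal.

```python
import xml.etree.ElementTree as ET

SR = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Result of the SERVICE block, as quoted in the message.
REMOTE_XML = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
 <head>
   <variable name="mbox"/>
   <variable name="name"/>
 </head>
 <results>
   <result>
     <binding name="mbox"><uri>mailto:alice@work.example</uri></binding>
     <binding name="name"><literal>Alice</literal></binding>
   </result>
   <result>
     <binding name="mbox"><uri>mailto:bob@work.example</uri></binding>
     <binding name="name"><literal>Bob</literal></binding>
   </result>
 </results>
</sparql>"""

# Assumed local solutions for ?nick and ?mbox from the bobFoaf graph.
LOCAL_ROWS = [{"nick": "Robert", "mbox": "mailto:bob@work.example"}]


def parse_results(xml_text):
    """Turn a SPARQL XML result document into a list of {variable: value} rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", SR):
        row = {}
        for binding in result.findall("sr:binding", SR):
            # The first child is <uri>, <literal> or <bnode>; this sketch
            # keeps only its text and drops the term type.
            row[binding.get("name")] = binding[0].text
        rows.append(row)
    return rows


def join(left, right):
    """Nested-loop join: keep pairs of rows that agree on all shared variables."""
    joined = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined


remote_rows = parse_results(REMOTE_XML)
final_rows = join(LOCAL_ROWS, remote_rows)
```

Both remote rows cross the wire here, and only the row for Bob survives the join.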
> 
> However, a query engine may instead want to evaluate Bob first. This
> may be desirable if some COUNT queries have already been issued, and
> the query engine knows that the SERVICE block will return a large
> number of results, while the local data would bind
> ?mbox to only a few values. In that case, the local binding of ?mbox
> could be sent along with the query (?p1 and ?nick are not necessary
> for the remote service). This could be accomplished using a POST that
> has the query in the URL, and the bindings in the body.
> 
> POST /sparql/?query=SELECT+%3Fmbox+%3Fname+FROM+%3Chttp%3A%2F%2Fexample.org%2Ffoaf%2FaliceFoaf%3E+WHERE+%7B+%3Fp2+foaf%3Ambox+%3Fmbox+.+%3Fp2+foaf%3Aname+%3Fname+%7D HTTP/1.1
> Content-Length: xxxxxx
> Content-Type: multipart/form-data; boundary=ZpwZZc62ZXXjf0InvlrBjTWNrJSp--FL
> Host: sparql.org
> Connection: Keep-Alive
> User-Agent: example
> 
> --ZpwZZc62ZXXjf0InvlrBjTWNrJSp--FL
> Content-Disposition: form-data; name="query-prebinding"
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> <?xml version="1.0"?>
> <sparql xmlns="http://www.w3.org/2005/sparql-results#">
>  <head>
>    <variable name="mbox"/>
>  </head>
>  <results>
>    <result>
>      <binding name="mbox"><uri>mailto:bob@work.example</uri></binding>
>    </result>
>  </results>
> </sparql>
> 
> --ZpwZZc62ZXXjf0InvlrBjTWNrJSp--FL--
> 
> With this pre-binding, the remote query engine is able to reduce its
> results to just the one for Bob, thereby cutting the returned size
> down by nearly half.
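
A minimal sketch of how a calling engine might assemble such a request, assuming Python with only the standard library. The function name `build_prebinding_request` is hypothetical; the part name "query-prebinding" and the part headers are taken from the example in the message. Nothing is sent over the network: the function only returns the request target, headers and body.

```python
from urllib.parse import urlencode

QUERY = ("SELECT ?mbox ?name FROM <http://example.org/foaf/aliceFoaf> "
         "WHERE { ?p2 foaf:mbox ?mbox . ?p2 foaf:name ?name }")

# The pre-binding for ?mbox, in the SPARQL XML result format.
PREBINDING_XML = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
 <head>
   <variable name="mbox"/>
 </head>
 <results>
   <result>
     <binding name="mbox"><uri>mailto:bob@work.example</uri></binding>
   </result>
 </results>
</sparql>"""


def build_prebinding_request(query, bindings_xml, boundary, path="/sparql/"):
    """Return (target, headers, body) for a POST carrying the query in the
    URL and the pre-bindings as a single multipart/form-data part."""
    # urlencode percent-encodes the query and turns spaces into '+'.
    target = path + "?" + urlencode({"query": query})
    parts = [
        "--" + boundary,
        'Content-Disposition: form-data; name="query-prebinding"',
        "Content-Type: text/plain; charset=UTF-8",
        "Content-Transfer-Encoding: 8bit",
        "",
        bindings_xml,
        "--" + boundary + "--",  # closing delimiter
        "",
    ]
    body = "\r\n".join(parts).encode("utf-8")
    headers = {
        "Content-Type": "multipart/form-data; boundary=" + boundary,
        "Content-Length": str(len(body)),
    }
    return target, headers, body


target, headers, body = build_prebinding_request(
    QUERY, PREBINDING_XML, "ZpwZZc62ZXXjf0InvlrBjTWNrJSp--FL")
```

The boundary is passed in rather than generated so that the output is reproducible; a real client would have to pick one that does not occur in the payload.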
> 
> One potential issue is very long queries, which also need to be
> placed in the body of a POST. In that case we could simply define
> a name for each part of the body (in the example above I've used the
> name "query-prebinding").
> 
> What do others think? Does this proposal have merit?

Hi Paul,

Do any systems that you know of implement this extension?

I'm wary of undertaking significant changes that weren't necessarily part
of our deliverables-gathering phase early on. That said, I say
"necessarily" since of course this might be seen by the group as an
essential part of "basic federated query", in which case it does indeed
fit within our charter.

Lee

Received on Wednesday, 21 October 2009 13:09:34 UTC