SPARQLProxy

[cc www-archive]

We talked about this the other day; I don't think there was a
conclusion.

I want a way to find one or more SPARQL endpoints that can answer
queries about some RDF graph, for when the graph is very large.  Folks
providing large graphs may choose to provide this service; folks using
such graphs may choose to use it.  If they both use it, and the graph is
large, there may be significant performance gains.

I propose the following practice:

   1.  Optionally, provide an HTTP header (in response to GET and HEAD):

         SPARQL-Proxy: ...url-of-endpoint...

       Clients may choose to do a HEAD first, or to do the GET and
       simply abort the connection and switch to using SPARQL when they
       notice this header.  The choice to switch to SPARQL might be
       based in part on the value of the Content-Length header, if
       present.

       (I don't remember the current political climate around making up
       HTTP headers like that.  My sense is it's okay.)

   2.  Optionally, near the beginning of an RDF graph, include a triple:
 
         <> ns:SPARQLProxy <...url-of-endpoint...>

       This functions just like the HTTP header on a GET request; you're
       getting the graph transmitted to you, but if it's big, you'll see
       this before you get it all, and then you can abort the
       connection.
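
To make the client side of both options concrete, here is a rough
Python sketch, not a definitive implementation.  It assumes the header
is spelled SPARQL-Proxy as above, that the graph arrives in a
Turtle-like syntax, and that a 10 MB cutoff is a sensible switch point
(that number is made up).  The function names are hypothetical, and
the regular expression is a stand-in for a real streaming parser.

```python
import re
from typing import Mapping, Optional

# Assumption: illustrative cutoff only; the proposal does not name one.
LARGE_GRAPH_BYTES = 10_000_000

# Assumption: matches "<> prefix:SPARQLProxy <url>" or the same triple
# with the predicate written as a full IRI ending in "SPARQLProxy".
_PROXY_TRIPLE = re.compile(
    r"<>\s+(?:\w*:SPARQLProxy|<[^>]*SPARQLProxy>)\s+<([^>\s]+)>"
)


def choose_endpoint(headers: Mapping[str, str],
                    threshold: int = LARGE_GRAPH_BYTES) -> Optional[str]:
    """Option 1: return the endpoint to switch to, or None to keep the GET."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    endpoint = lowered.get("sparql-proxy")
    if not endpoint:
        return None
    length = lowered.get("content-length")
    # A known, small Content-Length means a plain download is cheap enough.
    if length is not None and length.isdigit() and int(length) < threshold:
        return None
    # Policy assumption: with no usable Content-Length, prefer the endpoint.
    return endpoint


def find_proxy_triple(chunk: str) -> Optional[str]:
    """Option 2: look for the proxy triple in the first bytes of the graph."""
    match = _PROXY_TRIPLE.search(chunk)
    return match.group(1) if match else None
```

A client would call choose_endpoint() on the HEAD or GET response
headers, and find_proxy_triple() on the first chunk of the body before
deciding whether to abort the transfer.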

The idea here is that the named endpoint would have this graph (called
<> above) as one of its named graphs.  It might or might not be the
background graph.
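
Once a client has switched, it would address the graph by name at the
endpoint.  A minimal sketch of that step, assuming the graph's own URL
is used as the named-graph IRI and that the endpoint accepts the usual
"query" parameter on a GET; the function names and example.org URLs
are placeholders:

```python
from urllib.parse import urlencode


def query_for_graph(graph_iri: str, pattern: str = "?s ?p ?o",
                    limit: int = 10) -> str:
    """Build a SELECT query restricted to one named graph."""
    # Refuse characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in graph_iri for ch in '<>"{}|^`\\ '):
        raise ValueError("graph IRI contains characters not allowed in <...>")
    return (
        f"SELECT * WHERE {{ GRAPH <{graph_iri}> {{ {pattern} }} }} "
        f"LIMIT {limit}"
    )


def request_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET request URL for the endpoint."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    q = query_for_graph("http://example.org/big.rdf")
    print(request_url("http://example.org/sparql", q))
```

If the graph happens to be the endpoint's background graph, the GRAPH
wrapper could be dropped, but the client cannot know that from the
header or triple alone.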

In option 2, there might be some metadata about the endpoint BEFORE this
triple, such as how to turn on inference, or something.  I'm not trying
to address that here.

I think this is a very simple and harmless approach, and meets my needs
well enough.  Do you see any problem with it?  Any suggestions for the
namespace to use?

     -- Sandro

Received on Saturday, 1 August 2009 19:27:47 UTC