Re: Indirect Graph Identification

On Mon, 2011-11-28 at 13:55 -0500, James Leigh wrote:
> Hello,
> 
> Please consider using a more general, well-known, indirect graph
> identification[1] URI pattern.
> 
> The cases stated for requiring indirect graph identification are shared
> by the Linked Data community for indirect resource identification[2].
> Linked Data is often mirrored for the purposes of creating
> visualizations of the data, merging some or all of the data with data
> from other sources and/or enhancing responsiveness to queries.
> 
> For your convenience, I have copied the indirect graph cases from
> rdf-update[1] below.
> 
>       * the naming authority associated with the URI of an RDF graph in
>         a Graph Store is not the same as the server managing the
>         identified RDF content
>       * the naming authority is not available
>       * the URI is not dereferenceable (i.e., when dereferenced, it does
>         not produce an RDF graph representation)
> 
> Replacing "RDF graph" above with "RDF resource", the same cases are
> equally a challenge for managing RDF data in the Linked Data community.
> 
> I propose that this working group consider using a more general URI
> pattern that could be equally applied to both RDF graph storage and RDF
> resource resolution.
> 
> Such a general pattern should use a well-known[3] path prefix to allow
> clients to infer the identified graph or resource without resolution.
> The request-URI below could be recognized by both clients and servers as
> identifying the graph with the identifier of
> "http://www.example.com/other/graph".
> 
>    GET /.well-known/alias;http%3A//www.example.com/other/graph HTTP/1.1
>    Host: example.com
>    Accept: application/rdf+xml

I'm a big fan of .well-known, but I don't really see how it helps us
here.

The situation here, it seems to me, is that SPARQL has created little
pocket universes where certain IRIs (the Graph IRIs) are used in close
proximity to RDF, tagging RDF Graph Containers, but where each IRI has
its own local meaning.  On a single host, you could have 1000 SPARQL
endpoints, each of which uses http://www.example.com/other/graph as the
label for entirely unrelated RDF Graph Containers!

With indirect identification, those graph containers each get nice,
normal, RDF IRIs again.  Let's say you, me, and David each have SPARQL
stores on the same host, and we all use that IRI to tag OUR OWN COPY of
some data fetched from that URL.  (I think it's a terrible practice,
but it's clear from discussions in the RDF WG that people like it and we
can't stop it.)

So, if/when people engage in this unfortunate practice, everyone can
still refer to each copy of the data unambiguously, as if they hadn't,
by using indirect graph identification, like this:

  http://www.example.com/sandro/sparql?graph=http%3A//www.example.com/other/graph
  http://www.example.com/james/sparql?graph=http%3A//www.example.com/other/graph
  http://www.example.com/david/sparql?graph=http%3A//www.example.com/other/graph

I guess part of what I'm saying is that this indirection is different,
because it's dealing with a different problem.  It's not dealing with
things being non-dereferenceable or whatever; it's dealing with the fact
that SPARQL endpoints have the right to make Graph IRIs mean whatever
they want, so those IRIs can only be understood/used by anyone else when
they've been paired with the endpoint's own URL.  Indirect
identification does that pairing.
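
To make the pairing concrete, here is a rough sketch in Python of how
the three URLs above could be built and taken apart again.  The function
names are made up for illustration, and it assumes the endpoint accepts
the graph IRI in a "graph" query parameter, as in the examples above.

```python
from typing import Tuple
from urllib.parse import parse_qs, quote, urlsplit


def indirect_graph_url(endpoint: str, graph_iri: str) -> str:
    """Pair an endpoint URL with a Graph IRI (hypothetical helper)."""
    # quote() leaves "/" unescaped by default, so only the ":" becomes
    # %3A, which matches the form of the example URLs in this message.
    return f"{endpoint}?graph={quote(graph_iri)}"


def split_indirect_graph_url(url: str) -> Tuple[str, str]:
    """Recover (endpoint URL, Graph IRI) from a paired URL."""
    parts = urlsplit(url)
    # parse_qs() percent-decodes the value of the "graph" parameter.
    graph_iri = parse_qs(parts.query)["graph"][0]
    endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return endpoint, graph_iri


GRAPH = "http://www.example.com/other/graph"
urls = [
    indirect_graph_url(f"http://www.example.com/{user}/sparql", GRAPH)
    for user in ("sandro", "james", "david")
]
```

The same Graph IRI yields three different URLs, one per endpoint, and
each one splits back into the endpoint that gives the IRI its meaning
plus the IRI itself.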

    -- Sandro

> Thanks,
> James
> 
> [1] http://www.w3.org/TR/sparql11-http-rdf-update/#indirect-graph-identification
> [2] http://www.w3.org/2011/09/LinkedData/ledp2011_submission_10.pdf
> [3] http://tools.ietf.org/html/rfc5785

Received on Tuesday, 29 November 2011 05:12:38 UTC