Re: Graph labels vs. graph names. (was: Re: complete graphs)

On 5 Oct 2011, at 17:50, Sandro Hawke wrote:

> On Wed, 2011-10-05 at 15:54 +0100, Steve Harris wrote:
>> On 5 Oct 2011, at 14:21, Sandro Hawke wrote:
>> 
>>> On Wed, 2011-10-05 at 13:22 +0100, Richard Cyganiak wrote:
>>>> 
>>>> Implementers and users of SPARQL seem to be generally perfectly ok
>>>> with relying on private conventions. 
>>> 
>>> What sort of private conventions have you seen?    I've heard people
>>> talk about:
>>> 
>>> 1. graph tag is the URL they once fetched the graph from
>>> 2. graph tag is the URL on which they publish the graph
>>> 3. graph tag is some sort of non-dereferenceable genid
>>> 4. graph tag is primary subject URI of the graph (eg the person, for
>>> FOAF)
>> 
>> In Garlik we use all of these, and also use "graph tag is the URI we derived data from, with the ISO date appended as a fragment identifier" when mining data from web pages, PDF docs, etc.
>> 
>> e.g. <http://plugin.org.uk/#2011-10-05>
>> 
>> This allows us to track changes over time, which would otherwise be difficult.
> 
> Where "difficult" means you'd need to handle another level of
> indirection, right?   That's a complexity hit and a performance hit.
> Doable, but ... so far ... you don't see any reason to do it.

Correct.

There's also the issue of human-friendliness: we could name the graphs like <http://foo.example/0c32f6b0-ef7a-11e0-be50-0800200c9a66>, but there's no real benefit.
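
To make the two naming options concrete, here is a minimal Python sketch. The function names and the `graph_index` mapping are hypothetical, not anything we actually run, and dropping a pre-existing fragment from the source URI is an assumption made only to keep the example simple.

```python
from datetime import date
from urllib.parse import urldefrag
import uuid


def dated_graph_iri(source_uri: str, derived_on: date) -> str:
    """Source URI plus the ISO date as a fragment identifier."""
    # Assumption: any fragment already on the source URI is discarded.
    base, _ = urldefrag(source_uri)
    return f"{base}#{derived_on.isoformat()}"


def opaque_graph_iri(namespace: str = "http://foo.example/") -> str:
    """Opaque alternative: a fresh UUID under some namespace."""
    return f"{namespace}{uuid.uuid4()}"


graph = dated_graph_iri("http://plugin.org.uk/", date(2011, 10, 5))
print(graph)  # http://plugin.org.uk/#2011-10-05

# With an opaque name, the source and date have to live in a side table:
# this is the extra level of indirection discussed above.
opaque = opaque_graph_iri()
graph_index = {opaque: ("http://plugin.org.uk/", date(2011, 10, 5))}
source, derived_on = graph_index[opaque]
```

With the dated form, the source and date can be read straight off the graph name; with the opaque form, every such question costs an extra lookup.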

> That is, you could always make up a new type-2 or type-3 IRI to use as
> the graph tag, and maintain a mapping between that and whatever else you
> might have wanted to use as a graph tag.
> 
>>> It seems to me the variation here is an impediment to interoperability.
>>> If my code talks to a new sparql server, and doesn't know which of these
>>> conventions is being used, how can it do its job?   (Feel free to
>>> replace "talks to a new sparql server" with "fetches a TriG document",
>>> etc.)
>> 
>> I don't understand the rationale for wanting to normalise behaviour here. We don't mandate a particular structure for subject URIs, for example. In FOAF files I can use any URI I feel like to describe people, e.g. <#me>, <#i>, <http://alice.example/>. Doesn't seem to cause any significant interoperability problems.
> 
> This brings us back to use cases.   Is there something we want to
> achieve with some additional interoperation?
> 
> One thing I'd like is to be able to follow some information as it moves
> through various systems.  Provenance; an audit trail.   Where was it
> fetched from, how has it been processed, where does it end up.  Do you
> guys do anything like that with your data?   Do graph IDs play a role?

Yes, inside the <http://plugin.org.uk/#2011-10-05> graph, we have some triples with the subject <http://plugin.org.uk/#2011-10-05> which say a little about when, how, and why the graph was generated.

These are distinct from triples with the subject <http://plugin.org.uk/>, which describe some things about the page that were true at the time.

Nothing terribly sophisticated though, and I'm not sure if the distinction is really important for our use case.
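
As a rough illustration of that layout, the Python sketch below prints three N-Quads statements, all placed in the dated graph. The Dublin Core predicates and the literal values are assumptions chosen for the example, not the vocabulary actually used; `nquad` is a hypothetical helper that does no literal escaping.

```python
GRAPH = "http://plugin.org.uk/#2011-10-05"  # graph name, also a subject
PAGE = "http://plugin.org.uk/"              # the page the data came from
DCT = "http://purl.org/dc/terms/"           # assumed vocabulary


def nquad(s: str, p: str, o: str, g: str, literal: bool = False) -> str:
    """Format one N-Quads statement; assumes o needs no escaping."""
    obj = f'"{o}"' if literal else f"<{o}>"
    return f"<{s}> <{p}> {obj} <{g}> ."


quads = [
    # About the graph itself: when and from what it was generated.
    nquad(GRAPH, DCT + "created", "2011-10-05", GRAPH, literal=True),
    nquad(GRAPH, DCT + "source", PAGE, GRAPH),
    # About the page: something that was true at the time.
    nquad(PAGE, DCT + "title", "Example page title", GRAPH, literal=True),
]

for q in quads:
    print(q)
```

The first two statements have the graph IRI as subject; the third has the page IRI as subject, so the two kinds of description stay separable even though they share a graph.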

- Steve

Received on Wednesday, 5 October 2011 17:49:20 UTC