Re: Using named graphs to model PROV's Accounts from Andy Seaborne on 2011-10-12 (public-rdf-prov@w3.org from October 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Wed, 12 Oct 2011 08:38:49 +0100
To: Timothy Lebo <lebot@rpi.edu>
CC: public-rdf-prov@w3.org
Message-ID: <4E954409.1090501@epimorphics.com>
On 12/10/11 05:43, Timothy Lebo wrote:
>
> On Oct 11, 2011, at 3:32 PM, Andy Seaborne wrote:
>
>>
>>
>> On 11/10/11 19:11, Timothy Lebo wrote:
>>> rdf-prov,
>>>
>>> In preparation for the RDF WG F2F this week, I wanted to provide some discussion on using named graphs to address some provenance modeling.
>>>
>>> I have updated http://www.w3.org/2011/prov/wiki/Using_named_graphs_to_model_Accounts to reflect some feedback and extend the discussion on named graphs.
>>>
>>> In particular, I discuss:
>>>
>>> * reuse of the SPARQL Service Description vocabulary to describe named graphs.
>>> * Meta Named Graph pairs,
>>> * a simple application of these to create Cache Graphs
>>> * the importance of modeling the "location" of a graph to disambiguate many graphs with the same name.
>>>
>>> These components are needed to model PROV's notion of Accounts, which permit different agents to assert different views of the same "event" (i.e., ProcessExecution). I hope to wrap up all of this into a final proposal by the end of the week.
>>>
>>> Any suggestions or comments appreciated.
>>
>
> Thanks for taking a look, Andy.
> I really appreciate your time and consideration.
>
>>
>> As a principle (of AWWW), one name can only refer to one thing.
>
> Absolutely. The problem that I'm trying to highlight is that traditional named graph modeling (from what I have seen) has been "lazy" in the choice of URIs used to name them, inadequately assuming a local scope when referencing them. This laziness _is_ violating the AWWW's only-one-referent principle.

We agree then that inappropriate naming is a problem.  I think that the 
inappropriateness is as much whether the naming is "fit for purpose" 
(application purpose).  Reuse of information by someone else may be 
stretching or breaking the original purpose - result: bad naming.  The 
original naming scheme may well have been adequate for the original 
publisher.

I tried to describe how an application that really wants to track 
changes might go about naming of the significant concepts: it does not 
rely on the publisher doing anything (Sandro has written up the version 
where the publisher publishes in a way that makes the state at a 
particular time explicit):

The write-up was rather rushed I'm afraid:

http://lists.w3.org/Archives/Public/public-rdf-wg/2011Oct/0148.html

(There is no new idea in the description - it's entirely other people's 
ideas written up badly)

To compare to N3: log:includes is the relationship of a location and its 
contents.  It's at a point in time, when the application rules run.  To 
capture the possibility of observations at different times, each 
observation generates a URI and makes claims about the observation.
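
A minimal sketch of that last point, in Python rather than N3. Everything here is an assumption made for illustration: the names GBox, Observation and observe, and the urn:example: URIs, are hypothetical and do not come from N3, SPARQL, or any RDF library. It only shows the shape of the idea: the location keeps one name, its contents can change, and each look at it mints a fresh URI that carries the claims (where, when, what was seen).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count


@dataclass
class GBox:
    """A mutable location (g-box) that holds one graph value at a time."""
    location: str
    content: frozenset = frozenset()

    def replace(self, triples):
        # The graph value (g-snap) is an immutable set of triples;
        # changing the box means swapping in a different set.
        self.content = frozenset(triples)


@dataclass(frozen=True)
class Observation:
    """Claims made about one look at a location at one point in time."""
    uri: str
    location: str
    observed_at: datetime
    snapshot: frozenset


_ids = count(1)  # hypothetical URI minting scheme, for illustration only


def observe(box, when=None):
    """Mint a new URI for this observation and record what was seen."""
    when = when or datetime.now(timezone.utc)
    uri = f"urn:example:observation:{next(_ids)}"
    return Observation(uri, box.location, when, box.content)


# The same location observed twice, with a change in between.
box = GBox("http://example.org/card")
box.replace({("ex:s", "ex:p", "ex:o1")})
first = observe(box)
box.replace({("ex:s", "ex:p", "ex:o2")})
second = observe(box)
```

The two observations share a location but have different URIs and different snapshots, so neither name is asked to refer to more than one thing.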

	Andy

>
> On the practical side, there is some niceness to naming a local graph with the same name as a graph somewhere else. For example, it's pretty self-evident that my g-box named <http://www.w3.org/People/Berners-Lee/card> is going to have something to do with another g-box named <http://www.w3.org/People/Berners-Lee/card>, or even the g-snap you stumble upon when you resolve <http://www.w3.org/People/Berners-Lee/card>.
>
> But the straightforwardness of multiple g-boxes with the same name comes with a cost - eyesore URIs.
> Some notes on a generic approach to create unique URIs for the g-boxes are at [1], which proposes to tuck a DESCRIBE <NG> into a SPARQL endpoint's namespace. Ugly, but general.
> (The writeup isn't as clear as I'd like it to be, and for that I apologize)
>
>>
>> "graph" here seems to refer to graph-a-location but also "graph the contents of the location".  But those are different things.
>
> I agree, they are different.
> Is there a better discussion of this distinction, so that I may reference it and reconcile my discussion?
>
>>
>> The RDF-WG has the concept of "graph box" (g-box) which is a thing that holds one "graph-value" (g-snap - snapshot).
>
> I'm happy to adopt this terminology, as it is quite intuitive.
>
>
>>
>> In RDF a graph is a set of triples and as a set it cannot change. ("Set" as in the mathematical kind, not the programming language mutable data structure).
>
>
> I'm not sure I've challenged what an RDF graph is; I'm targeting some challenges with the multiple aspects of what a _named_ graph is.
> Did I say something that conflicts with the notion of a "vanilla" RDF graph? If so, please let me know where so that I can smooth that out.
>
>
> Thanks again for your responses. Using your terminology really helps clarify things.
>
> Regards,
> Tim Lebo
>
>
>
> [1] https://github.com/timrdf/csv2rdf4lod-automation/wiki/Naming-sparql-service-description%27s-sd:NamedGraph
Received on Wednesday, 12 October 2011 07:39:27 UTC