Re: Towards PROV-O Accounts from Timothy Lebo on 2012-01-05 (public-prov-wg@w3.org from January 2012)

From: Timothy Lebo <lebot@rpi.edu>
Date: Thu, 5 Jan 2012 10:58:07 -0500
To: Graham Klyne <GK@ninebynine.org>
Cc: Provenance Working Group WG <public-prov-wg@w3.org>
Message-Id: <9824F220-75A4-4D98-88D7-6D4F40CE1FE9@rpi.edu>
On Jan 5, 2012, at 3:34 AM, Graham Klyne wrote:

> Hi Tim,
> 
> I took a quick look at this (your [1]), and I was OK with the basic structure used, but I'm not understanding why there is so much focus on a name for the abstract triples as opposed to a user-supplied name.

The emphasis on the name of the abstract triples is needed to be clear about what, exactly, is being named. 
The file path (i.e., location) is not being named, nor is the serialization.
The content is being named. **Users _may_ choose whatever name they wish** -- as long as they are naming the content and not its more concrete manifestations.
My discussion used an RDF graph hash to name the content -- merely to be clear about what I'm naming. That said, naming with hashes would be a good best practice.

As Sandro mentioned in a later email in this thread, there is much confusion about the distinction among GraphContainer, (RDF Abstract) Graph, and GraphSerialization.

> 
> I'm guessing this may be related to a similar issue with digital signatures over RDF graphs.

I must admit, I'm not familiar with the RDF graph digital signature work.
If you could point me to something that I should be considering, please let me know.
I am concerned that if accounts are described at a concrete level (location, serialization), then it will become overwhelmingly difficult for a consumer to use them without also requiring the provenance of the account assertion file as it floats through the web.

For example, you say in /srv/account.rdf that you promise the stuff in /srv/claims.rdf is true.
A triple in /srv/claims.rdf changes and gets published to http://graham.name/promises/iswear.rdf

I need to know what you know, so I grab your account assertion, which mentions "/srv/claims.rdf". 
How does that relate to /srv/stuff-from-graham/iswear.ttl, the copy I retrieved from your published URL?

I could troll through all of that publishing and retrieval provenance (which, btw, needs infrastructure).
OR I could see that /srv/stuff-from-graham/iswear.ttl conveys the same RDF Abstract Graph as the content asserted in the account back across the pond in /srv/account.rdf.
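
To make the comparison above concrete, here is a minimal sketch of naming content by a hash of the abstract graph rather than by its location or serialization. It is only an illustration under loud assumptions: it handles a tiny N-Triples-like subset with ground triples only (no blank nodes, which would need a real canonicalization step), and the function names, example.org URIs, and file contents are all hypothetical, not taken from the wiki page or from PROV-O.

```python
import hashlib


def parse_ground_triples(text):
    """Parse one ground triple per line (tiny N-Triples-like subset).

    Assumption: no blank nodes and no multi-line literals.
    """
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not line.endswith("."):
            raise ValueError("expected a terminating '.': " + line)
        body = line[:-1].strip()
        subject, predicate, obj = body.split(None, 2)
        triples.add((subject, predicate, obj))
    return frozenset(triples)


def content_name(triples):
    """Name a set of triples by hashing a sorted, normalized listing.

    The name depends only on which triples are present, not on their
    order, spacing, or the file they came from.
    """
    canonical = "\n".join(" ".join(t) + " ." for t in sorted(triples))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical content of the file named in the account assertion.
asserted_copy = """
<http://example.org/graham> <http://example.org/promises> <http://example.org/claim1> .
<http://example.org/claim1> <http://example.org/says> "it is true" .
"""

# Hypothetical retrieved copy: different order and spacing, same triples.
retrieved_copy = """
<http://example.org/claim1>   <http://example.org/says>   "it is true" .
<http://example.org/graham> <http://example.org/promises> <http://example.org/claim1> .
"""

# Hypothetical copy in which one triple has changed.
changed_copy = """
<http://example.org/graham> <http://example.org/promises> <http://example.org/claim1> .
<http://example.org/claim1> <http://example.org/says> "it is false" .
"""

asserted_name = content_name(parse_ground_triples(asserted_copy))
retrieved_name = content_name(parse_ground_triples(retrieved_copy))
changed_name = content_name(parse_ground_triples(changed_copy))

print(asserted_name == retrieved_name)  # True: same abstract content
print(asserted_name == changed_name)    # False: the content differs
```

With a name like this in the account assertion, a consumer could check a local copy against the asserted content directly, without reconstructing how the file travelled.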

(or did I just become a poster child for the graph digital signature community :-/ )


>  There has been work to apply such signatures to some canonicalization or abstraction of the graph, but I don't see the necessity.  In the real world, when one signs a document, one signs a *particular rendering* of the document, and said signature can be used as evidence for agreement to the abstract content of same.

Absolutely.

> 
> I see something similar applying to account graph assertions:  if a user asserts an account graph, they assert a *particular instance* (or maybe several) of that graph.  If one trusts that user, then one may license inferences based on the abstract content of the graph,

If one trusts that user *and* one trusts that the graph they are querying is the graph asserted by those they trust.
In most applications, there will be a gap between those two. And I am scared that what happens in that gap is fully trusted.


> and by extension inferences based on semantically equivalent graph instances, but that's a separate issue IMO.
> 
> Why do I care about this?  I think that the essential nature of using named graphs to control the scope of what provenance accounts are actually being asserted (or treated as asserted for some purposes of provenance analysis) is confused and muddied by the discussion of different graph instances and abstract graph content.


If we can assume that we're not using the web, and we only use our system, then I agree with you.
The current use of named graphs -- the writeup I think you're asking for -- does not account for provenance beyond the poor convention that the sd:name cites the URL from which its content was received.

I think it's our job as prov-wg to address these more open and dynamic situations.

-Tim



> 
> #g
> --
> 
> PS: I don't know if it's at all relevant, but I made some personal notes a long time ago about issues around using contexts for scoping assertions:
> 
>  http://www.ninebynine.org/RDFNotes/UsingContextsWithRDF.html
> 
> (It's kind of dated now; I use the term "formulae", from Notation3, to mean roughly what we mean by named graphs.)
> 
> 
> On 05/01/2012 03:35, Timothy Lebo wrote:
>> prov-wg,
>> 
>> I have been working on some discussion [1] that is relevant to modeling Accounts in PROV-O.
>> 
>> It is incomplete, but I think ready for some initial feedback.
>> 
>> Modeling accounts is on the agenda for tomorrow's telecon [2], so I hope this can provide some discussion material.
>> 
>> Regards,
>> Tim
>> 
>> [1] http://www.w3.org/2011/prov/wiki/Using_graphs_to_model_Accounts
>> [2] http://www.w3.org/2011/prov/wiki/Meetings:Telecon2012.01.05
>> 
>> 
>> 
>
Received on Thursday, 5 January 2012 15:58:40 UTC