- From: William Waites <ww@styx.org>
- Date: Thu, 23 Sep 2010 12:44:10 +0100
- To: public-xg-prov@w3.org
- Message-ID: <4C9B3D8A.30409@styx.org>
Hi all,

I was looking at the list archives and I found a discussion back in December [0] about where access time belongs. There seemed to be some argument that because the access/retrieval operation ostensibly did not change the data, it shouldn't be treated as a Process in the OPMV sense.

Recently, Ed Summers was doing some analysis and making some pretty pictures from the billion triple challenge data [1], and as it turned out the data he retrieved was corrupted. Worse, this wasn't noticed until later, after he had already transformed and summarised it (i.e. applied further processes).

In this context I would like to make these points:

* Even where the process is expected to make an identity transformation (i.e. no change), it might actually change the data unexpectedly.

* The access *time* doesn't seem as relevant here as a checksum or hash would be.

* Even if the retrieval is successful, with no errors, I think it still makes sense to treat it as a process that makes an identity transformation, if only to have a place to record things like hashes and access times and to confirm that it happened successfully.

The idea of a checksum or hash might be useful as well in Requirement 1.1, constructing a handle or token that represents a piece of data. If this were done simply, i.e. just hash(data), it would mean that if two people had the same data they would know it by comparing the hashes. Is this approach feasible?

Harder, but potentially interesting, is the case where the data in question is RDF. Can we come up with a serialisation-independent algorithm for representing a graph? Or put another way, can we find a function f such that f(g1) = f(g2) iff g1 and g2 are equivalent? Some transformation to ground form, imposing a lexical ordering when serialising, treating blank nodes and variables specially? (Further question: what about nested graphs?)
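To make the hash(data) idea concrete, here is a rough sketch of what I have in mind, covering only the easy cases. Everything named in it is an assumption for illustration: the function names data_handle, ground_graph_handle and verify_retrieval, the choice of SHA-256, and the example.org terms. The graph function only handles ground graphs (no blank nodes, no variables, no nested graphs) and assumes each term is already written in a normalised N-Triples form. It sorts the triples lexically and hashes the result, so it does not answer the harder blank node question.

```python
import hashlib


def data_handle(data: bytes) -> str:
    """Return a hex SHA-256 digest to use as a handle for a piece of data."""
    return hashlib.sha256(data).hexdigest()


def verify_retrieval(data: bytes, expected_handle: str) -> bool:
    """Check that a retrieval really was an identity transformation."""
    return data_handle(data) == expected_handle


def ground_graph_handle(triples) -> str:
    """Hash a ground RDF graph independently of triple order and duplicates.

    Each triple is a tuple of three strings already in N-Triples term
    syntax. Blank nodes are NOT handled: their labels are arbitrary, so
    they would need a separate canonical labelling step.
    """
    # A graph is a set of triples, so drop duplicates, then impose a
    # lexical ordering before serialising.
    lines = sorted({f"{s} {p} {o} ." for (s, p, o) in triples})
    canonical = "\n".join(lines) + "\n"
    return data_handle(canonical.encode("utf-8"))


if __name__ == "__main__":
    a = "<http://example.org/a>"
    b = "<http://example.org/b>"
    p = "<http://example.org/p>"

    g1 = [(a, p, b), (b, p, a)]
    g2 = [(b, p, a), (a, p, b), (a, p, b)]  # reordered, with a duplicate

    print(ground_graph_handle(g1) == ground_graph_handle(g2))

    original = b"some retrieved bytes"
    handle = data_handle(original)
    print(verify_retrieval(original, handle))
    print(verify_retrieval(original + b"\x00", handle))
```

The retrieval process could record the handle alongside the access time, and any later process could call something like verify_retrieval before doing further work. Two people holding the same bytes, or the same ground graph in different serialisations, would arrive at the same handle.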
Cheers,
-w

[0] http://lists.w3.org/Archives/Public/public-xg-prov/2009Dec/0002.html
[1] http://lists.w3.org/Archives/Public/public-esw-thes/2010Sep/0009.html

--
William Waites <ww@styx.org>
Mob: +44 789 798 9965
Fax: +44 131 464 4948
CD70 0498 8AE4 36EA 1CD7 281C 427A 3F36 2130 E9F5
Received on Thursday, 23 September 2010 11:45:59 UTC