- From: Sergey Melnik <melnik@db.stanford.edu>
- Date: Mon, 06 Nov 2000 15:52:07 -0800
- To: Steve Dunham <dunham@cse.msu.edu>
- CC: www-rdf-interest@w3.org
Steve,

thanks for your suggestion. I'll most probably include it in the next
distribution of the API. One of the desirable properties that I wanted
a model digest algorithm to have is easy recomputation of the digest
whenever the content of the model changes. This property is not
satisfied when using your suggestion, but it is not absolutely
essential.

I'm thinking of whether the digest algorithm can be made more
efficient. Currently, statement digests are computed as

    d1 = SHA1(s)
    d2 = SHA1(p)
    d3 = SHA1(o)
    if (o instanceof Literal)
        rotate left d3 by 8 bits
    statement_digest = SHA1( concat(d1, d2, d3) )

That is, in the worst case, computation of a model digest involves 4
applications of SHA1 to every statement (2-3 on average), which is
expensive. Maybe one SHA1 call per model is sufficient. One could
concatenate resource URIs/literals in some robust way... Any thoughts
on that?

Sergey

Steve Dunham wrote:
>
> I was reading your page on rdf digests[1], which says that you're
> using an XOR of statement digests as a model digest, and that it isn't
> secure. (For fairly obvious reasons.) And it says the digest is still
> under construction.
>
> For what it's worth, one way to do a reasonably secure hash is to take
> a SHA1 of the concatenation of a sorted list of the statement hashes.
>
> That's: SHA1( concat( sort( statement_hashes )))
>
> It's the first thing that comes to mind. I'm sure there are other
> solutions. (I'm assuming that the only constraint is that the hash
> is independent of statement order.)
>
> Steve
> dunham@cse.msu.edu
> (CC responses to me)
>
> [1] http://www-db.stanford.edu/~melnik/rdf/api.html
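
For illustration, the statement digest and the two model digests discussed in this thread (XOR of statement digests, and SHA1 over the sorted statement digests) can be sketched in Python. This is only a sketch under assumptions, not the code of the RDF API: the function names are hypothetical, subjects, predicates and objects are taken as strings encoded as UTF-8, and "rotate left by 8 bits" is interpreted as rotating the 160-bit digest as one big-endian integer.

```python
import hashlib
from functools import reduce

DIGEST_LEN = 20  # SHA-1 produces 160-bit (20-byte) digests


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def rotate_left(digest: bytes, bits: int) -> bytes:
    # Assumption: the digest is rotated as a single big-endian integer.
    width = len(digest) * 8
    bits %= width
    n = int.from_bytes(digest, "big")
    rotated = ((n << bits) | (n >> (width - bits))) & ((1 << width) - 1)
    return rotated.to_bytes(len(digest), "big")


def statement_digest(s: str, p: str, o: str, o_is_literal: bool) -> bytes:
    # Assumption: URIs and literal values are hashed as UTF-8 bytes.
    d1 = sha1(s.encode("utf-8"))
    d2 = sha1(p.encode("utf-8"))
    d3 = sha1(o.encode("utf-8"))
    if o_is_literal:
        # Distinguishes a literal from a resource with the same string.
        d3 = rotate_left(d3, 8)
    return sha1(d1 + d2 + d3)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def xor_model_digest(statement_digests) -> bytes:
    # Order-independent and cheap to update, but not secure.
    return reduce(xor_bytes, statement_digests, bytes(DIGEST_LEN))


def sorted_model_digest(statement_digests) -> bytes:
    # SHA1( concat( sort( statement_hashes ))): order-independent,
    # but needs the full sorted list whenever the model changes.
    return sha1(b"".join(sorted(statement_digests)))
```

With the XOR variant, adding or removing a statement only requires XORing its digest into the current model digest, which is the "easy recomputation" property mentioned above. The sorted variant gives that up: any change means hashing the whole sorted list of statement digests again.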
Received on Monday, 6 November 2000 18:34:44 UTC