# Security of Digest Algorithm

From: McBride, Brian <bwm@hplb.hpl.hp.com>
Date: Wed, 10 May 2000 15:32:53 +0100
Message-ID: <5E13A1874524D411A876006008CD059F239227@0-mail-1.hpl.hp.com>
To: "RDF Interest (E-mail)" <www-rdf-interest@w3.org>
```I said recently I'd try to get some help on the security of the proposed
model digest algorithm.

I'm indebted to one of my colleagues in hut 5 who was the first, of several
that I asked,  to point out a solution to this question.

The bottom line is it's not secure. For a digest of k bits, given any set S
of k+1 statements, it is simple to compute a non-empty subset of S whose
digests will XOR to zero.  This would enable them to be added to a model
without changing its digest.

My mathematical colleague says:

Suppose we are given a set of n binary vectors of length k, where n
> k.
Then there are least n-k non-empty subsets of the vectors whose
combined
XOR is 0, and they can be found efficiently using a modification of
Gaussian elimination as described on p.425 of [*]. In particular,
amongst
any set of 161 binary vectors of length 160 this algorithm will find
a
non-empty subset whose combined XOR is 0.

[*] D.E. Knuth, Seminumerical Algorithms, The Art of Computer
Programming

Here is an informal explanation of why.  We'll do this for 160 bit digests,
since SHA-1 generates digests of that size.  If you have 161 statements, you
can think of this as a set of 160 linear equations in 161 variables.
Because there are more variables than equations, there has to be at least
one non-trivial solution.  That solution can be found by solving the set
of the equations by a variation on the technique we all learned in high
school.  Here's how to construct the equations:

Pick a random set of 161 statements S = {s0, s1, ... s160} and compute the
set of their digests D = {d0, d1, ... d160}.  We want to find a subset of S
whose digests XOR to 0 i.e. we want to find bits x0, x1, ... x160 such that:

d0&x0 ^ d1&x1 ^ d2&x2 ... ^ d160&x160 = 0

i.e. xi = 1 if s1 is in the subset and 0 otherwise.  Thinking just about the
bit 0 column of each of the di, then each bit, anded with the appropriate
xi, must XOR to zero.  More formally, with d0:0 is the 0 bit of d0 etc,
then:

d0:0&x0 ^ d1:0&x1 ^ d2:0&x2 ... ^ d160:0&x160 = 0

The same is true of the bit 1 column, the bit 2 column etc up to the last
column, so we have a system of simultanious equations in the xi:

d0:0&x0 ^ d1:0&x1 ^ d2:0&x2 ... ^ d160:0&x160 = 0
d0:1&x0 ^ d1:1&x1 ^ d2:1&x2 ... ^ d160:1&x160 = 0
...
...
d0:159&x0 ^ d1:159&x1 ... ^ d160:159&x160 = 0

Brian McBride (with a lot of help)
```
Received on Wednesday, 10 May 2000 10:33:03 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 7 January 2015 15:07:29 UTC