Re: Is MD5 based URIs unique? from Sergey Melnik on 1999-12-19 (www-rdf-interest@w3.org from December 1999)

From: Sergey Melnik <melnik@DB.Stanford.EDU>
Date: Sat, 18 Dec 1999 20:08:39 -0800
To: Jonas Liljegren <jonas@paranormal.o.se>
CC: RDF Interest Group <www-rdf-interest@w3.org>
Message-ID: <385C5A47.2F70825E@db.stanford.edu>

Jonas Liljegren wrote:

> ...
> If 128 bits seems over and well beyond what is required today. Maybe we
> should use 1024 bits? Maybe there is a better solution?

I think it is not possible to find a finite number N of bits that would
satisfy a user who is paranoid about security. For practical purposes,
the N that is needed grows as technology evolves.

So the only hope is that we will be able to smoothly migrate from N to
N+M over time.

I thought MD5 was a good starting point given its convenient number of
bits (128 = 64*2). However, you are right that we should not start
with a compromised technology: it is known how to algorithmically "crack"
MD5. This is not true for SHA-1, which uses 160 = 64*2 + 32 bits. Taking
your long-term concerns into consideration, it makes sense to start with
SHA-1 (an implementation of the algorithm is included in the standard
Java distribution). I modified the API distribution [1] to make SHA-1 the
standard algorithm.
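
To make the bit counts above concrete, here is a minimal sketch that asks
the standard java.security.MessageDigest class for the digest length of
each algorithm. The class name DigestSizes and the helper bits() are
made up for this illustration and are not part of the API distribution [1].

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
final class DigestSizes {

    /** Returns the digest length in bits for the named algorithm. */
    static int bits(String algorithm) {
        try {
            // getDigestLength() reports the length in bytes.
            return MessageDigest.getInstance(algorithm).getDigestLength() * 8;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unsupported algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("MD5:   " + bits("MD5") + " bits");   // 128
        System.out.println("SHA-1: " + bits("SHA-1") + " bits"); // 160
    }
}
```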

The current hash-based approach has two features:

(1) The set of algorithms currently used to generate hashes for
anonymous resources, reified statements and models is independent of
the underlying hashing algorithm (and of its number of bits).
(2) The one-way hashes make no sense without the underlying data.

Thus, it will always be possible to recompute the digests upon a
transition to a new algorithm.
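
A rough sketch of why the two features above make migration possible:
if the identifier is always derived from the underlying data and the
algorithm is just a parameter, the same data can be re-digested under a
new algorithm at any time. The class HashUriSketch, the method
digestUri(), the "urn:example:" prefix and the sample string are all
assumptions made for this illustration; they do not reproduce the naming
scheme actually used in the API distribution [1].

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical sketch, not the code from the API distribution.
final class HashUriSketch {

    /**
     * Derives an identifier from the data alone. The hashing algorithm is
     * a parameter, so nothing else here depends on it or on its bit count.
     */
    static String digestUri(String algorithm, String data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unsupported algorithm: " + algorithm, e);
        }
        byte[] digest = md.digest(data.getBytes(StandardCharsets.UTF_8));

        // Hex-encode the digest, two lowercase hex digits per byte.
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        // The "urn:example:" scheme is invented for this sketch.
        return "urn:example:" + algorithm.toLowerCase(Locale.ROOT) + ":" + hex;
    }

    public static void main(String[] args) {
        // Stand-in for the serialized form of a statement or model.
        String data = "subject predicate object";

        // Same data, two algorithms: moving from one to the other only
        // requires recomputing the digest from the data that is still held.
        System.out.println(digestUri("MD5", data));
        System.out.println(digestUri("SHA-1", data));
    }
}
```

Because the digest is useless without the data it was computed from,
whoever holds an identifier also holds what is needed to compute its
successor under a stronger algorithm.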

Sergey

[1] http://www-db.stanford.edu/~melnik/rdf/api.html

Received on Saturday, 18 December 1999 23:02:47 UTC