
Are MD5-based URIs unique?

From: Jonas Liljegren <jonas@paranormal.o.se>
Date: Sat, 18 Dec 1999 13:03:59 +0100
Message-ID: <385B782F.9544A107@paranormal.o.se>
To: Sergey Melnik <melnik@DB.Stanford.EDU>, RDF Intrest Group <www-rdf-interest@w3.org>
Sergey Melnik wrote:
> 
> In the implementation of an RDF API [1] the following algorithm is used:
...
> whereas d(x) is the MD5 (128 bit ;) hash of x.

The MD5 solution does solve the problem of giving equal URIs to equal
triples. And the URIs are not too long.
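
To make that property concrete, here is a minimal Python sketch. It is
not the algorithm quoted above (that part is elided in this message).
The triple_uri helper, the NUL-byte separator and the urn:md5: prefix
are all assumptions made only for illustration.

```python
import hashlib

def triple_uri(subject: str, predicate: str, obj: str) -> str:
    """Hypothetical helper: derive a URI from the MD5 digest of a triple."""
    # Join the parts with a NUL byte so ("ab", "c") and ("a", "bc") differ.
    data = "\x00".join((subject, predicate, obj)).encode("utf-8")
    # Assumed URI form: a made-up prefix followed by 32 hex digits.
    return "urn:md5:" + hashlib.md5(data).hexdigest()

# Example values are invented for this sketch.
a = triple_uri("http://example.org/doc", "http://example.org/author", "Jonas")
b = triple_uri("http://example.org/doc", "http://example.org/author", "Jonas")
c = triple_uri("http://example.org/doc", "http://example.org/author", "Sergey")

print(a == b)   # equal triples map to the same URI
print(a == c)   # different triples are expected, not guaranteed, to differ
print(len(a))   # the length is fixed regardless of the triple's size
```

Equal input always gives an equal URI of fixed length. What the sketch
cannot show is the reverse: that different input always gives a
different URI.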

But I don't like that they are not guaranteed to be unique.


It's one thing to use an MD5 digest as a checksum. In those cases, you
would use the digest paired with another unique identifier, like the
file name (and origin URI). It's like how you use passwords.
Passwords don't have to be unique, but that is because they are
always used with a username that is guaranteed to be unique.


There is a great difference between using a checksum to verify
just one object and trusting it to be globally unique.

Those generated URIs would have to be unique across the whole
Internet.  It doesn't feel right that you would have to take the chance
that you COULD mix up two different triples.


I could be wrong. But a 128-bit number would be something between 1
and 340,000,000,000,000,000,000,000,000,000,000,000,000 (about
3.4e38).  That seems a lot today. But what about twenty years from
now? What if everything were in RDF triples? The global net could
total up to a very large number. Maybe a net of 1e9 devices with 1e15
triples each. Maybe much more.
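
A rough way to put numbers on this is the standard birthday
approximation. The Python sketch below assumes the digests behave like
independent, uniformly random 128-bit values, which is an idealisation
and not a property of MD5 that I can vouch for. The function names are
made up for this example.

```python
import math

def expected_collisions(n_items: float, bits: int = 128) -> float:
    """Approximate expected number of colliding pairs among n_items digests."""
    # There are n*(n-1)/2 pairs; each collides with probability 2**-bits.
    return n_items * (n_items - 1) / 2 / 2.0 ** bits

def collision_probability(n_items: float, bits: int = 128) -> float:
    """Birthday approximation: P(at least one collision) ~ 1 - exp(-expected)."""
    # expm1 keeps precision when the expected count is tiny.
    return -math.expm1(-expected_collisions(n_items, bits))

# 1e24 corresponds to 1e9 devices holding 1e15 triples each.
for n in (1e9, 1e15, 1e24):
    print(f"{n:.0e} triples: "
          f"expected collisions ~ {expected_collisions(n):.3g}, "
          f"P(any collision) ~ {collision_probability(n):.3g}")
```

Under those assumptions, 1e15 triples give a chance of roughly one in a
billion that any two collide, while 1e24 triples would be expected to
contain on the order of 1e9 colliding pairs.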

What if every machine on the planet depended on there never being a
mix-up of triples? With 340 billion comparisons per second, there
could be one mix-up every second.

Now, even if there is only a very small possibility of a mix-up, would
we still take the risk if it were a matter of a nuclear power plant, a
space shuttle, or maybe some AI deciding on the type of surgery for a
patient?



Even if 128 bits seems well beyond what is required today, maybe we
should use 1024 bits? Maybe there is a better solution?
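
One way to size the digest is to turn the birthday approximation
around and ask how many bits keep the collision chance under a chosen
limit. The Python sketch below does that. The bits_needed name, the
1e24 triple count and the 1e-18 limit are assumptions picked only as an
example, and the result is still a probability, not a guarantee.

```python
import math

def bits_needed(n_items: float, max_probability: float) -> int:
    """Smallest digest width (in bits) for which the birthday bound
    n^2 / 2^(bits + 1) stays at or below max_probability."""
    # Solve n^2 / (2 * 2**bits) <= p for bits, then round up.
    return math.ceil(2 * math.log2(n_items) - 1 - math.log2(max_probability))

# Hypothetical target: 1e24 triples, collision chance at most 1e-18.
width = bits_needed(1e24, 1e-18)
print(width)
```

For this example the answer is well under 256 bits, so a digest far
shorter than 1024 bits would already meet such a target, as long as the
digests really behave like random values.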


-- 
/ Jonas  -  http://paranormal.o.se/perl/proj/rdf/schema_editor/
Received on Saturday, 18 December 1999 07:04:38 GMT
