Re: md5sum and sha1sum functions

From: Steve Harris <steve.harris@garlik.com>
Date: Mon, 6 Dec 2010 12:44:25 +0000
Cc: Paul Gearon <gearon@ieee.org>, SPARQL Working Group <public-rdf-dawg@w3.org>
Message-Id: <30F6D440-C372-4AD9-9177-9C30F1B67C11@garlik.com>
To: Andy Seaborne <andy.seaborne@epimorphics.com>
If these are going to return a simple literal containing hex characters, rather than some 128 / 160 / 256 bit integer datatype, then I'd prefer MD5_HEX() etc.

I marginally prefer named functions, e.g. SHA256_HEX(?x), rather than SHA_HEX(?x, 256). Length might not be enough to distinguish all algorithms on its own, so we could end up with some odd cases.
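
To illustrate the "odd cases", here is a minimal Python sketch. It assumes Python's standard hashlib module; the three algorithm names are picked purely for illustration and only sha256 is mentioned in this thread. They all produce 256-bit digests, so a length argument alone could not select between them.

```python
import hashlib

# Illustrative candidates only: sha3_256 and blake2s are not proposed in the
# thread, they just happen to share a 256-bit output length with sha256.
CANDIDATES = ["sha256", "sha3_256", "blake2s"]


def digest_bits(name):
    """Return the digest size in bits for a hashlib algorithm name."""
    return hashlib.new(name).digest_size * 8


def algorithms_with_length(bits):
    """List every candidate algorithm whose digest is `bits` bits long."""
    return [name for name in CANDIDATES if digest_bits(name) == bits]


# A hypothetical SHA_HEX(?x, 256) style call would have to pick one of these.
matches = algorithms_with_length(256)
for name in matches:
    print(name, hashlib.new(name, b"abc").hexdigest())
```

With named functions the choice is explicit in the function name, so the ambiguity never arises.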

BTW, hexadecimal SHA1 values are 40 characters long.
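
For reference, a quick way to check the hex lengths of all the algorithms under discussion. This is a sketch using Python's hashlib; the helper name hex_digest and the choice of UTF-8 for encoding the string are assumptions, since the thread has not settled how the input is turned into bytes.

```python
import hashlib

# The algorithms named in this thread.
ALGORITHMS = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]


def hex_digest(name, text):
    """Hash `text` with the named algorithm and return lowercase hex.

    Assumption: the string is encoded as UTF-8 before hashing.
    """
    return hashlib.new(name, text.encode("utf-8")).hexdigest()


# Each hex character carries 4 bits, so hex length = digest bits / 4.
hex_lengths = {name: len(hex_digest(name, "abc")) for name in ALGORITHMS}
for name, length in hex_lengths.items():
    print(f"{name}: {length} hex characters ({length * 4} bits)")
```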

- Steve

On 2010-12-06, at 12:25, Andy Seaborne wrote:

> I agree with Sandro that we should have sha1, sha224, sha256, sha384 and sha512.
> 
> Whether they are named or take a length parameter (for certain fixed values only), I don't much mind.  Does anyone want the ability to switch at runtime on a per-call basis? Having both sha256(s) and sha(s, len) is also possible.
> 
> FYI: Apache Commons Codec does not have sha224.  Searching, I find that sha224 was added in Feb 2004 and is a truncated variant of SHA-256.
> 
> 
> On 03/12/10 23:04, Paul Gearon wrote:
>> As discussed in the last teleconf, I would like to propose the inclusion
>> of an "md5sum" function, in a similar fashion to MySQL.
> 
> Fine tuning: Just MD5() and SHA1()?
> 
> md5sum is the name of a program that generates md5 checksums.
> 
> (I know FOAF uses mbox_sha1sum but it also has the experimental foaf:sha1 for documents).
> 
>> MD5SUM is often used for storing passwords. SHA1SUM is used in a
>> similar way, and is also used for hashing email addresses in FOAF.
>> 
>> ---
>> 
>> MD5SUM
>> 
>> The MD5SUM function accepts a single plain literal argument and
>> returns a simple literal containing a string of exactly 32 characters.
>> Each character represents a hexadecimal digit and is one of [0-9a-f].
> 
> Is plain literal the right choice here?
> 
> Either of
> 
>  simple literal
>  simple literal+xsd:string
> 
> make more sense to me
> 
> The case of plain+lang seems to me to be a bad choice as the checksum does not include the language tag.
> 
> 	Andy
> ...
> 
>> 
>> ?r
>> --
>> "f96b697d7cb7938d525a2f31aaf161d0"
> 
> ?r => ?m
> 
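
On Andy's point about argument types, a rough Python sketch of the behaviour he suggests: accept simple literals and xsd:string literals, and reject language-tagged ones, because the tag would not be part of the checksum. The Literal class and the md5_hex helper are hypothetical names used only for illustration, and UTF-8 encoding of the lexical form is an assumption.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal (not a real library type)."""
    lexical: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


def md5_hex(lit):
    """Return the MD5 of the lexical form as 32 lowercase hex characters."""
    if lit.lang is not None:
        # The language tag would be silently dropped, so treat it as an error.
        raise TypeError("language-tagged literals are not accepted")
    if lit.datatype not in (None, XSD_STRING):
        raise TypeError("only simple literals and xsd:string are accepted")
    # Assumption: the lexical form is hashed as UTF-8 bytes.
    return hashlib.md5(lit.lexical.encode("utf-8")).hexdigest()


print(md5_hex(Literal("abc")))
print(md5_hex(Literal("abc", datatype=XSD_STRING)))
```

Under this sketch a simple literal and the same string typed as xsd:string hash to the same value, while "abc"@en raises an error instead of returning a checksum that ignores the tag.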

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Monday, 6 December 2010 12:45:00 GMT
