- From: Tim Berners-Lee <timbl@w3.org>
- Date: Fri, 30 Nov 2001 20:09:04 -0500
- To: "Sean B. Palmer" <sean@mysterylights.com>, "Dan Connolly" <connolly@w3.org>
- Cc: <www-archive@w3.org>
Woooo! Yeah, spb! Thank you!

I haven't looked at it yet (distractions yesterday), but comments on your mail follow.

Tim

----- Original Message -----
From: "Sean B. Palmer" <sean@mysterylights.com>
To: "Tim Berners-Lee" <timbl@w3.org>; "Dan Connolly" <connolly@w3.org>
Cc: <www-archive@w3.org>
Sent: Thursday, November 29, 2001 8:25 PM
Subject: Cryptography In CWM: Hashes

> Summary: the beginnings of a cryptography module for CWM, as
> "cwm_crypto.py", with hash-finding built-ins.
>
> I wanted to do signature validation before I shipped this off, but
> since I have to download several hundred packages to do that, I

Several hundred? I hope that is an exaggeration. Which crypto package are you using, I wonder?

> thought I'd just archive this first.

Good move.

> As a test of the module, I used crypto.n3:-
>
> python cwm.py crypto.n3 -think -purge > crypto-out.n3
>
> and the file "test.txt", which simply contains the string "blargh".
> The result is:-
>
> [[[
> @prefix : <#> .
> @prefix crypto: <http://www.w3.org/2000/10/swap/crypto#> .
> @prefix log: <http://www.w3.org/2000/10/swap/log#> .
> @prefix string: <http://www.w3.org/2000/10/swap/string#> .
>
> <file:c:\test.txt> a :GetHashFile;
> :content "blargh";
> :md5 "ef15c9bd4c7836612b1567f4c8396726";
> :sha "d1e670385f40ee942a059f949c761214872ac35f" .
> ]]] - <<crypto-out.n3>>
>
> The files are attached. The most important is <<cwm_crypto.py>>, which
> is the actual module as it stands. I also needed to modify llyn.py to
> register the built-ins, so that is attached too, as <<llyn.py>>.
> <<crypto.n3>> is the test file, and <<crypto-out.n3>> is the output.
> Also attached is the simple <<test.txt>> "blargh" test file. The paths
> should be modified appropriately. cwm_crypto.py was based on one of
> the other built-ins modules: cwm_string.py.
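For anyone wanting to recompute the md5 and sha values in the quoted output, here is a minimal Python sketch. It assumes the modern standard-library hashlib module, which is not necessarily what cwm_crypto.py itself used, and it assumes crypto:sha means SHA-1 (the quoted value is 40 hex digits long). The helper name string_hashes is hypothetical and is not part of cwm.

```python
import hashlib


def string_hashes(text, encoding="utf-8"):
    """Return hex MD5 and SHA-1 digests of a string.

    Hypothetical helper for illustration; not cwm's actual built-in API.
    """
    data = text.encode(encoding)
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha": hashlib.sha1(data).hexdigest(),  # assumption: "sha" is SHA-1
    }


if __name__ == "__main__":
    # Hash the test string mentioned in the mail and print both digests.
    for name, digest in string_hashes("blargh").items():
        print(name, digest)
```

Whether this reproduces the quoted digests depends on the exact bytes of test.txt: a trailing newline in the file, for instance, would change both values.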
Great

> The properties that one can use at the moment are:-
>
> crypto:md5 a rdf:Property; rdfs:label "md5";
> rdfs:comment "The MD5 hash of a string";
> rdfs:domain string:String; rdfs:range string:String .
>
> crypto:sha a daml:UnambiguousProperty,
> daml:UniqueProperty; rdfs:label "sha";
> rdfs:comment "The SHA hash of a string";
> rdfs:domain string:String; rdfs:range string:String .

I notice your higher trust of sha1!

> Upper-case property names are used in the crypto.n3 schema [1] (which
> needs chACL-ing), but I prefer to use lower-case for properties, and
> upper-case (prefixes) for classes, so I changed them.

Agreed.

> To get the hash of a file, you of course have to use log:content on
> it... I did consider just putting in a built-in function that would do
> that for you, but it seems more sensible to deploy one standard
> approach.

I agree. The string is the fundamental thing. If you want to read or write files or something, then that should be separate. It is easy enough to define content hashing functions.

One question, to be purist, is whether the hashing and the base64 encoding should be split. I thought that we would need separate functions to decode and encode base64, because they are not inverse, because the decoding is many to one, and the encoding has to be one-one! The convention for naming them is tricky: base42 and anyBase64? canonicalBase64 and base64?

What happens if we cheat and call base64 a function and a reverse function, and allow it to make an arbitrary choice in the encoding sequence? I say "foo" :base64 :y, meaning "foo" has base64 encoding y, and then it generates :y so that it is *one* of the strings which will make "foo" :base64 :y true. Does this lead to any logical inconsistencies? Yes, I think you get problems when you are, say, checking a set of strings to see which one is the base64 encoding of foo.

{ "foo" :base64 :y . :y a :inputString } log:implies { ... } .

The query engine would generate a string for y, and then check class membership, which would (probably) fail, instead of finding all members of :inputString and then checking each one for being a valid base64 encoding of foo.

I don't like verbs in properties (base64-encode, -decode, -generate etc). Maybe :canonicalBase64, indicating its 1-1 nature, which is the declarative aspect of the encoding function.

> I also considered using a new "hash:" URI scheme to identify
> hashes as first class objects on the Web, but after considering it
> carefully, decided not to.

I think some of those schemes exist. I think you made the right choice. We can add a

{ ( "md5:" [is crypto:md5 of [is log:content of :x]] ) string:concatenation [is log:uri of :y ] } log:implies { :x = :y } .

rule if we need to generate those, I suppose.

> For signatures, I intend to hack on the mxCrypto stuff, but you need
> an incredible amount of stuff in order to install that, so that's on
> the TODO list :-)

I looked at the amkCrypto, which is billed as a temporary fixing fork of mxCrypto. I couldn't figure out in the few minutes I gave it what else I had to do to make it work. It didn't seem too huge - maybe I was missing masses of C code.

It is great that you have taken this on! A todo list shared ....

Tim

> Cheers,
>
> [1] http://www.w3.org/2000/10/swap/crypto.n3
>
> --
> Kindest Regards,
> Sean B. Palmer
> @prefix : <http://webns.net/roughterms/> .
> :Sean :hasHomepage <http://purl.org/net/sbp/> .
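
The base64 concern raised in the reply (decoding is many-to-one, so a built-in that generates one arbitrary encoding can miss valid matches) can be sketched in Python. This assumes the standard-library base64 module with its default lenient decoding; the names canonical_base64, is_base64_of and input_strings are hypothetical illustrations, not cwm built-ins.

```python
import base64


def canonical_base64(text):
    """Function direction: the one canonical encoding of a string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def is_base64_of(candidate, text):
    """Relation direction: does candidate decode to text?

    Several different candidates can pass, because the default decoder
    discards characters outside the base64 alphabet (such as line breaks).
    """
    try:
        return base64.b64decode(candidate) == text.encode("utf-8")
    except ValueError:  # binascii.Error is a ValueError subclass
        return False


# Hypothetical stand-in for the members of :inputString.
# The first two carry MIME-style line breaks; the third encodes "bar".
input_strings = ["Zm9v\n", "Zm\n9v", "YmFy"]

# Strategy 1: generate one encoding for "foo", then test class membership.
generated = canonical_base64("foo")
found_by_generating = [s for s in input_strings if s == generated]

# Strategy 2: enumerate the members, then check each one against "foo".
found_by_checking = [s for s in input_strings if is_base64_of(s, "foo")]

if __name__ == "__main__":
    print("generate-then-test:", found_by_generating)
    print("enumerate-then-check:", found_by_checking)
```

In this sketch the first strategy finds nothing while the second finds two valid encodings, which is the kind of mismatch the mail worries about. Restricting the property to the canonical encoding would make both strategies agree, at the cost of rejecting non-canonical input.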
Received on Saturday, 1 December 2001 20:51:40 UTC