Re: Hash functions

On Mon, Dec 20, 2010 at 4:42 PM, Glenn Maynard <glenn@zewt.org> wrote:
> Has a hash functions API been considered, so browsers can expose, for
> example, a native SHA-1 implementation?  Doing this in JS is possible,
> but painfully slow, even with current JS implementations.
>
> Some fairly obvious use cases:
>
>  - Avoid uploading a file to the server if it already has a copy.  For
> example, if you attach a large file to an email, and you already have
> a copy of that file in your mailbox attached to another mail, don't
> upload the whole file; just send a reference to the existing one.
>  - Resumable file uploads.  An implementation of a chunked, resumable
> uploader will want to validate that the file the user is sending is
> actually what's been received by the server so far, and roll back the
> transfer partially or completely if they're out of sync.
>  - Local file validation and updating.  A web-based game may want to
> save large blocks of resources locally, rather than depending on HTTP
> caching to do it, which is inappropriate for a game with several
> hundred megabytes or more of resources.  Native hashing would help
> automatic updating of data.
>
> If there's a more appropriate place for this, let me know.
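
To make the resumable-upload case above concrete, here is a rough sketch in
Python of the check an uploader might run before resuming.  Everything here
is an assumption made for illustration: the function names, the tiny chunk
size, and the idea that the server reports how many bytes it holds plus a
SHA-1 of them.  None of it is a proposed browser API.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


def prefix_digest(data: bytes, length: int) -> str:
    """SHA-1 hex digest of the first `length` bytes, fed in chunks."""
    h = hashlib.sha1()
    for start in range(0, length, CHUNK_SIZE):
        h.update(data[start:min(start + CHUNK_SIZE, length)])
    return h.hexdigest()


def resume_offset(local_file: bytes, server_len: int, server_digest: str) -> int:
    """Return the byte offset to resume from, or 0 to restart the upload."""
    if server_len > len(local_file):
        return 0  # server holds more than we have: not the same file
    if prefix_digest(local_file, server_len) != server_digest:
        return 0  # out of sync: roll the transfer back completely
    return server_len
```

A real uploader would read the file from disk in much larger chunks and
might roll back only partially, but the comparison step would be the same.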

I think this would be pretty useful too.  All the good hash functions
are designed to be run in hardware or tight native code, which means
they're slower than necessary in JavaScript, even ignoring JS's current
difficulty in dealing with raw binary data.

At least two hashes should be exposed - a fast one like SHA-1 for
things like checksumming, and a deliberately slow one like bcrypt for
actual security uses like password hashing.
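
As a rough illustration of the fast/slow split, here is a sketch in Python.
Note the swap: bcrypt isn't in Python's standard library, so PBKDF2 stands in
as the deliberately slow hash.  The function names and the iteration count
are assumptions picked for the example, not recommendations.

```python
import hashlib


def fast_checksum(data: bytes) -> str:
    """Fast hash for integrity checks; SHA-1 as in the example above."""
    return hashlib.sha1(data).hexdigest()


def slow_password_hash(password: bytes, salt: bytes,
                       iterations: int = 200_000) -> str:
    """Deliberately slow, salted hash (PBKDF2 standing in for bcrypt)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations).hex()
```

The slow function takes a salt and a work factor precisely because its job is
to make guessing expensive, which is the opposite of what you want from a
checksum.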

A problem, of course, is that once a browser ships a particular algo,
it can't ever stop shipping it, because even if the author is smart
enough to watch for a changed algo (which most won't), they'll still
need to invalidate all of the currently hashed data or switch back to
a slow software one.  For the same reason, browsers can't ship a
default algorithm (that is, one that you get when you don't explicitly
specify an algorithm), because pretty much all code using the default
will immediately become legacy code that will break if the algo is
switched.
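
One way an author could guard against this is to always name the algorithm
explicitly and store that name next to the digest.  A minimal sketch in
Python follows; the "algo:hexdigest" format and both function names are made
up for illustration.

```python
import hashlib
import hmac


def tagged_digest(algo: str, data: bytes) -> str:
    """Hash `data` with an explicitly named algorithm; there is no default."""
    return f"{algo}:{hashlib.new(algo, data).hexdigest()}"


def verify(tagged: str, data: bytes) -> bool:
    """Check `data` against a stored 'algo:hexdigest' string."""
    algo, _, expected = tagged.partition(":")
    if algo not in hashlib.algorithms_available:
        # The platform dropped this algorithm: fail loudly instead of
        # silently comparing against a digest from a different function.
        raise ValueError(f"hash algorithm no longer available: {algo}")
    actual = hashlib.new(algo, data).hexdigest()
    return hmac.compare_digest(actual, expected)
```

This doesn't remove the need to rehash old data if an algo goes away, but it
does make the breakage detectable rather than silent.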

So, the choice of algos used has to be made very carefully.

~TJ

Received on Tuesday, 21 December 2010 00:58:28 UTC