- From: Eric Roman <ericroman@google.com>
- Date: Mon, 14 Nov 2016 12:16:25 -0800
- To: Artem Skoretskiy <tonn81@gmail.com>
- Cc: "public-webcrypto@w3.org" <public-webcrypto@w3.org>
- Message-ID: <CAFswn4n-unCg6NbCzfWhJy2euTLR56+J5sf-JyjjSB7ZUF2ShQ@mail.gmail.com>
There have been alternate proposals for addressing this with Streams
(https://github.com/w3c/webcrypto/issues/73), which was tabled as a possible
feature for a next version.

Note that past revisions of the Web Crypto spec used an API similar to the one
you are proposing. See:
https://www.w3.org/TR/2012/WD-WebCryptoAPI-20120913/#Crypto-method-createDigester

So this approach was at least considered; however, I believe it was abandoned
in favor of the simpler one-shot + Promise approach (looking towards Streams to
possibly address the multi-part use case in the future).

I do agree with you that for certain applications the asynchronous (and
one-shot) interface, for SHA digests in particular, is inconvenient or
impractical.

On Sat, Nov 5, 2016 at 6:45 AM, Artem Skoretskiy <tonn81@gmail.com> wrote:
> Dear W3C group,
>
> I have feedback regarding your "digest" method:
> https://w3c.github.io/webcrypto/Overview.html#SubtleCrypto-method-digest
>
> It is great that we can have native hash calculation in the browser.
> However, some parts are missing to make it usable.
>
> At the moment, you must pass the complete buffer into the digest method, e.g.:
>
> window.crypto.subtle.digest('SHA-1',
>     new TextEncoder("utf-8").encode('Hello world!')).then(function(digest){
>   console.log(digest);
> })
>
> That is completely fine as long as your content is small. Once you start to
> deal with content that is gigabytes or terabytes in size, you are stuck.
>
> With the current implementation you need to read the complete content into
> RAM, which makes heavy use of memory and also puts a limit on the size of
> content you can handle.
>
> Yes, usually all the content is in RAM already, but there are several cases
> when it is not:
>
> - a file (selected by a user)
> - content that is generated on the fly, e.g. a PDF or ZIP
>
> In my scenario I am hashing user files before uploading them to a cloud (so
> that we don't upload already-uploaded files). With the current standard I
> cannot handle big files, e.g. 3 GB in size.
>
> I would propose making digest iterative, so you could generate the hash in
> chunks and keep RAM usage low.
>
> For example:
>
> var hash = new window.crypto.subtle.digest('SHA-1');
> hash.update(new TextEncoder("utf-8").encode('Hello'));
> hash.update(new TextEncoder("utf-8").encode(' world!'));
>
> hash.digest().then(function(digest){
>   console.log(digest);
> });
>
> That is pretty common practice for hashing in modern languages. E.g. in
> Python:
>
> import hashlib
>
> digest = hashlib.sha1()
> digest.update(b'Hello')
> digest.update(b' world!')
> print(digest.hexdigest())
>
> d3486ae9136e7856bc42212385ea797094475802
>
> That would solve my use case (I would generate the hash in chunks) and reduce
> the memory footprint in other scenarios with big content.
>
> Alternatively, you could allow providing a File / Blob as input in addition to
> a buffer. Browsers would then need to implement efficient reading and hashing
> by chunks. For me as a developer that would be easier, but less flexible.
>
> var file = new File([""], "filename");
> window.crypto.subtle.digest('SHA-1', file).then(function(digest){
>   console.log(digest);
> });
>
> I hope that this change takes place in a future revision, to make
> cryptographic hashing a first-class citizen in browsers.
>
> --
> Truly yours,
> Artem Skoretskiy
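For reference, a minimal sketch of what the current one-shot API requires when
the input is a user-selected File: the entire file has to be read into a single
ArrayBuffer (here via FileReader) before digest() can be called, which is
exactly the memory cost described above. The function name is illustrative.

function sha1OfFile(file) {
  // file: a File obtained e.g. from an <input type="file"> element
  return new Promise(function (resolve, reject) {
    var reader = new FileReader();
    reader.onerror = reject;
    reader.onload = function () {
      // The whole file is now in RAM as one ArrayBuffer.
      resolve(window.crypto.subtle.digest('SHA-1', reader.result));
    };
    reader.readAsArrayBuffer(file);
  });
}

sha1OfFile(file).then(function (digest) {
  console.log(digest); // ArrayBuffer holding the 20-byte SHA-1 digest
});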
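And a hypothetical sketch of how the proposed update()/digest() interface could
be driven over a large File slice by slice, so that only one chunk is resident
in memory at a time. The incremental interface itself, the readSlice helper,
and the 8 MiB chunk size are assumptions for illustration only; none of this is
part of the current spec.

// Hypothetical: assumes the proposed incremental digest interface existed.
function readSlice(blob) {
  return new Promise(function (resolve, reject) {
    var reader = new FileReader();
    reader.onerror = reject;
    reader.onload = function () { resolve(reader.result); };
    reader.readAsArrayBuffer(blob);
  });
}

function sha1OfFileInChunks(file) {
  var CHUNK = 8 * 1024 * 1024; // 8 MiB per slice; an arbitrary choice
  var hash = new window.crypto.subtle.digest('SHA-1'); // proposed API, not real
  var offset = 0;

  function step() {
    if (offset >= file.size) {
      return hash.digest(); // Promise resolving to the final hash
    }
    var slice = file.slice(offset, offset + CHUNK);
    offset += CHUNK;
    return readSlice(slice).then(function (buffer) {
      hash.update(buffer); // only this slice is held in memory
      return step();
    });
  }

  return step();
}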
Received on Monday, 14 November 2016 20:16:59 UTC