Re: Streaming - [Re: CryptoOperation and its life cycle] from Aymeric Vitte on 2012-12-13 (public-webcrypto-comments@w3.org from December 2012)

From: Aymeric Vitte <vitteaymeric@gmail.com>
Date: Thu, 13 Dec 2012 23:27:29 +0100
To: Ryan Sleevi <sleevi@google.com>
CC: public-webcrypto-comments@w3.org
Message-ID: <50CA5651.6050009@gmail.com>
What I would like to do does not seem easy (one-shot operations are of 
course easier), but it is not a marginal case: it has been requested many 
times on some projects (node.js), and a living example is 
https://github.com/Ayms/node-Tor (both for hash and encryption).

The proposed solution is :

var h1 = window.crypto.digest("SHA1");
var h2 = window.crypto.digest("SHA1");
h1.process(stream1);
h2.process(stream1);
h2.process(stream2);
h1.finish();
h2.finish();

This is not a streaming solution, and I would not promote clones 
either.
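
As a rough illustration only, the two-object workaround quoted above can 
be reproduced with the Node.js crypto module. This is a sketch under 
assumptions: it uses Node's createHash/update/digest rather than the 
WebCrypto draft API, and the contents of stream1 and stream2 are made-up 
placeholders.

```javascript
const crypto = require("crypto");

// Hypothetical stand-ins for the chunks called stream1 and stream2 above.
const stream1 = Buffer.from("first chunk");
const stream2 = Buffer.from("second chunk");

// Two independent hash objects, as in the workaround: stream1 is fed twice.
const h1 = crypto.createHash("sha1");
const h2 = crypto.createHash("sha1");
h1.update(stream1);
h2.update(stream1);
h2.update(stream2);

const digest1 = h1.digest("hex");  // H(stream1)
const digest12 = h2.digest("hex"); // H(stream1 + stream2)

// One-shot reference value over the concatenated data, for comparison.
const oneShot = crypto
  .createHash("sha1")
  .update(Buffer.concat([stream1, stream2]))
  .digest("hex");

console.log(digest1, digest12, digest12 === oneShot);
```

The cost is visible here: stream1 has to be kept around and hashed twice, 
which is why this does not count as streaming.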

But the promise style (which I believe just complicates the proposal a 
lot, although almost everybody seems to agree that it is the right way to 
spec things) does not make clear what happens to the list of pending 
data when different process methods are invoked successively. At first I 
wrote:

H.process(stream1);
H.process(stream2);
H.onprogress=function() { console.log(this.result)};

Then I thought that it was completely wrong (is stream2 processed by 
H.process(stream1) ?)

Unlike other APIs, you don't have any update method, but why not simply 
do the following when the process method ends:

- return the digest of the process method
- put the remaining blocks (i.e. the ones that would not have been 
processed by an update method) in the pending data as the oldest ones
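
To make the idea concrete, here is a minimal sketch of that behaviour, 
again with the Node.js crypto module and not the WebCrypto draft. The 
RunningDigest class is a hypothetical helper invented for this example. It 
relies on hash.copy() (Node.js 13.1 or later), and the unprocessed 
trailing block lives inside the Hash object's internal state instead of 
an explicit list of pending data.

```javascript
const crypto = require("crypto");

// Hypothetical helper, not part of any spec: every process() call returns
// the digest of all data seen so far, and the state stays usable afterwards.
class RunningDigest {
  constructor(algorithm) {
    this.hash = crypto.createHash(algorithm);
  }

  process(chunk) {
    this.hash.update(chunk);
    // digest() finalizes a Hash object, so finalize a copy instead and
    // keep the original open for further chunks.
    return this.hash.copy().digest("hex");
  }
}

const H = new RunningDigest("sha1");
const res1 = H.process(Buffer.from("abc")); // digest of "abc"
const res2 = H.process(Buffer.from("def")); // digest of "abcdef"
console.log(res1, res2);
```

Note that this still uses a clone internally, so it only shows the 
calling pattern I have in mind, not how an implementation should do it.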

But still I don't know what would fire onprogress exactly...

Again, I might be misreading, the intention here is to move forward, not 
to complicate things



On 13/12/2012 19:49, Ryan Sleevi wrote:
> On Thu, Dec 13, 2012 at 2:16 AM, Aymeric Vitte <vitteaymeric@gmail.com> wrote:
>>> onprogress follows the Progress Events model, in which the client is
>>> informed of progress. There is always at least one onprogress event
>>> (which may be due to the final completion of the data), and there is
>>> always zero or one oncomplete events. At the oncomplete event firing,
>>> all of the data is available in result.
>>>
>>> This was raised as a point of concern by Wan-Teh back on June 20th,
>>> but it arguably follows the model of what existing APIs (such as File
>>> API or Streams API) do through their readAsArrayBuffer methods, and
>>> with how XMLHttpRequest makes data available through progress events.
>>>
>>> To be clear: .result contains the data available, and may grow to add
>>> more data, up and until oncomplete is fired.
>>>
>> The specs say : "an interface to support streaming/progressive output has also been requested. How such an interface would be implemented, if at all, remains TBD."
>
> This is talking in particular about accepting Blob objects from the
> File API ( http://dev.w3.org/2006/webapi/FileAPI/ ) **and returning
> Blob**, or accepting Stream objects from the Streams API (
> http://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm ) **and
> returning Stream objects**
>
>>
>> I assume this is related to the case below, current implementations of streaming can do something like :
>>
>> var H=new Hash('sha1');
>> H.update(stream1);
>> var res1=H.digest(stream1);
>> H.update(stream2);//hash stream1+stream2
>> var res2=H.digest(stream2);
>> etc...
>>
>> Which can become with Webcrypto something like :
>>
>> var H=(new Hash('sha1')).digest();
>> H.process(stream);//stream1
>> H.onprogress=function() {
>>      console.log(this.result);
>>      this.process(stream);//stream2
>> };
>>
>>
>> Apparently this will return stream1 hash and stream2 hash (not stream1 hash followed by stream1 + stream2 hash). Probably I am misreading something, because it therefore looks useless to call "process" several times for the same CryptoOperation object, and the list of pending data is only fed by "process", which empties it, so for now it's not really a list (shouldn't the list of pending data be updated when process ends?).
> There is presently **no** requirement that the underlying
> implementation support multi-part operations. The language was worded
> such that a browser implementation MAY synthesize multi-part
> operations under the hood into a single operation.
>
> Syntactically, the call sequence under the current Editor's Draft (
> https://dvcs.w3.org/hg/webcrypto-api/raw-file/f5e8d9a3e18f/spec/Overview.html
> ) is
>
> var h = window.crypto.digest("sha1");
> h.process(stream1);  // MUST be an ArrayBufferView
> h.process(stream2);  // MUST be an ArrayBufferView
> h.finish();
>
> Given that the API ONLY defines SHA-family hashes at present, and that
> there is NO incremental hashing supported by these constructs (since
> the final block contains both padding and the finalized length), it
> makes no sense to return the intermediate parts.
>
> Really, what I think you're asking about is ISSUE-22, which asks whether
> CryptoOperations should be clonable. IF they were (and I presently
> don't think so yet), THEN you would write something like
>
> var h1 = window.crypto.digest("SHA1");
> h1.process(stream1);  // MUST be an ArrayBufferView
> var h2 = h1.clone();
> h1.finish();
> h2.process(stream2);  // MUST be an ArrayBufferView
> h2.finish();
>
> Upon invoking their oncomplete callbacks for h1 and h2, h1.result ==
> H(stream1) and h2.result == H(stream1+stream2);
>
> However, like I said, I am generally opposed to clone methods (not
> structured clone, but explicit clone), in particular when the
> object-being-cloned is an EventTarget, as I think it creates confusion
> about what to do with pending tasks in the HTML Event Loop when the
> object is cloned during a task. Do they get cloned as well? If not,
> it's possible for h2.result == H(stream2), which would not be
> expected.
>
> In short, the only way to do what you're asking today is
>
> var h1 = window.crypto.digest("SHA1");
> var h2 = window.crypto.digest("SHA1");
> h1.process(stream1);
> h2.process(stream1);
> h2.process(stream2);
> h1.finish();
> h2.finish();
>
> As you can see, multi-part operations, streaming, and cloning are
> rather complex issues, whereas 'single shot' operations are much
> clearer:
>
> var h1 = window.crypto.digest("SHA1", [ stream1 ]);
> // no .process() method, no .finish() method. Only digests the data
> // supplied in the .digest call
> var h2 = window.crypto.digest("SHA1", [ stream1, stream2 ]);
> // no .process(), no .finish(). Only digests the data supplied in the
> // .digest call
>
>>
>> PS : typos in the spec
>> 12.1.2.3bis "Remove data from the list of pending data." --> "Remove item from the list of pending data."
>> 19.3.4 "Upon invoking init: " --> what init method ?
>>

-- 
jCore
Email :  avitte@jcore.fr
GitHub : https://www.github.com/Ayms
Web :    www.jcore.fr
Webble : www.webble.it
Extract Widget Mobile : www.extractwidget.com
BlimpMe! : www.blimpme.com
Received on Thursday, 13 December 2012 22:25:08 UTC