RE: Strawman proposal for the low-level API from Vijay Bharadwaj on 2012-06-30 (public-webcrypto@w3.org from June 2012)

From: Vijay Bharadwaj <Vijay.Bharadwaj@microsoft.com>
Date: Sat, 30 Jun 2012 07:14:23 +0000
To: Ryan Sleevi <sleevi@google.com>, David Dahl <ddahl@mozilla.com>
CC: "public-webcrypto@w3.org" <public-webcrypto@w3.org>
Message-ID: <382AD43736BB12439240773F68E9077397B1AB@DF-M14-24.exchange.corp.microsoft.com>
Ryan,

Once again, apologies for the delay in responding to this.

This is a very good start. Some thoughts/questions:


1.       Why does result have to be a scalar type? It would seem more natural to have result be an object - that way you can return things like the ciphertext and tag for AEAD separately without mandating a specific serialization format in this low-level API. I am not advocating going as far as the high-level interface of SJCL which makes the result fully self-describing by including algorithm, parameters, IV, etc. (though that has its advantages) but just enough to avoid artificially collapsing multiple outputs into a single scalar.

2.       How does one support streaming mode? E.g. some media streaming protocols use AES-CTR. Were you thinking that implementations would use onprogress where feasible?

3.       I'm not thrilled about the DOMString variant of processData. As you've noted, this is underspecified, which leads to all kinds of subtle issues. For instance a lot of REST protocols appear to use UTF-8 or base64. I'd rather just specify the interface in terms of binary blobs unless there is a real improvement in usability. In this case I wonder if it's worth the effort to accept DOMString.

4.       I see where you are going with the algorithm parameters structure, but I think it really needs to be broken into two distinct parts. As it stands, this structure conflates algorithm choices such as tag length and hash with per-operation quantities such as IV and nonce. The former are relevant to algorithm discovery; the latter should not be.

5.       You've opted for a "message in" signature interface instead of a "hash in" interface. This certainly seems reasonable, but I will tell you (from experience maintaining CAPI, which uses message-in, and CNG, which uses hash-in) that people are always coming up with use cases that require hash-in for low-level APIs. Often this is in cases where the computation of the hash is temporally or spatially separated from the actual signing operation.
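
To make points 1 and 4 a bit more concrete, here is a rough sketch rather than a proposal for the actual API. It uses Node.js's built-in crypto module purely as a stand-in for whatever the user agent would provide. The function names (encryptGcm, decryptGcm), the algorithm / perOperation split, and the { ciphertext, tag } result shape are all my own illustrative assumptions, and tagLength is taken to be in bytes here.

```javascript
const crypto = require("crypto");

// Assumption: choices that describe the algorithm itself (relevant to discovery).
const algorithm = { name: "aes-128-gcm", tagLength: 16 };

// Assumption: per-operation quantities (IV, additional data) travel separately.
function encryptGcm(algorithm, key, perOperation, plaintext) {
  const cipher = crypto.createCipheriv(algorithm.name, key, perOperation.iv, {
    authTagLength: algorithm.tagLength,
  });
  if (perOperation.additional) {
    cipher.setAAD(perOperation.additional);
  }
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  // Return the outputs as separate fields instead of one serialized scalar.
  return { ciphertext, tag: cipher.getAuthTag() };
}

function decryptGcm(algorithm, key, perOperation, result) {
  const decipher = crypto.createDecipheriv(algorithm.name, key, perOperation.iv, {
    authTagLength: algorithm.tagLength,
  });
  if (perOperation.additional) {
    decipher.setAAD(perOperation.additional);
  }
  decipher.setAuthTag(result.tag);
  // final() throws if the tag does not verify.
  return Buffer.concat([decipher.update(result.ciphertext), decipher.final()]);
}

const key = crypto.randomBytes(16);
const perOperation = {
  iv: crypto.randomBytes(12),
  additional: Buffer.from("header"),
};
const result = encryptGcm(algorithm, key, perOperation, Buffer.from("hello"));
const roundTrip = decryptGcm(algorithm, key, perOperation, result);
```

With this shape the caller decides how (or whether) to concatenate ciphertext and tag on the wire, and only the first object would need to be considered for algorithm discovery.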

From: Ryan Sleevi [mailto:sleevi@google.com]
Sent: Wednesday, June 20, 2012 1:17 PM
To: David Dahl
Cc: public-webcrypto@w3.org
Subject: Re: Strawman proposal for the low-level API


On Wed, Jun 20, 2012 at 12:42 PM, David Dahl <ddahl@mozilla.com<mailto:ddahl@mozilla.com>> wrote:
----- Original Message -----
> From: "Ryan Sleevi" <sleevi@google.com<mailto:sleevi@google.com>>
> To: public-webcrypto@w3.org<mailto:public-webcrypto@w3.org>
> Sent: Monday, June 18, 2012 12:53:03 PM
> Subject: Strawman proposal for the low-level API
>
> Hi all,
>
>    While I'm still in the process of learning WebIDL [1] and the W3C
>    Manual
> of Style [2], I wanted to take a quick shot at drafting a strawman
> low-level API for discussion.
This is great, thanks for taking the time.

>
> First, a bit of the IDL definition, to set the stage. This is also
> using ArrayBuffer from TypedArray [6], which I'm not sure if it's
> altogether appropriate, but it's been incorporated by reference into
> FileAPI [7], so it seems alright to use here.
>
I think so. ArrayBuffers seem a natural fit for this API.

> [interface]
> interface CryptoStream : EventTarget {
>   void processData(ArrayBuffer buffer);
>   void processData(DOMString data);
The flexibility of accepting either a string or ArrayBuffer is a good idea, with an internal, seamless conversion.

I'm not sure whether it should be a literal ArrayBuffer or if it should be an ArrayBufferView. In looking at more specs, I suspect the latter is actually more correct.

Well, no, it's not necessarily a seamless conversion :-) DOMString is UTF-16, so the conversion into a byte sequence is problematic if underspecified (eg: as I've unfortunately done here)

Representation of binary data via DOMString is a known problematic area (eg: see WHATWG's work on StringEncoding via the TextEncoder/TextDecoder interface).

Within the W3C, I understand this is part of ongoing discussions in public-webapps.
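
As a small illustration of why the conversion is ambiguous (a sketch only, not something the strawman specifies): the same string yields different byte sequences depending on which interpretation an implementation picks. The two interpretations below, UTF-8 and raw UTF-16 code units in little-endian order, are just two of the plausible choices.

```javascript
const text = "h\u00e9llo"; // "héllo": five UTF-16 code units

// Interpretation 1: encode the string as UTF-8.
const utf8 = new TextEncoder().encode(text);

// Interpretation 2: emit each UTF-16 code unit as two little-endian bytes.
const utf16le = new Uint8Array(text.length * 2);
const view = new DataView(utf16le.buffer);
for (let i = 0; i < text.length; i++) {
  view.setUint16(i * 2, text.charCodeAt(i), true);
}

console.log(utf8.length, utf16le.length); // 6 10
```

A digest or signature computed over one of these byte sequences will not match one computed over the other, which is why leaving the conversion unspecified is a problem.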


>   void complete();
>
>   readonly attribute (DOMString or ArrayBuffer)? result;
>
>   attribute [TreatNonCallableAsNull] Function? onerror;
>   attribute [TreatNonCallableAsNull] Function? onprogress;
>   attribute [TreatNonCallableAsNull] Function? oncomplete;
> };
>
> dictionary AlgorithmParams {
> };
>
> dictionary Algorithm {
>   DOMString name;
>   AlgorithmParams? params;
> };
>
> [NoInterfaceObject]
> interface Crypto {
>   CryptoStream encrypt(Algorithm algorithm, Key key);
>   CryptoStream decrypt(Algorithm algorithm, Key key);
>
>   // Also handles MACs
>   CryptoStream sign(Algorithm algorithm, Key key);
>   CryptoStream verify(Algorithm algorithm, Key key, ArrayBuffer
>   signature);
>
>   CryptoStream digest(Algorithm algorithm);
>
>   // This interface TBD. See discussion below.
>   boolean supports(Algorithm algorithm, optional Key key);
>
>   // Interfaces for key derivation/generation TBD.
> };
>
>
> As you can see, CryptoStream is used for all of the actual crypto
> operations. That's because, in looking at the operations, I think all
> of
> them will work on a series of calls to provide input, and the result
> of
> which is either: error, some data output, or operation complete.
>
> The real challenge, I think, lies in the AlgorithmParams structure,
> which
> is where all of the algorithm-specific magic happens. My belief is
> that we
> can/should be able to define this API independent of any specific
> AlgorithmParams - that is, we can define the generic state machine,
> error
> handling, discovery. Then, as a supplemental work (still within the
> scope
> of the primary goal), we define and enumerate how exactly specific
> algorithms are implemented within this state machine.
>
> To show how different AlgorithmParams might be implemented, here's
> some
> various definitions:
>
> // For the 'RSA-PSS' algorithm.
> dictionary RsaPssParams : AlgorithmParams {
>   // The hashing function to apply to the message (eg: SHA1).
>   AlgorithmParams hash;
>   // The mask generation function (eg: MGF1-SHA1)
>    AlgorithmParams mgf;
>   // The desired length of the random salt.
>   unsigned long saltLength;
> };
>
> // For the 'RSA-OAEP' algorithm.
> dictionary RsaOaepParams : AlgorithmParams {
>   // The hash function to apply to the message (eg: SHA1).
>    AlgorithmParams hash;
>   // The mask generation function (eg: MGF1-SHA1).
>    AlgorithmParams mgf;
>   // The optional label/application data to associate with the
>   // signature.
>   DOMString? label = null;
> };
>
> // For the 'AES-GCM' algorithm.
> dictionary AesGcmParams : AlgorithmParams {
>   ArrayBufferView? iv;
>   ArrayBufferView? additional;
>   unsigned long tagLength;
> };
>
> // For the 'AES-CCM' algorithm.
> dictionary AesCcmParams : AlgorithmParams {
>   ArrayBufferView? nonce;
>   ArrayBufferView? additional;
>   unsigned long macLength;
> };
>
> // For the 'HMAC' algorithm.
> dictionary HmacParams : AlgorithmParams {
>   // The hash function to use (eg: SHA1).
>   AlgorithmParams hash;
> };
>
>
> The API behaviour is this:
> - If encrypt/decrypt/sign/verify/digest is called with an unsupported
> algorithm, throw InvalidAlgorithmError.
> - If " is called with an invalid key, throw InvalidKeyError.
> - If " is called with an invalid key/algorithm combination, throw
> UnsupportedAlgorithmError.
> - Otherwise, return a CryptoStream.
>
> For encrypt/decrypt
> - The caller calls processData() as data is available.
> - If the data can be en/decrypted, it will raise an onprogress event
> (event
> type TBD).
>   - If new (plaintext, ciphertext) data is available, .result will be
> updated. [This is similar to the FileStream API behaviour]
> - If the data cannot be en/decrypted, raise the onerror with an
> appropriate
> error
> - The caller calls .complete() once all data has been processed.
>   - If the final block validates (eg: no padding errors), call
>   onprogress
> then oncomplete.
>   - If the final block does not validate, call onerror with an
>   appropriate
> error.
>
> For authenticated encryption modes, for example, the .result may not
> contain any data until .complete has been called (with the result
> data).
>
> For sign/verify, it behaves similarly.
> - The caller calls processData() as data is available.
> - [No onprogress is called/needs to be called?]
> - The caller calls .complete() once all data has been processed
> - For sign, once .complete() is called, the signature is generated,
> and
> either onprogress+oncomplete or onerror is called. If successful, the
> resultant signature is in .result.
> - For verify, once .complete() is called, the signature is compared,
> and
> either onprogress+oncomplete or onerror is called. If the signatures
> successfully matched, .result will contain the input signature (eg:
> the
> constant-time comparison happens within the library). If the
> signatures
> don't match, .result will be null and the error handler will have
> been
> called.
>
> Finally, for digesting, it behaves like .sign/.verify in that no data
> is
> available until .complete() is called, and once .complete() is called,
> the
> resultant digest is in .result.
The final result of any of these operations would have all result data passed into the oncomplete event handler, correct?

No. The oncomplete event handler follows the DOMCore event handling semantics. Since I didn't define a custom event type (eg: one that would curry the result), it would be expected that callers obtain the result via evt.target.result. evt.target is bound to an EventTarget, which the CryptoStream inherits from, and is naturally the target of the events it raises.

This is shown in the pseudo-code example of how evt.target.result is read.

But yes, for all successful operations (eg: no onerror callback), evt.target.result contains the data available. In the case of operations which yield "good" or "bad" (eg: MAC & Signature verification), the .result contains the verified data.
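
For what it's worth, the event flow described above can be mocked up along these lines. This is only a sketch under several assumptions: DigestStream is a hypothetical stand-in for a digest-only CryptoStream, it uses addEventListener with an event named "complete" rather than the oncomplete attribute, it exposes .result as a Uint8Array, and it borrows Node.js's createHash in place of a user agent implementation.

```javascript
const { createHash } = require("crypto");

// Hypothetical digest-only stand-in for the strawman's CryptoStream.
class DigestStream extends EventTarget {
  constructor(hashName) {
    super();
    this.hash = createHash(hashName);
    this.result = null; // stays null until complete() is called
  }

  // Feed input incrementally, as data becomes available.
  processData(bytes) {
    this.hash.update(bytes);
  }

  // Finish the operation, publish the result, then raise the event.
  complete() {
    this.result = new Uint8Array(this.hash.digest());
    this.dispatchEvent(new Event("complete"));
  }
}

let seen = null;
const stream = new DigestStream("sha256");
stream.addEventListener("complete", (evt) => {
  // No custom event type: the handler reads the result off evt.target.
  seen = Buffer.from(evt.target.result).toString("hex");
});

const encoder = new TextEncoder();
stream.processData(encoder.encode("a"));
stream.processData(encoder.encode("bc"));
stream.complete();
```

The handler receives a plain event and reaches the digest through evt.target.result, which is the same pattern the pseudo-code example relies on.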

Note that I didn't spec Verify+Recovery, since I'm still mulling that one over, but if implemented, I would imagine verifyRecover would presumably have the recovered PT (rather than the original signature) in .result.



>
> What I haven't fully worked out is how key derivation/agreement will
> work -
> particularly if some form of key agreement results in
> multiple keys (eg: how SSL/TLS key derivation works in PKCS#11). This
> is
> somewhat dependent on how we treat keys.
>
> Note that I left the Key type unspecified. It's not clear if this
> will be
> something like (Key or DOMString), indicating some either/or of
> handle /
> id, if it might be a dictionary type (with different naming
> specifiers,
> such as 'id' or 'uuid'), or if it will be a concrete type obtained
> via some
> other call (eg: .queryKeys()). I think that will be borne out over
> the next
> week or two as we continue to discuss key management/lifecycle.
>
> For a pseudo-code example:
>
> var stream = window.crypto.sign(
>   { name: 'RSA-PSS',
>     params: { hash: { name: 'sha1' }, mgf: { name: 'mgf-sha1' },
>               saltLength: 32 } },
>   key);
> stream.oncomplete = function(evt) {
>   window.alert('The signature is ' + evt.target.result);
> };
> stream.onerror = function(evt) {
>   window.alert('Signing caused an error: ' + evt.error);
> };
>
> var filereader = new FileReader();
> filereader.onload = function(evt) {
>   stream.processData(evt.target.result);
>   stream.complete();
> };
> filereader.readAsArrayBuffer(someFile);
>
>
> The FileAPI is probably not the best example of why the iterative API
> (.processData() + .complete()) is used, since FileReader has the
> FileReader.result containing all of the processed data, but it's
> simpler
> than demonstrating a streaming operation that may be using WebSockets
> [8]
> or PeerConnection [9].
>
> Note that I think during the process of algorithm specification, we
> can
> probably get away with also defining well-known shorthand. eg:
> 'RSA-PSS-SHA256' would mean that the hash is SHA-256, the mgf is
> MGF1-SHA256, and only the saltLength needs to be specified (or should
> it be
> implied?)
Since this is a low-level API, perhaps we imply a sensible default, with the ability to override for properties like saltLength?

I think "sensible default" is actually quite appropriate for high-level, but not for low-level.

One of my biggest concerns with "sensible default" is that, once spec'd, you cannot ever change the defaults. This creates potential problems when we talk about deprecating or removing support for algorithms.

This is why I proposed the short-hand notation as an algorithm name, rather than being default/optional values on the Dictionary type.

For example, rewind ten years (or let's go 20, to be fair), and a sensible default for RSA signatures would be RSA-PKCSv1.5 + MD5 for the message digest function.

So applications get written using window.crypto.sign({'name': 'RSA' }, RsaKey);

Now, as we move forward in time, we discover that MD5 isn't all that great, and really people should be using SHA-1. However, we cannot change the default for { 'name': 'RSA' }, because that would be a semantic break for all applications expecting it to mean MD5.

Further, as we continue moving forward in time, we discover that PKCSv1.5 isn't all that great, and PSS is much better. However, again, we cannot change the defaults, because it would break existing applications.

The result of being unable to change the defaults is that when /new/ applications are written, because they're not required to specify values (eg: there are defaults), they don't. The result is that new applications can end up using insecure mechanisms without ever being aware of it. By forcing the app developer to consider their parameters, whether explicitly via AlgorithmParams or implicitly via the algorithm 'name', it at least encourages 'best practice' whenever a new application is being written.

The argument for default arguments is very compelling - there is no doubt about it. The less boiler plate, arguably the better. However, for a low-level API, particularly one whose functionality is inherently security relevant, defaults tend to end up on the short-end of the security stick over time, and that does more harm than good.

That's why I provided the 'escape hatch' of defaults by using the algorithm name as a short-hand for the more tedious AlgorithmParams portion.
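
To sketch what that short-hand might look like in practice (purely illustrative: the SHORTHANDS table, the expandAlgorithm helper, and the exact parameter names are assumptions, not part of the strawman), the name alone selects a fixed, fully specified parameter set, and anything not implied by the name still has to come from the caller:

```javascript
// Hypothetical shorthand table: each name pins down explicit parameters.
const SHORTHANDS = new Map([
  [
    "RSA-PSS-SHA256",
    {
      name: "RSA-PSS",
      params: { hash: { name: "SHA-256" }, mgf: { name: "MGF1-SHA256" } },
    },
  ],
]);

// Expand a shorthand name into the long form; other names pass through unchanged.
function expandAlgorithm(algorithm) {
  const preset = SHORTHANDS.get(algorithm.name);
  if (!preset) {
    return algorithm;
  }
  // Caller-supplied values (e.g. saltLength) are layered on top of the preset.
  return {
    name: preset.name,
    params: { ...preset.params, ...(algorithm.params || {}) },
  };
}

const expanded = expandAlgorithm({
  name: "RSA-PSS-SHA256",
  params: { saltLength: 32 },
});
```

Because the meaning of 'RSA-PSS-SHA256' never changes, a future 'RSA-PSS-SHA3' (or whatever comes next) can be added as a new name without silently altering what existing applications asked for.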


>
> Anyways, hopefully this straw-man is able to spark some discussion,
> and
> hopefully if it's not fatally flawed, I'll be able to finish adapting
> it to
> the W3C template for proper and ongoing discussions.
>
I like what you have here. I think this interface is elegant in the central concept of the CryptoStream being able to handle any operation possible for the algorithm. This interface is simpler to work with than my proposal.

As Wan-Teh said in the meeting this week, we should figure out how key generation works, what the structure of the key handle is, and what extracted key data properties look like.

With the Algorithm and its AlgorithmParams are we headed down the path of maintaining a cipher suite for this API?

So, as I mentioned on the phone and the preamble, I think as a WG we can/should first focus on defining the practical parts of the API - eg: without defining any ciphers (MUST or SHOULD) - and what the semantic behaviours are for consumers of this API.

Following that, I think the WG can supplementally extend the spec to talk about different algorithms, modes, etc - eg: AES, RSA, HMAC, etc.

I think part of this is pragmatic - while we can talk about all the 'popular' suites of today (AES, RSA, SHA-1/2), there are no guarantees they'll be secure 'tomorrow' (ECC, SHA-3, SomeNewKDF), and putting those as MUST-IMPLEMENT imposes a real security cost going forward. Further, as we look across the space of devices that might implement this API - from beefy desktops, to resource constrained mobile devices, to game consoles, to who knows what - it seems we must also recognize that the ability to reasonably support some algorithms is simply not going to exist. Whether this is being unable to support AES in cipher-text-stealing mode or DSA/DSA2, or a lack of support for ECC or MD5, mandating algorithms won't do much to help adoption of the core API, I believe.

Yes, extensibility carries risks - vendor-specific encryption schemes may be added that aren't implemented by other user agents. However, this risk exists with just about any generic and usable web API defined by the W3C - we've seen it with custom HTML tags, custom CSS prefixes, <video> and <audio> algorithm support. I see the Algorithm/AlgorithmParams operating within that same space - something that can be (independently) standardized without changing this core API.

Note: I do think the Core API can define sensible values for the algorithms we know/care about, I just don't think it's a function of the core API to dictate what must be implemented, just how it will behave if it is implemented.

Cheers,
Ryan


Thanks again for putting this together, I think we should begin nailing down the hand wavy 'Keys' for this proposal.


Regards,

David
Received on Saturday, 30 June 2012 07:15:52 UTC