RE: AES-GCM Algorithm is missing tag properties for input params and result value from Vijay Bharadwaj on 2013-05-13 (public-webcrypto@w3.org from May 2013)

From: Vijay Bharadwaj <Vijay.Bharadwaj@microsoft.com>
Date: Mon, 13 May 2013 05:55:54 +0000
To: Ryan Sleevi <sleevi@google.com>, Mike Jones <Michael.Jones@microsoft.com>
CC: Marcin Stankiewicz <Marcin.Stankiewicz@microsoft.com>, "public-webcrypto@w3.org" <public-webcrypto@w3.org>, Israel Hilerio <israelh@microsoft.com>
Message-ID: <f277574cc64442ef89c80f4a7688c2ea@DFM-CO1MBX15-08.exchange.corp.microsoft.com>
I see this as two distinct issues.

1. Keeping the tag separate from the ciphertext instead of concatenating them as RFC 5116 does. I'm in favor of separating these because if you don't then you are effectively designing a serialization format and tying yourself to it - and it's really an information-poor serialization format with implied field lengths and no explicit delimiters. As far as I can tell, RFC 5116 is a hack for one specific purpose - you have an API or protocol that allows algorithm extensibility but does not understand the concept of authenticated encryption, so you create an authenticated encryption plugin in this way.

2. Deciding how flexible we want to be with respect to parameter selection in GCM. I'm perfectly fine with imposing reasonable constraints here - supply all AAD before the message for encryption, restrict tag length to 12 bytes, and so on, if we can't find a use case for other parameter sets.

#1 is different from #2 because #1 applies to all AEAD algorithms and not just GCM. It's unclear what constraining ourselves in the RFC 5116 way buys us. Note that an implementation which returns ciphertext and tag separately can be built on top of an underlying API that uses an RFC 5116 style approach - that is actually better since we let the UA implementation handle the quirks of parsing a particular implementation's serializations instead of tossing these issues in the lap of the web developer.
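
To make the layering concrete, here is a minimal TypeScript sketch of how a UA could expose separate ciphertext and tag on top of a backend that only hands back the RFC 5116-style concatenation. The names `AeadParts`, `splitAeadOutput` and `joinAeadInput` are hypothetical and do not come from any draft; the sketch also assumes the tag is a suffix of the ciphertext and that the caller knows the tag length, since the combined bytes carry no delimiter.

```typescript
// Hypothetical helper names; not part of any WebCrypto draft.
interface AeadParts {
  ciphertext: Uint8Array;
  tag: Uint8Array;
}

// Split an RFC 5116-style output (ciphertext || tag) into its two parts.
// The tag length cannot be recovered from the bytes, so the caller supplies it.
function splitAeadOutput(combined: Uint8Array, tagLengthBytes: number): AeadParts {
  if (
    !Number.isInteger(tagLengthBytes) ||
    tagLengthBytes < 0 ||
    tagLengthBytes > combined.length
  ) {
    throw new RangeError("tag length does not fit the combined output");
  }
  const boundary = combined.length - tagLengthBytes;
  return {
    ciphertext: combined.slice(0, boundary),
    tag: combined.slice(boundary),
  };
}

// Rebuild the combined form before handing a decrypt request to the same backend.
function joinAeadInput(parts: AeadParts): Uint8Array {
  const combined = new Uint8Array(parts.ciphertext.length + parts.tag.length);
  combined.set(parts.ciphertext, 0);
  combined.set(parts.tag, parts.ciphertext.length);
  return combined;
}
```

The point of the sketch is that the split and join live in one place inside the UA, where the tag length is already known from the algorithm parameters, rather than in every web application.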

-----Original Message-----
From: Ryan Sleevi [mailto:sleevi@google.com] 
Sent: Sunday, May 12, 2013 6:09 PM
To: Mike Jones
Cc: Marcin Stankiewicz; public-webcrypto@w3.org; Israel Hilerio
Subject: Re: AES-GCM Algorithm is missing tag properties for input params and result value

On Sun, May 12, 2013 at 5:55 PM, Mike Jones <Michael.Jones@microsoft.com> wrote:
> Even if we make assumptions about the IV and Tag sizes, I still 
> believe we'll have a cleaner, more general interface if we keep the 
> inputs and outputs separate, rather than assuming that they're 
> concatenated in a particular manner.

I don't want to belabor the point, since I don't feel strongly about RFC 5116, but the whole point of RFC 5116 concatenating the inputs is precisely to make a cleaner, more general interface that can support both authenticated and non-authenticated encryption.

We can already see the complexity that arises from treating the low-level inputs as distinct. Further, several APIs that predate AAD (e.g. most deployed APIs), including PKCS#11, do not lend themselves well to supporting multiple distinct inputs and outputs.

Now, we can debate about whether such APIs are good APIs or not, and whether or not we should follow those same paths, but I am just looking at it pragmatically from the implementation side, to make sure we're not spec'ing something that cannot be implemented beyond a single API family.

>
> And the generality of inputs and outputs comes from RFC 4106 - not CNG.

Except 4106 is just mirroring the GCM spec, which is really where this arises from. And as we've seen - with our discussions of PKCS#1 v1.5 vs PKCS#1 v2.1 vs APIs that actually exist, the specs tend to generalize their parameters for a variety of cases, whereas implementations often focus on specific aspects. Whether this be the fact that GCM itself is not directly tied to AES (after all, it's just a block mode of encryption...) or whether it be the variable nature of MGFs vs Salt functions, it's not at all uncommon to see many more inputs and outputs that exist in pure spec form than exist in actual API form.

We can go down the route of spec'ing it as dictionary inputs/outputs, but I'm not really sure it argues as a 'general' case.

Since those reading may not be fully familiar with my concerns with this approach, let me try to elaborate further:

- What should the inputs to .process() be?
  * An ArrayBuffer(View) of ciphertext?
    - Does this mean the AAD must be fully specified up front?
  * A dictionary of Ciphertext & AAD
    - Is ciphertext optional?
    - Is AAD optional?

APIs such as CNG support progressive updates of the AAD and ciphertext in their streaming mode, while APIs such as PKCS#11 require the full AAD be specified *before the algorithm can even be initialized*. This is especially relevant when considering algorithm detection from the underlying implementation.
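
As a rough illustration of the mismatch, the TypeScript sketch below buffers AAD chunks until the first message bytes arrive, so that a backend which needs the full AAD at initialization could sit behind a progressive interface. The class `UpFrontAadBuffer` and its methods are hypothetical, and the rule that no AAD is accepted once message data has started is an assumption of the sketch, not something either API or the spec mandates.

```typescript
// Hypothetical adapter; all names here are illustrative only.
class UpFrontAadBuffer {
  private aadChunks: Uint8Array[] = [];
  private dataStarted = false;

  // Accept AAD progressively, but only until the first message bytes arrive.
  addAad(chunk: Uint8Array): void {
    if (this.dataStarted) {
      throw new Error("AAD must be supplied before any message data");
    }
    this.aadChunks.push(chunk.slice()); // copy so later caller mutations do not leak in
  }

  // Called once when message data starts: returns the complete AAD so that a
  // backend requiring it at initialization time can be set up.
  finishAad(): Uint8Array {
    this.dataStarted = true;
    const total = this.aadChunks.reduce((n, c) => n + c.length, 0);
    const aad = new Uint8Array(total);
    let offset = 0;
    for (const chunk of this.aadChunks) {
      aad.set(chunk, offset);
      offset += chunk.length;
    }
    return aad;
  }
}
```

Note what this costs: the backend cannot even be initialized until the caller signals that the AAD is complete, which is the kind of constraint the interface would have to state explicitly.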

The realities of dealing with "legacy APIs" weigh particularly heavily on cryptographic APIs, since the entire goal for browsers is, as much as possible, to *not* implement any of this within the user agent. This is because in many countries, crypto is still treated as a munition or is export controlled, so the less crypto a browser has (and the more it can defer to external libraries), the better. So the realities of the APIs we have today - as imperfect as they are - continue to be a constraint on the APIs we design for tomorrow.

>
> -- Mike
>
> ________________________________
> From: Ryan Sleevi
> Sent: 5/12/2013 5:19 PM
>
> To: Mike Jones
> Cc: Marcin Stankiewicz; public-webcrypto@w3.org; Israel Hilerio
> Subject: Re: AES-GCM Algorithm is missing tag properties for input 
> params and result value
>
> On Fri, May 10, 2013 at 12:32 AM, Mike Jones 
> <Michael.Jones@microsoft.com> wrote:
>> JOSE rejected using RFC 5116 encodings because the library 
>> implementations of the existing authenticated encryption algorithms 
>> almost always provide separate inputs for the key, initialization 
>> vector, and plaintext, and separate outputs for the ciphertext and 
>> authentication tag.  It's therefore a more natural mapping from 
>> existing crypto libraries to the JOSE encoding by not combining any 
>> of the inputs or outputs, as RFC 5116 proposes to do - making life 
>> easier for implementers.
>>
>
> Interesting. I'm only aware of CNG doing this. Other APIs - including 
> the industry standard PKCS#11 - provide the RFC 5116 style interface.
>
> That's not to say we can't do things differently from 5116, but it's 
> important to understand one way or the other when we make these 
> decisions, for consistency's sake.
>
>>
>>
>> Also, by not making assumptions about how long the IV and tag values 
>> are (which is required if you're going to use RFC 5116), we can 
>> support the full generality of the legal inputs and outputs of GCM 
>> and other authenticated encryption algorithms.  Otherwise, you have 
>> to, for instance, fix the IV to
>> 96 bits and the tag to 128 bits, which is a fine set of defaults, but 
>> not always how the algorithms are used in practice.  WebCrypto, being 
>> a low-level API, should support the general form, where the sizes of 
>> the IV and tag are chosen by the caller, rather than an RFC 5116 
>> specialization of the algorithms.
>
> That's interesting that you mention the IV vs nonce in the same 
> message as talking about looking at other libraries.
>
> The vast majority of implementations that I've examined are hardcoded 
> with assumptions about the IV and nonce and do not allow applications 
> to customize options according to the full range of GCM.
>
> Is this perhaps another example of a CNG-specific aspect creeping in 
> to discussions?
>
> Similar to past discussions about MGF customization for PSS/OAEP, I'm 
> not confident we're going to be able to support distinct IV sizes in a 
> way that will be interoperable between many (existing) libraries. So 
> we have to choose whether to spec something optimistically (that won't 
> likely get implemented for several years) or to spec it pragmatically, 
> knowing it's not as "full featured" (but knowing no real protocols 
> make use of that...)
>
> Truncated nonces are easier - the caller always supplies the length 
> anyways - so it also doesn't seem to have any of the ambiguity you 
> suggest would exist.
>
>>
>>
>>
>>                                              -- Mike
>>
>>
>>
>> From: Ryan Sleevi [mailto:sleevi@google.com]
>> Sent: Thursday, May 09, 2013 10:33 PM
>> To: Mike Jones
>> Cc: Marcin Stankiewicz; public-webcrypto@w3.org; Israel Hilerio
>> Subject: RE: AES-GCM Algorithm is missing tag properties for input 
>> params and result value
>>
>>
>>
>> Mike,
>>
>> Can you explain why?
>>
>> On May 9, 2013 10:09 PM, "Mike Jones" <Michael.Jones@microsoft.com> wrote:
>>
>> JOSE decided to keep the tag separate and not use RFC 5116. I think 
>> we should too.
>>
>> -- Mike
>>
>> ________________________________
>>
>> From: Ryan Sleevi
>> Sent: 5/9/2013 5:36 PM
>> To: Israel Hilerio
>> Cc: public-webcrypto@w3.org; Marcin Stankiewicz
>> Subject: Re: AES-GCM Algorithm is missing tag properties for input 
>> params and  result value
>>
>> On Thu, May 9, 2013 at 5:13 PM, Israel Hilerio 
>> <israelh@microsoft.com>
>> wrote:
>>> Another piece of feedback we provided during the F2F was that the 
>>> input parameters for AES-GCM need a tag property and the result 
>>> value for AES-GCM needs to contain a tag parameter in addition to 
>>> the ciphertext.
>>>
>>> This implies that we'll need to update the AesGcmParams dictionary 
>>> in the following way:
>>>
>>> dictionary AesGcmParams : Algorithm {
>>>   // The initialization vector to use. May be up to 2^56 bytes long.
>>>   ArrayBufferView? iv;
>>>   // The additional authentication data to include.
>>>   ArrayBufferView? additionalData;
>>>   // The desired length of the authentication tag. May be 0 - 128.
>>>   [EnforceRange] octet? tagLength;
>>>
>>>
>>>   // The authentication tag value for decryption
>>>   ArrayBufferView? tag;
>>>
>>>
>>> };
>>
>> If we go this route, I would rather see two dictionaries, so that it 
>> can be unambiguous that tag is required for decrypt, but not for 
>> encrypt.
>>
>>>
>>> These changes will impact section 20.12.3 of the spec.
>>>
>>> In addition, since we can't return dictionaries I was thinking we 
>>> could add a new interface:
>>>
>>> interface AesGcmResult {
>>>         readonly attribute ArrayBuffer? tag;
>>>         readonly attribute ArrayBuffer ciphertext;
>>> };
>>>
>>> This will impact the result values of section 20.12 AES-GCM 
>>> registration table. We'll have to return this new interface for encrypt only.
>>> Alternatively, we could return it also for decrypt with the tag 
>>> value being null or set from the input parameter.
>>
>> Agreed on this affecting encrypt only.
>>
>>>
>>> Let us know what you think.
>>>
>>> Thanks,
>>>
>>> Israel
>>>
>>>
>>
>> I seem to recall some discussion/debate about whether or not it 
>> should follow the mode of RFC5116/PKCS#11 and treat the tag as a 
>> suffix of the ciphertext. I believe Richard was a proponent of this, 
>> and that certainly reflects the 'intent' of the current spec 
>> (although, as we both noted - presently woefully underspecified)
>>
>> The benefit of that is similar to the benefits expounded upon in
>> RFC5116 - it gives AEAD encryption the same interface as 
>> non-authenticated encryption, at least in terms of inputs/outputs.
>>
>> I'm curious for the arguments for splitting them up, since we did not 
>> have sufficient time during the F2F to discuss this point. Is there a 
>> technical advantage, or is this for syntactic purposes?
Received on Monday, 13 May 2013 05:56:59 UTC