Re: Explanation of deriveKey (was Re: ACTION-22: Key export) from Ryan Sleevi on 2012-08-28 (public-webcrypto@w3.org from August 2012)

From: Ryan Sleevi <sleevi@google.com>
Date: Tue, 28 Aug 2012 12:34:17 -0700
To: Vijay Bharadwaj <Vijay.Bharadwaj@microsoft.com>
Cc: Mark Watson <watsonm@netflix.com>, Mitch Zollinger <mzollinger@netflix.com>, "<public-webcrypto@w3.org>" <public-webcrypto@w3.org>
Message-ID: <CACvaWvYRj75q_xxtw-fY4sXT+40J_rBeddYDA0b_h1aKt3DJ2A@mail.gmail.com>
On Tue, Aug 28, 2012 at 11:58 AM, Vijay Bharadwaj
<Vijay.Bharadwaj@microsoft.com> wrote:
> Actually, I'm not sure this is well specified in the text currently. Neither classification is inconsistent with derivation taking an input key and generation not doing so. I'm certainly not proposing any differently.
>
> However, it would help to know what the deriver and generator return. I have been assuming (perhaps incorrectly) that generate would return a Key object while derivation would return an ArrayBuffer, but I don't think this is explicit in the text (and in fact I did request a clarification on this yesterday - see http://lists.w3.org/Archives/Public/public-webcrypto/2012Aug/0252.html)
>
> So maybe we're both right - I'll wait for Ryan to clarify the intention of the current API draft.

The answer is... I don't have a good answer.

Like I mentioned, the distinction between "generation" and "derivation"
differs from API to API, and no consistent rule seems to hold across
them.

For example, with PBKDF2, you may be deriving a key for a specific
algorithm (eg: an AES or HMAC key), or you may be deriving some opaque
material to be used as a nonce. Likewise, with DH Phase 2 agreement,
you may be using the ZZ directly (perhaps splitting it into send and
receive MAC keys), or you may be performing additional expansion to
yield multiple keys (eg: X9.42)
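
To make the PBKDF2 case concrete, here is a minimal sketch. It uses
Node.js's built-in crypto module purely as a stand-in (it is not the API
under discussion), and the password, salts, iteration count and the
deriveBytes helper are all made up for illustration. The only point is
that the KDF hands back the same kind of output, raw bytes, whether the
caller then treats them as an HMAC key or as opaque material:

```javascript
const crypto = require("node:crypto");

// Hypothetical helper and inputs, for illustration only.
function deriveBytes(password, salt, length) {
  // PBKDF2 just produces `length` bytes; it has no notion of a target algorithm.
  return crypto.pbkdf2Sync(password, salt, 10000, length, "sha256");
}

const password = "example password";

// Case 1: the derived bytes are used as a key for a specific algorithm (HMAC).
const keyBytes = deriveBytes(password, Buffer.from("salt-for-mac-key"), 32);
const hmacKey = crypto.createSecretKey(keyBytes);
const tag = crypto.createHmac("sha256", hmacKey).update("message").digest("hex");

// Case 2: the derived bytes are used as opaque material (e.g. a nonce).
const nonce = deriveBytes(password, Buffer.from("salt-for-nonce"), 16);

console.log(hmacKey.type, tag.length, nonce.length);
```

Nothing in the PBKDF2 call itself differs between the two cases; only
what the caller does with the result.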

If we say the distinction between Generate & Derive is their outputs,
then I expect we'd need to support DH Phase 2 both as a Generate
mechanism (where the peer key needs to be passed in as additional
input to the AlgorithmParameters) and as a Derive method (where the
peer key is passed in as an additional parameter to the method), which
seems odd.
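
For reference, a rough sketch of what DH Phase 2 looks like in practice,
again using Node.js's crypto module as a stand-in rather than the draft
API. The choice of the "modp14" group and the variable names are
assumptions for illustration. Note that the peer's public key is an
input to the Phase 2 call, and that the result here is raw bytes which
the caller splits into send and receive MAC keys:

```javascript
const crypto = require("node:crypto");

// Phase 1: each party generates a DH key pair over the same group.
const alice = crypto.getDiffieHellman("modp14");
const bob = crypto.getDiffieHellman("modp14");
alice.generateKeys();
bob.generateKeys();

// Phase 2: own private key + peer public key -> shared secret ZZ.
const zzAlice = alice.computeSecret(bob.getPublicKey());
const zzBob = bob.computeSecret(alice.getPublicKey());

// Using ZZ directly: split it into send and receive MAC keys.
const half = Math.floor(zzAlice.length / 2);
const sendMacKey = zzAlice.subarray(0, half);
const receiveMacKey = zzAlice.subarray(half);

console.log(zzAlice.equals(zzBob), sendMacKey.length, receiveMacKey.length);
```

Whether that call is labelled "generate" or "derive" changes nothing
about what it needs as input.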

After thinking about it more, I think I'm more inclined to suggest
their difference is in the inputs - Generate doesn't take a "Key",
but Derive does - but even that seems arbitrary. What about multi-key
agreement (eg: X9.42)? Since Derive (currently) only supports a single
Key as input, any additional Keys would need to be supplied as
AlgorithmParameters - and once you do that, what distinction is it,
really, from simply Generate?

The dimensions that (Generate || Derive) seem to have are
(Algorithm) * (0, 1, n Keys as additional inputs) * (output of Key(s),
output of ArrayBuffer(s))
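
As a rough illustration of how the two candidate rules sort operations
along those dimensions (the descriptor objects, field names and the
example operations are hypothetical, not taken from any spec):

```javascript
// Hypothetical operation descriptors along two of the dimensions above.
const operations = [
  { name: "AES key generation", keyInputs: 0, output: "Key" },
  { name: "DH Phase 2, ZZ kept as a Key", keyInputs: 1, output: "Key" },
  { name: "DH Phase 2, ZZ returned as bytes", keyInputs: 1, output: "ArrayBuffer" },
  { name: "KDF output used as a nonce", keyInputs: 1, output: "ArrayBuffer" },
];

// Rule A: the distinction is in the inputs (Derive takes a Key, Generate doesn't).
function classifyByInputs(op) {
  return op.keyInputs === 0 ? "generate" : "derive";
}

// Rule B: the distinction is in the outputs (Generate yields a Key, Derive yields bytes).
function classifyByOutputs(op) {
  return op.output === "Key" ? "generate" : "derive";
}

for (const op of operations) {
  console.log(`${op.name}: inputs-rule=${classifyByInputs(op)}, outputs-rule=${classifyByOutputs(op)}`);
}
```

Under the outputs rule, DH Phase 2 lands on both sides depending on
what the caller asks for, which is the oddity described above.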


PKCS#11 terminology:
(Reminder: Key types: Secret, Public, Private. Secret keys may be
exportable, at which point, they MAY be opaque blobs of bytes)
- "C_GenerateKey generates a secret key or set of domain parameters,
creating a new object"
- "C_GenerateKeyPair creates a public/private key pair, creating new
key objects"
- "C_DeriveKey derives a key from a base key, creating a new key object."

CDSA
(Seems to split on symmetric vs asymmetric)
- CSSM_CSP_CreateDeriveKeyContext
"This function creates a cryptographic context to derive a symmetric
key given a handle of a CSP, an algorithm, the type of symmetric key
to derive, the length of the derived key, and an optional seed or an
optional AccessCredentials to derive a new key. The cryptographic
context handle is returned. The cryptographic context handle can be
used for calling the cryptographic derive key function"
- CSSM_CSP_CreateKeyGenContext
"This function creates a key generation cryptographic context, given a
handle of a CSP, an algorithm identification number, a pass phrase, a
modulus size (for public/private key pair generation), a key size (for
symmetric key generation), a seed, and a salt. The cryptographic
context handle is returned. The cryptographic context handle can be
used to call key/keypair generation functions"

CryptoAPI
- "The CryptDeriveKey function generates cryptographic session keys
derived from a base data value. <snip> This function is the same as
CryptGenKey, except that the generated session keys are derived from
base data instead of being random. CryptDeriveKey can only be used to
generate session keys. It cannot generate public/private key pairs"
- "The CryptGenKey function generates a random cryptographic session
key or a public/private key pair"

CNG:
- "The NCryptCreatePersistedKey function creates a new key and stores
it in the specified key storage provider."
- "The NCryptDeriveKey function derives a key from a secret agreement
value. This function is intended to be used as part of a secret
agreement procedure using persisted secret agreement keys. To derive
key material by using a persisted secret instead, use the
NCryptKeyDerivation function."
- "The NCryptKeyDerivation function creates a key from another key by
using the specified key derivation function. The function returns the
key in a byte array."

JCA
- javax.crypto.KeyGenerator "This class provides the functionality of
a secret (symmetric) key generator"
- javax.crypto.KeyAgreement "This class provides the functionality of
a key agreement (or key exchange) protocol."
- java.security.KeyPairGenerator "The KeyPairGenerator class is used
to generate pairs of public and private keys."

>
> -----Original Message-----
> From: Mark Watson [mailto:watsonm@netflix.com]
> Sent: Tuesday, August 28, 2012 9:13 AM
> To: Vijay Bharadwaj
> Cc: Ryan Sleevi; Mitch Zollinger; <public-webcrypto@w3.org>
> Subject: Re: Explanation of deriveKey (was Re: ACTION-22: Key export)
>
>
> On Aug 28, 2012, at 8:18 AM, Vijay Bharadwaj wrote:
>
>> I think the difference is in whether one defines generation vs. derivation as a characteristic of the inputs or the outputs.
>>
>> My proposal (and my understanding of Ryan's proposal):
>>
>> - Generation = defined as an operation whose output has a specific
>> structure and is useful for a specific algorithm (i.e. a key)
>> - Derivation = an operation whose output is opaque "random-looking"
>> bytes
>>
>> My understanding of your proposal:
>>
>> - Generation = an operation which generates something (a key or
>> random-looking bytes) using entropy not supplied by the caller
>> - Derivation = an operation which generates something (a key or
>> random-looking bytes) purely as a function of inputs supplied by the
>> caller
>>
>> Did I understand your proposal correctly?
>
> Yes. And what I said aligns with the current API, where deriveKey takes an input Key and generateKey does not. With the first proposal above the difference in the API should be in how the output is specified.
>
> ...Mark
>
>>
>> -----Original Message-----
>> From: Mark Watson [mailto:watsonm@netflix.com]
>> Sent: Tuesday, August 28, 2012 8:12 AM
>> To: Vijay Bharadwaj
>> Cc: Ryan Sleevi; Mitch Zollinger; <public-webcrypto@w3.org>
>> Subject: Re: Explanation of deriveKey (was Re: ACTION-22: Key export)
>>
>>
>>
>> Sent from my iPhone
>>
>> On Aug 28, 2012, at 7:21 AM, "Vijay Bharadwaj" <Vijay.Bharadwaj@microsoft.com> wrote:
>>
>>>> To me it would make sense that "generation" is an operation that creates secret information "out of nowhere" (or rather, out of something that is invisible to this API), and "derivation" is an operation that creates secret information based on some other secret information already present and visible through the API (i.e. another Key).
>>>
>>> This is appealing, but has the problem that "generation" also requires the result to have algorithm-specific structure (e.g. product of two large primes) that "derivation" algorithms typically don't guarantee.
>>>
>>> Stated differently, this proposal seems to require "derivation" to also supply a target algorithm, which isn't always needed (e.g. when generating nonces through a KDF).
>>>
>>> My mental model is more in line with what Ryan proposed.
>>
>> I was trying to capture what Ryan was proposing. What difference do you see ?
>>
>> ...Mark
>>
>>>
>>> -----Original Message-----
>>> From: Mark Watson [mailto:watsonm@netflix.com]
>>> Sent: Monday, August 27, 2012 1:22 PM
>>> To: Ryan Sleevi
>>> Cc: Mitch Zollinger; <public-webcrypto@w3.org>
>>> Subject: Re: Explanation of deriveKey (was Re: ACTION-22: Key export)
>>>
>>> Ryan,
>>>
>>> Makes sense. Could you add some text clarifying this to the specification ?
>>>
>>> One comment below...
>>>
>>> On Aug 27, 2012, at 11:48 AM, Ryan Sleevi wrote:
>>>
>>>> On Mon, Aug 27, 2012 at 11:10 AM, Mark Watson <watsonm@netflix.com> wrote:
>>>>> Ryan, all,
>>>>>
>>>>> I think the specification could do with some better explanation of the scope of the deriveKey operation.
>>>>>
>>>>> There are a few things which I believe I understand now (based on the discussion below), but which are not at all clear in the specification as it stands. I think this could be a cause of confusion if not addressed for FPWD.
>>>>>
>>>>> First, the Key object can represent any kind of secrets, not just
>>>>> keys that are suitable for the encrypt/decrypt/sign/verify
>>>>> operations. Specifically, a Key object may represent
>>>>> - the public/private values generated in the first phase of a DH
>>>>> exchange (these are not "keys" in the usual sense)
>>>>
>>>> The public/private values here are typically referred to as "DH
>>>> private keys" and "DH public keys". Even RFC 2631 refers to them as
>>>> such.
>>>>
>>>>> - the shared secret generated as the output of the second phase of
>>>>> a DH exchange (this is not a "key" yet either)
>>>>
>>>> In PKCS#11 terms, a "shared secret" is known as a "secret key". In
>>>> CNG terms, the result of key agreement is called a SECRET_HANDLE,
>>>> although the SECRET_HANDLE is admittedly a distinct type from the
>>>> KEY_HANDLE type.
>>>>
>>>> I think I take the view that the API data-type hierarchy should be
>>>> similar to the PKCS#11 data-type hierarchy documented in Section 6.4
>>>> of PKCS#11 2.20 -
>>>> http://www.cryptsoft.com/pkcs11doc/STANDARD/pkcs-11v2-20.pdf
>>>>
>>>> Within the "Key" type, there are three types of keys that may be
>>>> represented - a public key, a private key, and a secret key. For
>>>> secret keys, there may be a specific algorithm (ie: symmetric keys),
>>>> or it may be defined as a "generic" key (ie: an opaque blob-o-bytes)
>>>>
>>>>>
>>>>> Second, the deriveKey method can be used for any operation which takes one Key object and creates a new one by combining somehow with some other input information. This could be key derivation in the usual sense of extracting an appropriate amount of keying material from a secret and designating it as suitable for specific operation types. Or it could be Phase 2 of a DH key exchange, where the client private value is combined with the server public value.
>>>>
>>>> Yes.
>>>>
>>>> In CNG terms, this is the combination of the functionality provided
>>>> by the NCryptKeyDerivation and NCryptSecretAgreement functions.
>>>>
>>>> In PKCS#11 terms, this maps to C_DeriveKey.
>>>>
>>>> Admittedly, it's a problematic distinction to separate out
>>>> "generation" and "derivation". In PKCS#11 terms, for example,
>>>> PBKDF2 is a key generation mechanism, while for CNG & CDSA, it's
>>>> listed under key derivation. For DH, in PKCS#11 & CDSA, it's
>>>> derivation, while for CNG, it's secret agreement (AFAICT).
>>>>
>>>> I'm not sure what the good criteria should be for why an algorithm
>>>> should be one rather than the other. Perhaps derivation should be
>>>> any operation that yields an opaque series of bytes, while
>>>> generation should be any operation that yields a key? I'm not
>>>> entirely sure here either, which is, admittedly, problematic :)
>>>
>>> To me it would make sense that "generation" is an operation that creates secret information "out of nowhere" (or rather, out of something that is invisible to this API), and "derivation" is an operation that creates secret information based on some other secret information already present and visible through the API (i.e. another Key).
>>>
>>> If the operations are defined in those broad terms, then the DH and other usages all make sense.
>>>
>>> ...Mark
>>>
>>>>
>>>>>
>>>>> Of course, if I got either of the two things above wrong, there's
>>>>> even more need for explanation ;-)
>>>>>
>>>>> ...Mark
>>>>>
>>>>>
>>>>> On Aug 27, 2012, at 10:43 AM, Ryan Sleevi wrote:
>>>>>
>>>>>> On Sun, Aug 26, 2012 at 1:06 AM, Mitch Zollinger <mzollinger@netflix.com> wrote:
>>>>>>> Ryan,
>>>>>>>
>>>>>>> First off, thank you. Things are beginning to make more sense
>>>>>>> because of your detailed response. More inline below...
>>>>>>>
>>>>>>>
>>>>>>> On 8/25/12 1:10 PM, Ryan Sleevi wrote:
>>>>>>>>
>>>>>> <snip>
>>>>>>>> // Handles completion of Phase 2 of DH agreement
>>>>>>>> function onDHDeriveKeyComplete(keyDeriver) {
>>>>>>>>   // zz is the result of the Phase 2 of PKCS #3 and is equivalent to ZZ
>>>>>>>>   // as documented in X9.42 - aka the shared secret
>>>>>>>>   // ZZ = g ^ (xb * xa) mod p
>>>>>>>>   var zz = keyDeriver.result;
>>>>>>>
>>>>>>>
>>>>>>> Is zz the actual shared secret, or is it an opaque handle at this point?
>>>>>>
>>>>>> For sake of discussion, let's say an opaque Key handle. That is, I
>>>>>> imagine there are cases where it might be more useful to be
>>>>>> able to combine the (derive+export as opaque bytes) into a single
>>>>>> step, at which point you'd want zz to be an ArrayBuffer, but for
>>>>>> the sake of the example & discussion, I think using a Key object here is fine.
>>>>>>
>>>>>> <snip>
>>>>>>>> Am I also understanding that this is being proposed as an
>>>>>>>> OPTIONAL / MAY (not even a normative RFC SHOULD) - eg: not all
>>>>>>>> user agents need to support ProtectedKeyExchange?
>>>>>>>
>>>>>>>
>>>>>>> I don't want to add the ProtectedKeyExchange if we can meet the
>>>>>>> intended goals of a multi-step key exchange / derivation where no
>>>>>>> keying material created during the different phases is visible to
>>>>>>> the script code at any time.
>>>>>>
>>>>>> To make sure I'm on the same page - we're talking about ensuring
>>>>>> that the "Core API" (eg: excluding specific algorithm definitions)
>>>>>> makes no normative requirements on what the output of operations
>>>>>> MUST be, right? That is, to ensure the spec DOES NOT say that
>>>>>> every output MUST be an ArrayBuffer for this type of operation
>>>>>> (for example)
>>>>>>
>>>>>> If so, I'd agree, that's a reasonable concern, and I would want to
>>>>>> ensure that whatever is normatively specified is the bare minimum
>>>>>> of algorithm-independent functionality, and to leave the majority
>>>>>> of normative behaviours to individual algorithms. The "Core API"'s
>>>>>> normative behaviours should focus on state machines and error
>>>>>> handling, while the algorithms themselves should define inputs &
>>>>>> outputs.
>>>>>>
>>>>>> Is that a reasonable understanding of the concerns?
>>>>>>
>>>>>> <snip>
>>>>>>>>
>>>>>>>> So, there are two meanings of protected key exchange here
>>>>>>>> - Protected from content script, but the content script is
>>>>>>>> allowed to 'drive' the operation. I think this need is already
>>>>>>>> met (as demonstrated by the pseudo-code)
>>>>>>>
>>>>>>>
>>>>>>> This is what we're aiming for.
>>>>>>>
>>>>>>>
>>>>>>>> - Protected from the user agent (as in, secure element
>>>>>>>> provisioning), which I think is, at best, a secondary feature,
>>>>>>>> but more likely out of scope in general.
>>>>>>>
>>>>>>>
>>>>>>> I get this point. It's still somewhat unclear where this goal is
>>>>>>> incompatible with the "protected from content script" goal given
>>>>>>> that the underlying implementation could call out to a HW element.
>>>>>>> But that's more of a curiosity question.
>>>>>>
>>>>>> What I mean is whether the spec should normatively mandate or
>>>>>> claim any protection from user agents.
>>>>>>
>>>>>> I want to ensure that the spec doesn't *mandate* a secure element
>>>>>> in order to implement the normative criteria. If the definition of
>>>>>> 'protected' means you distrust the user agent, then no user agent
>>>>>> that trusts or protects users can ever actually implement this,
>>>>>> short of outsourcing the crypto.
>>>>>>
>>>>>> My view is that a conforming user agent should be able to
>>>>>> implement this in terms of storing the secrets in plain text in
>>>>>> stable storage, even when a key has been flagged 'protected'
>>>>>> (meaning: protected from /future/ content script). That you /can/
>>>>>> implement more protection - eg: by using a secure element - is
>>>>>> great, but by no means is it mandated in the API that you MUST do so.
>>>>>>
>>>>>> Make sense?
>>>>>>
>>>>>> <snip>
>>>>>>
>>>>>>>> In which case, your application would/should never request the
>>>>>>>> 'exportable' flag, and your problem should be solved.
>>>>>>>
>>>>>>>
>>>>>>> I believe this is one of the key points that I would like to make
>>>>>>> certain we agree on. Would I be correct in taking the above
>>>>>>> comment and expanding it in more detail:
>>>>>>> * Our application would never request "exportable = true".
>>>>>>> * If our application ever did request "exportable = true" the
>>>>>>> underlying implementation would throw an error.
>>>>>>> * Every phase in our key exchange + session key derivation,
>>>>>>> including the final stage, would have a result which was an
>>>>>>> opaque handle to the underlying key data, inaccessible to the script code.
>>>>>>>
>>>>>>> ?
>>>>>>
>>>>>> re: throwing an error: Yes, I think that's the intent, but that
>>>>>> means normative text should be added to the appropriate places to
>>>>>> clarify the handling of unsupported parameters/algorithms/modes
>>>>>> and raising the right exceptions. But yes, I think that's correct.
>>>>>>
>>>>>> re: every phase: I think that behaviour is going to be dictated by
>>>>>> how the algorithms are defined (what their inputs are, what their
>>>>>> outputs are). I think for the algorithms documented by this
>>>>>> WG/endorsed by the W3C, we'll want to reach consensus on what the
>>>>>> right form of each output should be. Should it be data, a single Key, multiple Keys, etc.
>>>>>> If an implementation does do something vendor-specific, then it's
>>>>>> up to that implementation to describe the outputs and their behaviour.
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
Received on Tuesday, 28 August 2012 19:34:46 UTC