
IETF mtg discussion comments




I'd like to make some comments on the technical side of what was
discussed at the IETF mtg, and to repeat some points made there, both
to clarify and add emphasis to them and to make sure they go into the
WG email/web archive.



1.  Does pre-encryption prevent MACs from being encrypted?

First, Paul Kocher said that in order to support pre-encrypted data,
the MAC (independent of whether it is an on-the-fly computed MAC or a
pre-computed MAC) must be left outside of the encryption.  That is,
the record format will include a header, pre-encrypted data, and an
un-encrypted MAC.

This is not quite correct.

What -is- required is that the MAC must go to the end of the record
(true for SSLv3.x / PCT / now-dropped TLS-draft) and that per-record
IVs be permitted for block ciphers -- or have the IV for the next
record be the last data block, excluding the MAC block(s).

Consider first the case of a stream cipher.  To have encrypted MACs,
we simply run the stream cipher to generate enough output to encrypt
(xor with) the MAC and store that with the "compiled" or pre-encrypted
version of the data.  When we've sent the encrypted data and it's time
to send the MAC, we can grab the saved stream cipher output and xor it
with the MAC prior to transmission.
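To make the bookkeeping concrete, here's a minimal sketch in Python.
The keystream function is a toy SHA-256 counter-mode generator
standing in for the negotiated stream cipher, and the MAC computation
is a stand-in too -- the point is only that the keystream bytes
covering the MAC can be generated and saved at compile time:

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream from SHA-256 (stands in for a real stream cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

key, nonce = b"k" * 16, b"n" * 8
data = b"pre-encrypted record body"
mac = hashlib.sha256(b"mac-key" + data).digest()  # stand-in for the real MAC

# "Compile" time: encrypt the data AND reserve keystream output for the MAC.
ks = keystream(key, nonce, len(data) + len(mac))
pre_encrypted = bytes(a ^ b for a, b in zip(data, ks[:len(data)]))
saved_ks_for_mac = ks[len(data):]   # stored alongside the compiled file

# Send time: XOR the saved keystream with the MAC, so the MAC goes out encrypted.
encrypted_mac = bytes(a ^ b for a, b in zip(mac, saved_ks_for_mac))
record = pre_encrypted + encrypted_mac
```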

The case of a block cipher in CBC mode is only slightly more tricky.
Assume that we have per-record IVs, and for now also assume that the
length of the data is a multiple of the cipher's block size.  If we
want to encrypt the MAC, we only need to use the last encrypted data
block as the IV to encrypt the MAC value -- this value is certainly
available, since we need to send it.  Note, however, if the next
record did not have independently specifiable IVs, we are stuck wrt
being able to pre-encrypt it.  Now, let's drop the block size multiple
restriction; we -could- save unencrypted the last partial data block
and do the last data block + IV encryption on-the-fly.  This, however,
is not a good idea -- the MAC data in this last partial data block may
be as little as a byte, so the input to the block cipher over many
keys is changing in a tightly prescribed manner, and certainly leaks
more partial information about the key than otherwise.  (This analysis
ignores the fact that, for HTTP connections at least, much of the
initial and final blocks are guessable.)

The correct thing to do is to pad out the data to a full block when we
pre-encrypt, and then encrypt the MAC as a continuation of the CBC in
a separate block.  Note that it really doesn't matter that we are
using what amounts to a fixed IV -- for that document -- for many
different MAC values (over many sessions): the purpose of IVs is to
prevent precomputation attacks w/ known plaintext inputs, and since
our MACs are keyed with a secret value and should have a random
distribution (modeling the HMAC compression function as a random
function), they neither require nor benefit from the IV anyway.
(Encrypting the MAC in CBC mode along with the data increases
client-side transparency, but is not really required.)
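Here's a sketch of that CBC continuation, with a toy XOR "block
cipher" standing in for a real one just to show the chaining
structure -- the MAC block is encrypted using the last data
ciphertext block as its IV:

```python
import hashlib

BLOCK = 16

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 'block cipher': XOR with a key-derived pad (stand-in for a real cipher)."""
    pad = hashlib.sha256(key).digest()[:BLOCK]
    return bytes(a ^ b for a, b in zip(block, pad))

def cbc_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    """Standard CBC chaining: c_i = E_k(p_i XOR c_{i-1})."""
    prev, out = iv, []
    for p in blocks:
        c = toy_encrypt_block(key, bytes(a ^ b for a, b in zip(p, prev)))
        out.append(c)
        prev = c
    return out

key, iv = b"k" * 16, b"\x00" * BLOCK
data = b"document body pad to full blk!!!"   # already padded to a block multiple
data_blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
mac = hashlib.sha256(b"mac-key" + data).digest()[:BLOCK]  # stand-in MAC, one block

# Pre-encrypt the (padded) data; the last ciphertext block is always available.
ct = cbc_encrypt(key, iv, data_blocks)
# Encrypt the MAC as a continuation of the chain, IV = last data ciphertext block.
mac_ct = cbc_encrypt(key, ct[-1], [mac])
record = b"".join(ct) + b"".join(mac_ct)
```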



2.  Constancy of Communication Channel Security

During the meeting, I stated that it is a good design principle to
have the protocol design not allow the security properties
(cryptographic functions used, key size, how keys are exchanged,
whether keys are escrowed, etc) of the communication channel to change
-- or require some way to notify the user (application) of any such
change.

If an attacker can, for example, force the communicating parties to
renegotiate the cryptographic algorithms used, some loss of security
may ensue.  Consider the following scenario: one party is a secure
server with hardware cryptographic assist for strong bulk crypto or a
strong-but-slow software crypto implementation, and the system
dynamically determines the availability of the hardware assist or
availability of cycles for the strong-but-slow algorithm (versus a
weaker-but-faster crypto function implemented in software or in some
different hardware module) based on the server load.  The client
connects to the server, and, the server being lightly loaded, gains
access to strong crypto functions in the cipher suite negotiations.
The client software checks the cryptographic strength at
TLS-connection-establishment time, and believes the channel to be
very secure.  If an attacker can force the two parties to renegotiate
the cryptographic algorithms after loading the server with many
encrypted connections, then the connection may get renegotiated to use
the weaker (but faster) cipher.  If the API does not provide a
mechanism to notify the application of the change in channel security,
this is a potential avenue of attack.
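A hypothetical API sketch (the names here are mine, not from any
draft) of what such a notification hook might look like -- the
handshake layer reports the negotiated parameters after every
(re)negotiation, and the application is called back on any change:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class ChannelSecurity:
    """Negotiated security properties the application cares about."""
    cipher: str
    key_bits: int

class Channel:
    def __init__(self, on_security_change: Callable[[ChannelSecurity, ChannelSecurity], None]):
        self._params: Optional[ChannelSecurity] = None
        self._notify = on_security_change

    def negotiated(self, params: ChannelSecurity) -> None:
        # Called by the handshake layer after each (re)negotiation;
        # surface any downgrade (or upgrade) to the application.
        if self._params is not None and params != self._params:
            self._notify(self._params, params)
        self._params = params

downgrades = []
ch = Channel(lambda old, new: downgrades.append((old, new)))
ch.negotiated(ChannelSecurity("3DES", 168))     # lightly loaded server: strong cipher
ch.negotiated(ChannelSecurity("RC4-40", 40))    # forced renegotiation to weak cipher
```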



3.  Symmetry Breaking

The above renegotiation scenario also argues against having protocol
messages that are -too- symmetric, since it may help the attacker to
force a renegotiation.  Consider the case where the renegotiation
handshake messages are symmetric; that is, the response to a
renegotiation request looks the same as the original renegotiation
request (and so on for the rest of the messages).

With this design, an active attacker may force a renegotiation by
injecting the appropriate renegotiation request message from A to B,
letting B's renegotiation response message (which, in a symmetric
design, is identical to a request message) act as the renegotiation
request to A, and then suppressing the otherwise-identical response
message from A to B.  The (active) attacker has just forced a
renegotiation, even without being able to break the cryptographic
functions used.

The protocol messages -may- be largely symmetric, of course, but
should at least include an indicator as to whether the sender believes
it initiated the renegotiation (both sides may have initiated).  This
indication should be included in the final handshake hash, which
confirms that the two sides saw the same (renegotiation) handshake
messages.
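A small sketch of the idea: fold each sender's I-initiated flag into
the transcript hash, so a reflected message (a response replayed as a
request) produces mismatched hashes on the two sides.  (The flag
encoding here is illustrative, not from any draft.)

```python
import hashlib

def handshake_hash(messages: list) -> bytes:
    """Hash the transcript, folding in each message's 'sender initiated' flag."""
    h = hashlib.sha256()
    for sender_initiated, body in messages:
        h.update(b"\x01" if sender_initiated else b"\x00")
        h.update(body)
    return h.digest()

# Honest run: A initiates, B responds (flags differ).
transcript_honest = [(True, b"renegotiate"), (False, b"renegotiate")]

# Reflection attack: B's response is replayed to A as a "request", so both
# messages now claim the initiator role -- the transcripts no longer agree.
transcript_reflected = [(True, b"renegotiate"), (True, b"renegotiate")]

assert handshake_hash(transcript_honest) != handshake_hash(transcript_reflected)
```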

Having the renegotiation request/response messages be encrypted/MACed
using the current key also prevents this attack, but since one of the
reasons that we may wish to renegotiate fresh symmetric encryption
keys is because we've used the cipher for too long and believe that
too much partial information about the key has been leaked (or
analogous idea for the MAC algorithm's key), the protection afforded
is not necessarily that good compared to just carefully breaking
symmetry.



4.  Channel Security Constancy and Pre-encryption

It may seem that the Channel Security Constancy principle would argue
against the use of pre-encryption.  (This was what Eric Rescorla
referred to during the meeting as Bennet's Law, but I bet the idea
predates me.)

In some ways, it does.  If we extend the cipher suite specification to
include a specification for pre-encryption cipher, however, then
there's no problem even when the API does not permit asynchronous user
(app) notification of channel property changes.  Furthermore, the
point of pre-encryption is actually twofold: first, the more widely
recognized desire to increase (web/ftp/etc) server performance, and
second, the security of multiply retransmitted data.

The first is pretty obvious.  If we can pre-encrypt data (via a sort
of compilation process) on servers, the servers can save on cycles
that would otherwise have to be used to do on-the-fly encryption.

The second is slightly subtler.  Re-encrypting oft-retransmitted data
under different keys, especially weak 40-bit keys, for many
transmissions provides attackers with more partial information about
the plaintext.  If there is only one encrypted version of the data
that is always re-sent, less partial information about the data is
leaked.

Notice that while the key management messages for sending the fixed
pre-encryption keys to the various clients also will be sending a
fixed datum under several different keys to many recipients, this is
less of a problem since the key in the key management protocol
messages may be sent protected by a stronger encryption function --
these messages are not used to transfer user data (export).
Furthermore, pre-encryption keys are not predictable values, unlike
most Web-related transfers (e.g., "HTTP/1.0 200 OK" as the first
line), so do not provide known plaintext for the attacker to
cryptanalyze the key, unlike the rest of the protocol message (and
other protocol messages).  (The German nursery rhyme weakness.
[Enigma cryptanalysis history.])



5.  Modularization and Layering, and Pre-encryption

As a general purpose cryptographic protocol, TLS's embodiment in an
implementation should make it easy to use but difficult to misuse.
Furthermore, the implementation should be cleanly engineered, so that
security reviews are feasible.  Does pre-encryption violate
"layering", a sound engineering principle?

As I've mentioned in other email to this list, transforming a
plaintext file into a pre-encrypted file may be thought of as a
compilation process.  This translator is intimately bound with the
record layer, since it must know the record format as well as the
ciphers to use, and the key management layer simply passes the
compiled file down to the record layer uninterpreted except for
extracting (and sending) the pre-encryption key and verifying that the
cipher spec matches.

I don't view this as a terrible layering violation -- it's nothing
more than viewing the data transmission process as incorporating a
just-in-time data compiler that caches, with the key management layer
specifying a fixed encryption key for the file.



6.  Pre-MAC-ing Data

Unlike pre-encryption, pre-MAC-ing data entails a slightly increased
security risk that must be traded off against the performance
benefits.  The security risk lies in exactly which of two
cryptographic assumptions we must make.  Without pre-MAC-ing, vanilla
HMAC is used.  This requires a hash function that is collision
intractable when used with a secret key.  The suggested pre-MAC
function,

	PreMAC(k_1,k_2,m) = h(k_1;h(k_2;m))

where ";" denotes concatenation, was arrived at after consulting with
a person that is very knowledgeable about the HMAC design; it is very
similar to the vanilla HMAC function.  k_1 and k_2 are unrelated here
(but also padded), whereas in HMAC they are the same value with
different padding to extend it to a full block size (the size of the
input for the compression function).  The point of the pre-MAC
function is that k_2 is associated with the plaintext, so may be stored
with h(k_2;m) pre-computed.
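A sketch in Python of the precomputation this buys (the padding of
the keys to a full compression-function block is omitted here, and
SHA-256 stands in for the negotiated hash):

```python
import hashlib

def premac(k1: bytes, k2: bytes, m: bytes) -> bytes:
    """PreMAC(k_1, k_2, m) = h(k_1 ; h(k_2 ; m)), where ';' is concatenation."""
    inner = hashlib.sha256(k2 + m).digest()
    return hashlib.sha256(k1 + inner).digest()

k1, k2, m = b"per-channel key", b"per-document key", b"document body"

# "Compile" time: the inner hash depends only on k_2 and the document,
# so it can be computed once and stored with the pre-encrypted data.
stored_inner = hashlib.sha256(k2 + m).digest()

# Send time: only one outer hash per connection is needed.
tag = hashlib.sha256(k1 + stored_inner).digest()
assert tag == premac(k1, k2, m)
```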

The standard HMAC security analysis holds for "outside" attackers; the
only scenario that we need to worry about is the case of one
authorized recipient of the data (Alice) -- who would then know k_2 --
spoofing messages to another authorized recipient (Bob).  To break the
PreMAC function, she needs to find a new m' and a' such that
a'=PreMAC(k_1',k_2,m'), where k_1' is the new per-channel key that Bob
shares with the server and is unknown to Alice.  The only difference
between this and vanilla HMAC is the knowledge of k_2 -- it permits
Alice to try to find unkeyed collisions, an m' where
h(k_2;m')=h(k_2;m), as a new way to compute a new m' and a'.  So,
whereas for HMAC the security is (\epsilon_f+\epsilon_F,q,t,L)-secure,
where the compression function is (\epsilon_f,q,t,b)-secure, and the
keyed iterated hash is (\epsilon_F,q,t,L)-weakly collision resistant,
PreMAC is (\epsilon_f+\epsilon_h,q,t,L)-secure, where the iterated
hash function is (\epsilon_h,t) second pre-image resistant, i.e., the
probability of finding a second pre-image is \epsilon_h after running
for time t (because there's no key, no external oracle is needed, and
there's no need in the model to count queries, since the costs are
included in the run-time t already).

To see why this is the case for the "inside" attacker, assume the
existence of an algorithm for attacking PreMAC, and use it as a
subroutine to construct an algorithm to attack the compression
function, just like in the HMAC analysis.

Note that it is necessarily the case that

	\epsilon_h >= \epsilon_F

So for "inside" attackers, this difference is her/his advantage over
the "outside" attacker, for whom PreMAC is still
(\epsilon_f+\epsilon_F,q,t,L)-secure.

Compare the "insider" (\epsilon_f+\epsilon_h,q,t,L) security of PreMAC
with the other alternative that Paul mentioned,

	PreMAC'(k,m)=HMAC(k,h(m))

For PreMAC', there is no distinction between "inside" and "outside"
attackers -- for both cases the PreMAC' is
(2\epsilon_f+\epsilon_h,q,t,L)-secure, which is not as good as PreMAC.
(Since PreMAC'(k,m)=HMAC(k,h(m))=h(k;p1;h(k;p2;h(m))), we don't get a
\epsilon_F term because h(k;p2;h(m)) is just one compression function
application on top of the h(m) computation.)
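For comparison, PreMAC' can be sketched directly with a standard HMAC
implementation; note that its precomputable part h(m) is
key-independent, which is exactly why insiders and outsiders are on
equal footing here:

```python
import hashlib
import hmac

def premac_prime(k: bytes, m: bytes) -> bytes:
    """PreMAC'(k, m) = HMAC(k, h(m)); h(m) is the precomputable part."""
    return hmac.new(k, hashlib.sha256(m).digest(), hashlib.sha256).digest()

k, m = b"per-channel key", b"document body"

# "Compile" time: h(m) involves no key at all, so anyone -- insider or
# outsider -- can search for collisions on it offline.
stored_digest = hashlib.sha256(m).digest()

# Send time: one HMAC over the cached digest.
tag = hmac.new(k, stored_digest, hashlib.sha256).digest()
assert tag == premac_prime(k, m)
```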



7.  Cryptographic "Linking" of the Password with the Master Secret

A claim was made that S/Key may be used on top of TLS to independently
give the same or similar security as the integrated password mechanism
that Dan Simon proposed, and I said that this was not true: the
cryptographic "linking" of the authenticator (the password or S/Key
response) with the exchanged master secret is better with Dan's
hash-based mechanism.

The reason for this, as Dan had pointed out, is that the MAC keys are
going to be (much) longer than the encryption keys in the foreseeable
future: if we simply transmitted the password (or S/Key authenticator)
over the encrypted channel, an attacker only has to break the (weak)
encryption in order to impersonate the user in subsequent sessions;
if we used the hash-based scheme, however, the attacker would have to
break the hash function and find a consistent preimage from observing
valid authentication hashes (violating the preimage resistance
assumption).  While key sizes only give an upper limit on the
security of the cryptographic functions and aren't directly
comparable, it is likely that the keyed hashes will be harder to break
than the weak encryption, and thus the linking of the authenticator in
the response hash will be stronger than simply transmitting the
authenticator under the (weak) encryption function.
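An illustrative sketch of the general hash-based linking idea -- this
is my own simplification, not Dan's actual mechanism: the response
binds the authenticator to the session's master secret, rather than
sending the authenticator under the bulk cipher:

```python
import hashlib

def linked_response(password: bytes, master_secret: bytes) -> bytes:
    """Bind the authenticator to this session's master secret via a hash.
    An attacker who breaks the bulk cipher sees only this digest, not a
    reusable authenticator."""
    return hashlib.sha256(master_secret + password).digest()

resp = linked_response(b"hunter2", b"session-master-secret")

# A different session has a different master secret, so an observed
# response cannot be replayed there.
assert resp != linked_response(b"hunter2", b"another-master-secret")
```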

Whether providing such a feature is desirable or not -- esp wrt the
politics of encouraging public key (I do not own stock in Security
Dynamics...) -- is a totally different question.



8.  Features are Optional

Pre-encryption, pre-MAC-ing, and password authentication are all
independent options that are (typically) server configuration choices.
If you don't want them, clients should be able to leave these out of
the crypto function negotiation.  (Though pre-encryption, in my
opinion, should always be used since it -helps- security when the
cipher strength matches [pre-encryption uses the same cipher as normal
encryption].)



9.  Misc

One person asked me after the IETF meeting what my role is --
basically saying that there's a perception that I am a "hired gun" for
Microsoft.  I just want to make my reply here, to dispel any
misconceptions (certainly when in class, if one student asks a
question, most of the time others are confused about the same thing;
I'd imagine there may be others with the same misconception).

First, I used to work for Microsoft, but only for a little over a
year.  They'd been good to me, but I wanted to go into academia to do
research and teach.  I do not currently work for them, though I may do
so again (on a temporary basis, perhaps later in the summer break).
I've also consulted for Microsoft, but not on the now-dropped TLS
draft.  (Though I wouldn't mind if somebody paid for my time.... :)

In any case, I am an assistant professor at UCSD -- an academic, with
every intention of staying so.  I believe my opinions on security
and cryptography to be *unbiased* (on the other hand, my opinions
about where the industry is going, my favorite color, etc, are of
course biased by what I've seen) -- especially since the cryptographic
analysis is just math/logic that anybody can check for themselves!


This is too long already....


-bsy

--------
Bennet S. Yee		Phone: +1 619 534 4614	    Email: bsy@cs.ucsd.edu

Web:	http://www-cse.ucsd.edu/users/bsy/
USPS:	Dept of Comp Sci and Eng, 0114, UC San Diego, La Jolla, CA 92093-0114