Redundant Unicode, Nonces, IVs, and Encryption Modes from Joseph Reagle on 2002-01-09 (xml-encryption@w3.org from January 2002)

From: Joseph Reagle <reagle@w3.org>
Date: Wed, 9 Jan 2002 14:05:12 -0500
To: xml-encryption@w3.org
Message-Id: <200201091905.OAA13841@tux.w3.org>
(This should've gone to the XML Encryption list)


----------  Forwarded Message  ----------

Subject: Re: WOOPS: xmlenc Call 13:00 EST 20020107
Date: Wed, 09 Jan 2002 16:01:52 +0100
From: Christian Geuer-Pollmann <geuer-pollmann@nue.et-inf.uni-siegen.de>
To: reagle@w3.org
Cc: XML Signature WG <w3c-ietf-xmldsig@w3.org>

...

> BTW: What do you think of Martin's additional concern about nonces not
> mitigating redundancy later in the data (like Unicode italic
> characters)? I thought that with most modern ciphers, if the
> beginning is unpredictable, that's sufficient for most of the message.
> (If you have any thoughts, please respond to the list.)

I'm not a UTF/I18n expert, so I did not understand the "Old Italic" stuff,
 but I agree with Martin that using the nonce does not increase entropy.
 It would do so only if the whole message were processed in a single
 transform.

Imagine we used a construct like a cryptographic hash function or an
 RSA encryption with a modulus size of 1,000,000 bits (just for
 illustration). If I hash or encrypt low-entropy XML data with such a
 transform, a nonce would help because the nonce would influence the
 complete transformation step. That sort of "nonce" is used when we hash
 a Unix password in /etc/shadow (the 'salt' value is the nonce) or when we
 encrypt a message using RSA-OAEP (the random parameters are the nonce).
 In these transforms, each input bit affects all of the output bits, and
 so the nonce does its job.
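
A minimal sketch of the "single transform" case, assuming SHA-256 as a
stand-in for the hash (the real /etc/shadow schemes differ, and the names
salted_digest and differing_bits are made up for illustration). The same
highly redundant input is hashed under two salts, and the differing output
bits are counted:

```python
import hashlib

def salted_digest(salt: bytes, secret: bytes) -> bytes:
    """Hash salt and secret together in one transform (SHA-256 assumed)."""
    return hashlib.sha256(salt + secret).digest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

low_entropy = b"aaaaaaaaaaaaaaaa"  # highly redundant input
d1 = salted_digest(b"salt-0001", low_entropy)
d2 = salted_digest(b"salt-0002", low_entropy)
changed = differing_bits(d1, d2)
print(f"{changed} of {len(d1) * 8} output bits differ")
```

Because the salt feeds the whole transform, the change is spread over the
entire digest rather than confined to one part of it.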

But encrypting a message in CBC is _NOT_ a single transformation: it
 splits the message into small chunks (the blocks) and transforms each
 block separately (OK, not completely separately, because we use the
 chaining to prevent reordering of blocks and that stuff). So CBC is not a
 single transform but many small transforms.

So if we use a nonce whose length is (blocklength - 1), the block
 containing the first plaintext octet (and the preceding blocklength-1
 nonce octets) gets a higher entropy, but ONLY this block does. The
 following blocks do not take advantage of the nonce. Additionally, the
 first plaintext octet is still vulnerable to malicious modification.
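
The last point can be sketched with a toy cipher. Everything below is an
assumption for illustration: the 16-octet block length, the Feistel
construction built on SHA-256 (NOT a real cipher, it only stands in for
one so the example needs no external library), the fixed nonce, and the
sample message. It shows that tampering with the IV under the nonce
octets is harmless, while tampering with the last IV octet rewrites the
first plaintext octet:

```python
import hashlib

BLOCK = 16  # block length in octets (assumed)

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += _xor(toy_decrypt_block(key, block), prev)
        prev = block
    return out

key = b"demo key, not secret"
iv = bytes(range(BLOCK))
nonce = b"\x5a" * (BLOCK - 1)   # fixed for reproducibility; random in practice
message = b"1000 EUR to Alice"  # 17 octets: 1 in the nonce block, 16 after it
ciphertext = cbc_encrypt(key, iv, nonce + message)

# Flipping an IV bit under the nonce only damages the (discarded) nonce.
iv_a = bytes([iv[0] ^ 0x01]) + iv[1:]
after_a = cbc_decrypt(key, iv_a, ciphertext)[len(nonce):]

# Flipping bits in the last IV octet rewrites the first plaintext octet.
iv_b = iv[:-1] + bytes([iv[-1] ^ ord("1") ^ ord("9")])
after_b = cbc_decrypt(key, iv_b, ciphertext)[len(nonce):]
print(after_a, after_b)
```

The attacker never touches the key; only the (cleartext) IV is modified.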

Maybe it was not stated explicitly here:

We have two different attacks against CBC-encrypted data:

Type A: breaking the encryption and revealing the plaintext.
Type B: changing the underlying plaintext without breaking the encryption.

The nonce is not necessary for stopping Type-A attacks. That is done
 by using a newly generated IV for each message. For preventing Type-A
 attacks, the IV need not be secret; it only has to be unique.

For Type-B attacks, the nonce (if its length is properly chosen as
 blocklength-1) makes them more difficult, but not impossible.

If we encrypt the IV (and forget the Nonce), we create the following
 property:

If an attacker modifies a bit in the encrypted IV value, about 50% of the
 bits in the decrypted IV will change. This toggles the same bits in the
 first plaintext block, which makes parsing a little bit complicated
 because I almost always get parser exceptions ;-))

This is only a weak form of integrity protection, but we already
 said: "If you wanna have integrity, use XML Signature".

>>  > > > - There needs to be some text about security risks associated
>>  > > >    with UTF-8. Assume that somebody knows that the encrypted
>>  > > >    text is Old Italic
>>  > > >    (http://www.unicode.org/charts/PDF/U10300.pdf, no spaces or
>>  > > >    punctuation). In this case, UTF-8 uses four bytes per
>>  > > >    character, and three of them are always the same, and the top
>>  > > >    two (or three if there are no numbers) bits of the last byte
>>  > > >    are also always the same.
>>  > I think that the Nonce can help quite a bit in some situations.
>>  > But I'm not really sure at all that it will help much in the
>>  > situation I have described. Let's assume the attacker knows
>>  > that most of the encrypted text (rather than all) is in Old
>>  > Italic. What you are saying is that if the non-Old Italic
>>  > text is at the start of the data, attacks are much more
>>  > difficult than if the non-Old Italic text is in the
>>  > middle or at the end. This may indeed be true for attacks
>>  > that are based on looking at the start of the encoded sequence.
>>  > But there are most probably also attacks that can look at
>>  > any part of the data and try to find out something about it.
>>  > In other terms, the nonce doesn't really increase the entropy,
>>  > it just conceals it.
>>  > Of course, I'm not an expert here, but I'd rather be sure.
>>
>> Ok, I will defer this to the crypto experts.
>
> I'm looking forward to the discussion.
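
The UTF-8 redundancy described in the quoted concern can be checked
directly. This sketch assumes the Old Italic letters lie in
U+10300..U+1031F and the numerals in U+10320..U+10323 (my reading of the
chart linked above), and the variable names are illustrative only:

```python
def utf8_octets(code_point: int) -> bytes:
    """Return the UTF-8 encoding of a single code point."""
    return chr(code_point).encode("utf-8")

# Assumed ranges within the Old Italic block.
letters = [utf8_octets(cp) for cp in range(0x10300, 0x10320)]
numerals = [utf8_octets(cp) for cp in range(0x10320, 0x10324)]
everything = letters + numerals

lengths = {len(e) for e in everything}         # octets per character
prefixes = {e[:3] for e in everything}         # first three octets
top2_all = {e[3] >> 6 for e in everything}     # top two bits of last octet
top3_letters = {e[3] >> 5 for e in letters}    # top three bits, letters only

print(lengths, prefixes, top2_all, top3_letters)
```

Under these assumptions, for text made of letters only, 24 + 3 = 27 of
every 32 ciphertext-relevant plaintext bits are known in advance.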

-------------------------------------------------------

-- 

Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature/
W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
Received on Wednesday, 9 January 2002 14:05:19 UTC