Re: Working Group Last Call: Encrypted Content-Encoding for HTTP from Martin Thomson on 2016-06-22 (ietf-http-wg@w3.org from April to June 2016)

From: Martin Thomson <martin.thomson@gmail.com>
Date: Wed, 22 Jun 2016 13:47:41 +1000
To: Magnus Westerlund <magnus.westerlund@ericsson.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CABkgnnXeY6DADk_fw-ZQqJCC4VGobMwT-OhaKpDr=SmsdUDgxQ@mail.gmail.com>
Thanks Magnus,

I've created a PR that I hope addresses your comments:
  https://github.com/httpwg/http-extensions/pull/202

More detail inline.

On 22 June 2016 at 00:41, Magnus Westerlund
<magnus.westerlund@ericsson.com> wrote:
> 1. Section 2:
>    The record size defaults to 4096 octets, but
>    can be changed using the "rs" parameter on the Encryption header
>    field.
>
> I think this default value is quite small. If one want to do random access
> the record boundaries becomes a question of the need for random access into
> the resource and the need for streaming data that can be released from the
> decryption and integrity verification at record boundaries. I think the text
> could be more clear on the motivation and trade-offs here.
>
> I also think there need to be discussion of the case where a single record
> with actual data is all that is needed.

Yes, the discussion on the trade-off needs to be had.  I added a
couple of paragraphs on it.  Obviously, there are lots of tradeoffs
here.

>
> 2. Section 2:
>
>
>           +------+         input of between rs-65537
>           | data |            and rs-2 octets
>           +------+      (one fewer for the last record)
>               |
>
> First I interpreted this figure part as a limitation in RS sizes. I didn't
> graso that it was RS minus number of padding bytes (2-65537) of data in this
> particular record that was the intention. I think it could benefit from a
> clarification of that it is "rs minus padding size (2-65537) number of
> bytes".

I've taken your input and tweaked the diagram and ensuing text in
light of that.  I think that it's clearer now. Let me know if you
think it could be improved further.

>
> 3. Section 2:
>
>    A sequence of full-sized records can be truncated to produce a
>    shorter sequence of records with valid authentication tags.  To
>    prevent an attacker from truncating a stream, an encoder MUST append
>    a record that contains only padding and is smaller than the full
>    record size if the final record ends on a record boundary.  A
>    receiver MUST treat the stream as failed due to truncation if the
>    final record is the full record size.
>
> This is clear on truncation at the end of the resource. However, it fails to
> describe how one detect and handle reordering or truncation from front. To
> my understanding it is the nonce derivation and the need for correctly
> knowing the sequence number of a record that prevents that attack. So that
> failure in decrypting and verifying a record may depend on reordering or
> truncation from front. Then that it may also depend on data corruption or
> other reasons is another matter.

I have added a comment explaining that the nonce construction prevents
removal and reordering of records.  Not sure if that is what you were
looking for.

> 4. Section 3.1:
>
>    rs:  The "rs" parameter contains a positive decimal integer that
>       describes the record size in octets.  This value MUST be greater
>       than 1.  If the "rs" parameter is absent, the record size defaults
>       to 4096 octets.
>
> This specifies no upper limit for the value of RS. Can I use a value larger
> than a 32-bit integer? I see there is a point of providing the implementors
> a guidance on what values may occur here. It is also clearly dependent on
> the algorithms security properties, which may introduce limitations. To my
> knowledge the limit for AES-GCM is 2^39-256 bytes to maintain the security
> properties.

Ahh yes, a good point.  I will add a note on limits.  Note that RFC
5116 says 2^36-31 per record.

The IND-CPA limits are block-based, which makes this harder, but I now
have a citation that I can use.


> 5. Section 3.2:
>
> Several of the abbreviations such as PRK and CEK are not explained.
>
>
> 6. Section 3.2:
>
>   CEK = HMAC-SHA-256(PRK, cek_info || 0x01)
>
> What is the 0x01 concatenated to the cek_info before the hashing? I would
> guess some sequence number for avoiding key reuse, but it is not clear that
> it is needed due to different inputs in the first and second step.

This is a reduction of what HKDF does.  The complete HKDF is:

   T(0) = empty string (zero length)
   T(1) = HMAC-Hash(PRK, T(0) | info | 0x01)
   T(2) = HMAC-Hash(PRK, T(1) | info | 0x02)
   T(3) = HMAC-Hash(PRK, T(2) | info | 0x03)

   T = T(1) | T(2) | T(3) | ... | T(N)
   OKM = first L octets of T

I've just included what this reduces to.

> 7. Section 4 and 4.1:
>
>    aesgcm:  The "aesgcm" parameter contains the base64url-encoded octets
>       [RFC7515] of the input keying material.
>
> I don't understand why this parameter is called "aesgcm". To me the only
> requirements on the IKM is that is is at least 16 bytes and have been
> bas64URL encoded when put into the header. So what is the relation to
> AESGCM?

This just marks the IKM as being used for this content-encoding
scheme.  That way, there's less chance that the key is used for some
other purpose.  Of course, with the info parameter on key expansion,
it's not strictly necessary, but I didn't think that a new name would
be that much of an improvement (I guess we could call it "ikm", but
it's not much clearer then).

> 8. Section 5.3:
>
> Section 3 says:
>
>    Encryption header field values with multiple instances of the same
>    parameter name are invalid.
>
> The example in 5.3 is:
>
>    PUT /thing HTTP/1.1
>    Host: storage.example.com
>    Content-Type: application/http
>    Content-Encoding: aesgcm, aesgcm
>    Content-Length: 1235
>    Encryption: keyid="mailto:me@example.com";
>                salt="NfzOeuV5USPRA-n_9s1Lag",
>                keyid="http://example.org/bob/keys/123";
>                salt="bDMSGoc2uobK_IhavSHsHA"; rs=1200
>
> Isn't this example violating the rules in Section 3 for the Encryption
> header?

You will observe that there are two "values" here, separate by the
comma.  Parameters are separated by the semi-colon.

Happy to take suggestions on how to make this clearer, HTTP header
fields are sometimes confusing.

> 9. Section 5.4 and 2:
>
> I find nothing in Section 2 that indicates that there are requirement of
> supporting a mode of a single record of encryption. My reading indicates
> that the minimal usage is possibly first a single record followed by a
> padded stop record. It is also not clear that the "rs" parameter is
> optional.

The minimal usage is a single record, with rs=<longer than the input + padding>.

The doc has: "If the "rs" parameter is absent, the record size
defaults to 4096 octets."  Added "The "rs" parameter is optional." to
this.  Is that enough?

> 10. Section 3:
>
>    The Encryption header MAY be omitted if the sender does not intend
>    for the immediate recipient to be able to decrypt the payload body.
>    Alternatively, the Encryption header field MAY be omitted if the
>    sender intends for the recipient to acquire the header field by other
>    means.
>
> I wonder about this. If the Encryption header is omitted how is the
> receiving agent knowing how to later match the resource with the correct
> key-management data. URI and etag? I mean if one include the key-id for the
> received resource then one know that one has the right key in relation to
> the resource version one have stored. It might even be the case that a later
> version of the resource is protected using the same key but with a different
> salt.

What motivated this was the out-of-band encoding, where the Encryption
header field is sent separately.  However, over time, I think that
we've concluded that the content-encoding is signaled at the same
place as the Encryption header field (see below).  I think that we can
remove this paragraph.

Originally:
200 OK
Content-Encoding: out-of-band

{ ... URLs and decryption keys ... }

More recent thinking:
200 OK
Content-Encoding: out-of-band, aesgcm
Encryption: salt=..., rs=...

{ ... URLs ... }
Received on Wednesday, 22 June 2016 03:48:10 UTC