Re: Review of draft-ietf-httpbis-message-signatures-13 from Justin Richer on 2022-11-16 (ietf-http-wg@w3.org from October to December 2022)

From: Justin Richer <jricher@mit.edu>
Date: Wed, 16 Nov 2022 18:14:48 +0000
To: Kyle Rose <krose@krose.org>
CC: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <AB0EB278-2E1A-44D3-B720-9210754A4539@mit.edu>

Hi Kyle, thanks for the review! Comments and responses inline below.

On Nov 14, 2022, at 5:29 PM, Kyle Rose <krose@krose.org> wrote:

# Review of draft-ietf-httpbis-message-signatures-13

I like the mechanism this draft describes and intend to make use of it to enable API gateways with end-to-end authentication. I'm glad this evolved beyond the initial focus on requests from ~4 years ago: at the time I was looking at technologies like OSCORE, and so when I found draft-rundgren-signed-http-requests I was immediately intrigued but wondered why it was incomplete in a fairly obvious way. I'm glad you guys thought so, too. :-)

First, I wanted to clear up some history: several drafts were taken as input considerations to this work, most notably draft-cavage-http-signatures, draft-ietf-oauth-signed-http-request, and draft-yasskin-http-origin-signed-responses. While we looked at draft-rundgren-signed-http-requests at some point, we did not base the HTTP Message Signatures draft on this document in any way. In fact, that draft repeats many of the mistakes of earlier work that we were explicitly trying to avoid.

I should note that I have followed none of the subsequent discussion or development of the draft since then, so apologies in advance if I reopen issues that have been closed via WG consensus.

2. HTTP Message Components:

* "The context MUST be consistent across all components." I'm not sure what it means for a context to be consistent across components. From looking at 7.4.3, it seems like the context is defined as whatever information the 2+ parties to a message share and agree to regard as in-bounds input to a message signature. I'm finding it hard to explain the difficulty I'm having interpreting this statement, so maybe one way to think about it is by asking the question, "What is an example of a context that is *inconsistent* across components?"

The context is actually just the context of the party either creating or verifying the signature, not something that they agree on. Let’s say, as a strawman example, that the verifier gets a signature that covers both “@query” and “@query-param”, for some weird reason. Obviously, these values should be derived from the same data source. However, some web app frameworks do weird things like override values in a pre-parsed query parameter map that is available to the application. In this case, if a naive developer signs a @query-param that ends up coming from the app framework instead of the request, or gets overridden by it, then the verifier is using two different contexts for these components.

Ultimately, what this text means is that your context needs to be consistent and well-defined when you’re pulling your component values out, and you shouldn’t change your context in the middle of processing a signature.
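
As a rough illustration of what a consistent context looks like in practice, the Python sketch below derives both “@query” and “@query-param” from the same raw request target, and separately checks whether a framework-supplied parameter map still agrees with what was on the wire. The function names, the example URI, and the use of urllib's percent-encoding are assumptions for illustration only; this is not the draft's exact encoding algorithm.

```python
from urllib.parse import parse_qsl, quote, urlsplit


def query_components(target_uri, names):
    """Derive @query and @query-param values from ONE source: the request target."""
    raw_query = urlsplit(target_uri).query
    params = parse_qsl(raw_query, keep_blank_values=True)
    components = {'"@query"': "?" + raw_query}
    for name in names:
        values = [v for k, v in params if k == name]
        if len(values) != 1:
            # Hypothetical policy: refuse missing or repeated parameters.
            raise ValueError(f"parameter {name!r} must occur exactly once")
        # Re-encode the decoded value (illustrative encoding, not normative).
        components[f'"@query-param";name="{name}"'] = quote(values[0], safe="")
    return components


def framework_map_matches(target_uri, framework_params):
    """Return True if a framework's pre-parsed map agrees with the wire request."""
    wire = dict(parse_qsl(urlsplit(target_uri).query, keep_blank_values=True))
    return all(wire.get(k) == v for k, v in framework_params.items())


uri = "https://example.com/path?a=1&b=two%20words"
components = query_components(uri, ["b"])
```

If `framework_map_matches` returns False, the application would be mixing two contexts, which is the situation the text is warning about.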

* "Within a single list of covered components, each component identifier MUST occur only once." Is there a good reason for this? I mean, it's pointless extra computation, but restricting it means adding complexity around eliminating duplicate identifiers that require more than a simple string compare (given the reordering of parameters).

There’s no additional computation needed here, really, but it helps define the properties of a well-defined signature. What would be the use case of allowing a duplicate identifier? You’d get the same value in the signature twice, since each component identifier needs to resolve to exactly one value.
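
For what it's worth, a duplicate check can be small. The sketch below is one possible approach, assuming (as the comment above implies) that parameter order is not significant when comparing two component identifiers; the helper names and the sample identifiers are hypothetical.

```python
def identifier_key(name, params):
    """Build an order-insensitive comparison key for a component identifier."""
    return (name.lower(), frozenset(params.items()))


def find_duplicates(covered):
    """Return the identifiers that repeat an earlier entry in the covered list."""
    seen = set()
    duplicates = []
    for name, params in covered:
        key = identifier_key(name, params)
        if key in seen:
            duplicates.append((name, params))
        seen.add(key)
    return duplicates


covered = [
    ("@query-param", {"name": "a"}),
    ("example-dict", {"key": "x", "req": True}),
    ("example-dict", {"req": True, "key": "x"}),  # same identifier, reordered
]
duplicates = find_duplicates(covered)
```

Identifiers with the same name but different parameters, such as two “@query-param” entries with different names, produce different keys and are not flagged.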

2.1. HTTP Fields:

* "Note that some HTTP fields, such as Set-Cookie [COOKIE], do not follow a syntax that allows for combination of field values in this manner such that the combined output is unambiguous from multiple inputs. However, the canonicalized component value is never parsed by the message signature process, merely used as part of the signature base in Section 2.5." While the canonicalized value is never parsed, it is critical that it be 1:1 with semantically distinct original values. That is, wherever two bit-distinct input representations are considered equivalent, the canonicalized values must be identical; *AND* the inverse, i.e., two input representations *not* considered equivalent must be transformed by canonicalization into bit-distinct values.¹ The reason for being precise here is that it must not be possible for two semantically-distinct sequences of Set-Cookie fields to be transformed by canonicalization into the same value, or a signature may be regarded as valid for an unintended message. If that is within the bounds of acceptable ambiguity, the safety or risks of doing so within this domain must be explained. (I can't tell whether the discussion in section 7.5.6 is sufficient to cover this.)

¹Another way of saying this is that it is fine for a canonicalization transform to lose information from the original representation so long as that information is not semantically relevant: removal of all semantically irrelevant information might be considered a core characteristic of a canonicalization transform. I am close to arguing that determining what should and should not be semantically relevant to an HTTP stack (that is, where the line between semantic and merely syntactic differences lies) is the most difficult problem posed by this draft, and is properly a problem for the entire HTTP ecosystem that deserves its own treatment. However, I recognize that the draft is intentionally leaving identification of such potential ambiguities up to individual users of this scheme while providing tools to disambiguate in such cases, and I regard that as a reasonable approach.

This non-normative note is merely to point out that things sometimes get weird, as another reply on this thread has already noted. Set-Cookie is particularly problematic: it can’t be treated as a list value like other headers, and it is usually sent multiple times. That’s what the byte sequence “bs” flag is for: it gives you an extra layer of armoring for these values.

Ultimately it’s up to individual applications of this work to figure out how best to handle the things that they need to sign, and it’s up to this draft to point out to application developers and architects what their application needs to consider. This document doesn’t pick which fields to sign or tell you how to mark them (do you use “sf”? Do you use “bs”?), but gives you the tools to make that decision.
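
To make the ambiguity concrete, here is a minimal Python sketch, not taken from the draft. It assumes that plain combination joins the trimmed field values with a comma and a space, and that the “bs” flag wraps each individual value as a base64 byte sequence between colons before joining. The function names and cookie values are made up for illustration.

```python
import base64


def combine_plain(values):
    """Join field values with ', ' (sketch of the default combination)."""
    return ", ".join(v.strip() for v in values)


def combine_bs(values):
    """Wrap each value as :base64: before joining (sketch of the 'bs' flag)."""
    wrapped = []
    for v in values:
        encoded = base64.b64encode(v.strip().encode("ascii")).decode("ascii")
        wrapped.append(":" + encoded + ":")
    return ", ".join(wrapped)


# One Set-Cookie field whose value contains a comma...
one_field = ["sid=abc; Expires=Wed, 09 Jun 2021 10:18:14 GMT"]
# ...versus two fields that happen to join into the same string.
two_fields = ["sid=abc; Expires=Wed", "09 Jun 2021 10:18:14 GMT"]

plain_collides = combine_plain(one_field) == combine_plain(two_fields)
bs_collides = combine_bs(one_field) == combine_bs(two_fields)
```

Under these assumptions the plain combination cannot tell the two inputs apart, while the wrapped form can, because the base64 alphabet contains no comma and so the separator only ever appears between values.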

* The X-Empty-Header canonicalization example is particularly confusing. I recommend changing the way you represent the values in this example (e.g., showing the encoding in octets) to make clear the actual transformation. In general, something like the output of hexdump might be a better way to encode example canonicalized values in documentation such as this, even if only in an appendix. For example:

```
00000000  61 3d 31 2c 20 20 20 20  62 3d 32 3b 78 3d 31 3b  |a=1,    b=2;x=1;|
00000010  79 3d 32 2c 20 20 20 63  3d 28 61 20 20 20 62 20  |y=2,   c=(a   b |
00000020  20 20 63 29                                       |  c)|
```

The value is an empty string, so there’s no real canonicalization here. The confusing bit comes from the document tooling stripping off trailing spaces, when the trailing space is in fact required in the signature base string. A hex dump for that one example might be better than the line-wrap hack that’s in there now.
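
As a sketch of what such a hex dump could show, the Python below builds the signature base line for an empty field value and dumps its bytes. It assumes a line has the form of the quoted identifier, a colon, a single space, then the value; the helper names are hypothetical.

```python
def signature_base_line(identifier, value):
    """Build one signature base line: "identifier", colon, space, value."""
    return f'"{identifier}": {value}'.encode("ascii")


def hexdump(data, width=16):
    """Render bytes as offset, hex columns, and a printable-ASCII column."""
    pad = width * 3 - 1
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  |{text}|")
    return "\n".join(lines)


line = signature_base_line("x-empty-header", "")
dump = hexdump(line)
```

The final byte of `line` is 0x20, the trailing space that the tooling tends to strip from the rendered example.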

2.4. Request-Response Signature Binding:

* If a request lacks a Content-Digest, there appears to be no way to cryptographically tie a response to the body of an unsigned request. I can definitely see use cases for unsigned requests with signed responses.

I do not understand the point here. You can add anything from the request to the response signature using “req” flags.
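
A rough Python sketch of what that looks like in a response's signature base follows. It assumes each line is the quoted component name, an optional “;req” parameter for values taken from the request, a colon, a space, and the value. The helper name and sample messages are made up, and the final “@signature-params” line is omitted for brevity.

```python
def response_base_lines(request, response, covered):
    """Build signature base lines for a response, pulling ';req' values from the request."""
    lines = []
    for name, from_request in covered:
        source = request if from_request else response
        identifier = f'"{name}"' + (";req" if from_request else "")
        lines.append(f"{identifier}: {source[name]}")
    return lines


# Hypothetical, already-derived component values for an unsigned request
# and the response that answers it.
request = {"@method": "POST", "@path": "/foo", "@authority": "example.com"}
response = {"@status": "200", "content-type": "application/json"}

covered = [
    ("@status", False),
    ("content-type", False),
    ("@method", True),   # taken from the request via the req flag
    ("@path", True),
]
base_lines = response_base_lines(request, response, covered)
```

The request itself does not need to carry a signature for its components to be covered this way.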

5.1. The Accept-Signature Field:

* "The requested signature..." should probably be "The signature request", as the request for a signature (rather than the signature) is the thing the client is indicating.

That’s probably better wording — I’ll take a look, thank you!

7. Security considerations:

* This section is incredibly thorough and well-written, and is a credit to the authors and contributors.

Thank you!

— Justin

Received on Wednesday, 16 November 2022 18:15:12 UTC