Re: Review of draft-ietf-httpbis-message-signatures-13 from Kyle Rose on 2022-11-18 (ietf-http-wg@w3.org from October to December 2022)

From: Kyle Rose <krose@krose.org>
Date: Fri, 18 Nov 2022 09:43:43 -0500
To: Justin Richer <jricher@mit.edu>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAJU8_nUxyMKhrwDAN5j6AoUeXMzQYbzqStcfZKiqHt+owe5MVw@mail.gmail.com>
On Wed, Nov 16, 2022 at 1:14 PM Justin Richer <jricher@mit.edu> wrote:

> First, I wanted to clear up some history: several drafts were taken as
> input considerations to this work, most notably
> draft-cavage-http-signatures, draft-ietf-oauth-signed-http-request,
> and draft-yasskin-http-origin-signed-responses. While we looked
> at draft-rundgren-signed-http-requests at some point, we did not base the
> HTTP Message Signatures draft on this document in any way. In fact, that
> draft repeats many of the mistakes of earlier work that we were explicitly
> trying to avoid.
>

Good to know. At any rate, I'm glad we ended up with this draft, because it
covers all the use cases I'm interested in.

> * "The context MUST be consistent across all components." I'm not sure what
> it means for a context to be consistent across components. From looking at
> 7.4.3, it seems like the context is defined as whatever information the 2+
> parties to a message share and agree to regard as in-bounds input to a
> message signature. I'm finding it hard to explain the difficulty I'm having
> interpreting this statement, so maybe one way to think about it is by
> asking the question, "What is an example of a context that is
> *inconsistent* across components?"
>
>
> The context is actually just the context of the party either creating or
> verifying the signature, not something that they agree on. Let’s say, as a
> strawman example, the verifier gets a signature that includes “@query” and
> “@query-param” in the signature, for some weird reason. Obviously, these
> values should be derived from the same data source. However, there are some
> web app frameworks that do weird things like override values in a
> pre-parsed query parameters map available to the application. In this case,
> if a naive developer signs a @query-param that ends up coming from the app
> framework instead of the request, or gets overridden by it, then the
> verifier is using two different contexts for these components.
>
> Ultimately, what this text means is that your context needs to be
> consistent and well-defined when you’re pulling your component values out,
> and you shouldn’t change your context in the middle of processing a
> signature.
>
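
To make the strawman in the quoted text concrete, here is a minimal sketch of a "consistent context": both the @query and @query-param values are derived from one frozen copy of the request target, never from a framework-provided map. The class and method names are hypothetical, and the draft's exact percent-encoding rules for parameter values are deliberately ignored.

```python
from urllib.parse import parse_qsl, urlsplit


class RequestContext:
    """Hypothetical holder for the single data source all components use."""

    def __init__(self, request_target: str):
        # Freeze the raw query string once, at the start of processing.
        self._query = urlsplit(request_target).query

    def query(self) -> str:
        # "@query"-style value: the raw query string with a leading "?".
        return "?" + self._query

    def query_param(self, name: str) -> str:
        # Derived from the same raw query string as query(), so the two
        # component values cannot disagree about what the request said.
        matches = [
            value
            for key, value in parse_qsl(self._query, keep_blank_values=True)
            if key == name
        ]
        if len(matches) != 1:
            raise ValueError(f"expected exactly one {name!r} parameter")
        return matches[0]


ctx = RequestContext("/path?param=value&foo=bar")

# What a framework's pre-parsed (and overridden) map might look like.
# Signing or verifying from this map would mix two contexts.
framework_params = {"param": "overridden"}
```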

Makes sense. I would then explicitly define what "context" and "consistent"
mean here, maybe replacing the entire second paragraph with something like:

The context for a signed HTTP message comprises the set of components
employed in creating and subsequently verifying an HTTP message signature.
For signatures to be verifiable by receivers, this context MUST be
identical across all parties to the signature (signer and all verifiers). A
context that is shared in such a way shall be regarded as "consistent".

Two notes, however:

   - You might instead choose to define "context" in a similar way in the
   glossary above.
   - "Consistent" is not used anywhere subsequently in the document, so you
   might instead just drop that definition.


> * "Within a single list of covered components, each component identifier
> MUST occur only once." Is there a good reason for this? I mean, it's
> pointless extra computation, but restricting it means adding complexity
> around eliminating duplicate identifiers that require more than a simple
> string compare (given the reordering of parameters).
>
>
> There’s no additional computation needed here, really, but it helps define
> the properties of a well-defined signature. What would be the use case of
> allowing a duplicate identifier? You’d get the same value in the signature
> twice, since each component identifier needs to resolve to exactly one
> value.
>

I think I'm not getting my point across adequately. Here are the two
options:

   1. (As stated in the draft) Prohibit multiple instances of a component
   identifier. This means implementations must add checks that a component
   identifier isn't included in a signature multiple times, which would
   involve prohibiting the inclusion of both `"foo";bar;baz` and
   `"foo";baz;bar` but allowing `"foo";bar;baz` and `"foo";bar;quux`.
   2. (My proposal) Allow multiple instances of the same component
   identifier. All this does is make the signature computation and
   verification take slightly longer if a duplicate identifier is included,
   and allowing it means no complex de-duplication checks.

#1 seems like a pointless guardrail in a protocol that otherwise delegates
most of the responsibility to users to do the right thing when choosing a
signature schema.
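
To illustrate what the check in option 1 involves, here is a minimal sketch. It assumes, as argued above, that two identifiers with the same name and the same parameters in a different order count as duplicates. The `(name, params)` representation and both function names are hypothetical, not taken from the draft.

```python
def canonical_identifier(name, params):
    # params is a list of (key, value) pairs; a bare flag such as `bar`
    # is represented as ("bar", True). A frozenset discards ordering.
    return (name.lower(), frozenset(params))


def has_duplicates(components):
    """Return True if any component identifier appears more than once,
    treating parameter order as insignificant."""
    seen = set()
    for name, params in components:
        key = canonical_identifier(name, params)
        if key in seen:
            return True
        seen.add(key)
    return False


reordered = [
    ("foo", [("bar", True), ("baz", True)]),
    ("foo", [("baz", True), ("bar", True)]),
]
distinct = [
    ("foo", [("bar", True), ("baz", True)]),
    ("foo", [("bar", True), ("quux", True)]),
]
print(has_duplicates(reordered))  # duplicates despite different ordering
print(has_duplicates(distinct))   # different parameter sets
```

A plain string comparison of the serialized identifiers would not catch the first case, which is the extra complexity being objected to.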


> Ultimately it’s up to individual applications of this work to figure out
> how best to handle the things that they need to sign, and it’s up to this
> draft to point out, to application developers and architects, what is
> important to figure out for the application to consider. This document
> doesn’t pick which fields to sign or tell you how to mark them (do you use
> “sf”? Do you use “bs”?), but gives you the tools to make that decision.
>

Great. That's what I wanted to hear. (Now apply that same logic to the
duplicate component identifier issue from above.)

> * The X-Empty-Header canonicalization example is particularly confusing. I
> recommend changing the way you represent the values in this example (e.g.,
> showing the encoding in octets) to make clear the actual transformation. In
> general, something like the output of hexdump might be a better way to
> encode example canonicalized values in documentation such as this, even if
> only in an appendix. For example:
>
> ```
> 00000000  61 3d 31 2c 20 20 20 20  62 3d 32 3b 78 3d 31 3b  |a=1,    b=2;x=1;|
> 00000010  79 3d 32 2c 20 20 20 63  3d 28 61 20 20 20 62 20  |y=2,   c=(a   b |
> 00000020  20 20 63 29                                       |  c)|
> ```
>
>
> The value is an empty string, there’s no real canonicalization here. The
> confusing bit comes from the document tooling stripping off trailing
> spaces, where the trailing space is in fact required in the signature base
> string. A hex dump for that one example might be better than the line wrap
> hack that’s in there now.
>

In a protocol like this in which every bit sequence matters to correctness,
clarity in the specification and chosen examples is critical. I imagine
you'll get similar feedback from the security directorate if you haven't
already.
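
As a rough illustration of the kind of rendering I have in mind, the sketch below dumps the signature base line for the empty header, trailing space included. The `hexdump` helper is hypothetical and only approximates the layout of the usual tool. The input bytes assume the line is the quoted lowercase field name, a colon, and the single required space.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex octets, and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable octets are shown as "." in the text column.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  |{text}|")
    return "\n".join(lines)


# Assumed signature base line for an empty X-Empty-Header value.
line = b'"x-empty-header": '
print(hexdump(line))
```

The final octet (20) is then visible in the output even if the document tooling strips trailing whitespace from the text itself.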

> 2.4. Request-Response Signature Binding:
>
> * If a request lacks a Content-Digest, there appears to be no way to
> cryptographically tie a response to the body of an unsigned request. I can
> definitely see use cases for unsigned requests with signed responses.
>
> I do not understand the point here. You can add anything from the request
> to the response signature using “req” flags.
>

I mean (unless I missed something) there's no way to refer to the content
body of a request, only to headers and request metadata. You can't refer to
(e.g.) @req.body to include the entire body in the signed data. If there's
a header that cryptographically depends on the entire request body (e.g.,
req.content-digest) you can refer to that. (I'll note in doing so you're
also relying on the stack having verified that digest against the message
content before processing the message, which may or may not be fine; I
don't know what normative language the HTTP ecosystem has for the
treatment of Content-Digest.)
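
For concreteness, here is a minimal sketch of the check I'm assuming the stack performs before a response signature covering the request's Content-Digest can say anything about the request body. It handles only a single sha-256 entry and compares the serialized value exactly. The function names are hypothetical, and a real implementation would parse the field as a Structured Field dictionary.

```python
import base64
import hashlib
import hmac


def content_digest(body: bytes) -> str:
    # Serialize as a one-entry dictionary: sha-256=:<base64 digest>:
    digest = hashlib.sha256(body).digest()
    return "sha-256=:" + base64.b64encode(digest).decode("ascii") + ":"


def digest_matches(header_value: str, body: bytes) -> bool:
    # Recompute the digest over the received body and compare it with
    # the header value in constant time.
    expected = content_digest(body).encode("ascii")
    received = header_value.strip().encode("utf-8")
    return hmac.compare_digest(received, expected)


body = b'{"hello": "world"}'
header = content_digest(body)
print(digest_matches(header, body))         # body matches the header
print(digest_matches(header, body + b" "))  # body was altered
```

If this check is skipped, the signature covers only the header's claim about the body, not the body itself.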

Thanks,
Kyle
Received on Friday, 18 November 2022 14:44:08 UTC