Re: Secdir last call review of draft-ietf-httpbis-message-signatures-16

Hi Daniel,

I’ve submitted a PR that addresses the editorial items. In particular, the introduction now contains more information to hopefully situate this specification in an application stack.

Please review the new text and let us know what you think: https://github.com/httpwg/http-extensions/pull/2459


 — Justin

On Mar 1, 2023, at 4:03 PM, Justin Richer <jricher@mit.edu> wrote:

Hi Daniel, thank you for your thorough review. Responses inline below.


On Feb 27, 2023, at 5:19 PM, Daniel Migault via Datatracker <noreply@ietf.org> wrote:

Reviewer: Daniel Migault
Review result: Ready

I have reviewed this document as part of the security directorate's ongoing
effort to review all IETF documents being processed by the IESG. These
comments were written primarily for the benefit of the security area
directors.
Document editors and WG chairs should treat these comments just like any other last call comments.

Most of the document relies on the specification of the components to be signed.
I have not seen anything suspect, but I am far from having the appropriate HTTP
knowledge to say there is no security flaw in the process. To be clear, I am not
trying to raise any suspicion there, but if additional security review is
needed, that is, in my opinion, where the security focus should be put. I also
see that the document has been reviewed by security people with HTTP knowledge,
so I am confident there is no need for such security concerns.

The document is pretty clear and is well written. Thank you for writing it so
clearly.

Yours,
Daniel

Some comments in line:

1.  Introduction

  Message integrity and authenticity are security properties that are
  critical to the secure operation of many HTTP applications.
  Application developers typically rely on the transport layer to
  provide these properties, by operating their application over [TLS].
  However, TLS only guarantees these properties over a single TLS
  connection, and the path between client and application may be
  composed of multiple independent TLS connections (for example, if the
  application is hosted behind a TLS-terminating gateway or if the
  client is behind a TLS Inspection appliance).  In such cases, TLS
  cannot guarantee end-to-end message integrity or authenticity between
  the client and application.
 Additionally, some operating
  environments present obstacles that make it impractical to use TLS,
  or to use features necessary to provide message authenticity.
<mglt>
Maybe we need to explain here why it is impractical. Are you thinking of
signing a component inside the HTTP message? If so, I would say that is a much
stronger reason to have a dedicated mechanism for HTTP. </mglt>

OK, we will look into expanding this and probably pulling some text from other parts of the introduction.

  Furthermore, some applications require the binding of an application-
  level key to the HTTP message, separate from any TLS certificates in
  use.
<mglt>
I do see TLS as being application-level, so maybe adding "beyond transport" may
be clearer. Currently we also mention why we need a mechanism that sits above
TLS, but maybe we should also explain why we cannot simply rely on object
security like the JOSE mechanisms. If it does not open the door to controversy,
it might be good to close that door. </mglt>

That’s a good point — the goal of this spec is to live right at the HTTP layer, and so we can call that out.

Consequently, while TLS can meet message integrity and
  authenticity needs for many HTTP-based applications, it is not a
  universal solution.

1.2.  Requirements

  HTTP applications may be running in environments that do not provide
  complete access to or control over HTTP messages (such as a web
  browser's JavaScript environment), or may be using libraries that
  abstract away the details of the protocol (such as the Java
  HTTPClient library (https://openjdk.java.net/groups/net/httpclient/intro.html)).
  These applications need to be able to generate and
  verify signatures despite incomplete knowledge of the HTTP message.

<mglt>
My personal opinion is that this text is much more convincing than the one in
the introduction. </mglt>

Noted, we’ll try to pull some of this up.


1.4.  Application of HTTP Message Signatures

  *  A means of determining that a given key and algorithm presented in
     the request are appropriate for the request being made.  For
     example, a server expecting only ECDSA signatures should know to
     reject any RSA signatures, or a server expecting asymmetric
     cryptography should know to reject any symmetric cryptography.
<mglt>
The way I am reading this sentence is that the response is signed by the server
and checked by the client. Though I understand the server may also implement an
HTTP client, I am surprised to see the server reject what I think are HTTP
responses. I am wondering if I am missing something, or if "server" is used in a
more generic sense of "HTTP entity" and could be a client or a server. In any
case this is a nit. </mglt>

This really shouldn’t say “request”, but instead “message” — the spec isn’t specific to either requests or responses and can be used on either (or both).
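
To make that concrete, here is a rough, hypothetical sketch of the kind of verifier-side policy check we have in mind; none of the names below come from the spec itself:

  # Hypothetical sketch of a verifier-side policy check; the names and the
  # policy structure here are examples, not anything defined by the draft.
  EXPECTED_ALGORITHMS = {"ecdsa-p256-sha256"}  # what this deployment accepts

  def check_signature_algorithm(alg: str) -> None:
      """Reject any message whose signature algorithm is not in the expected set."""
      if alg not in EXPECTED_ALGORITHMS:
          raise ValueError(f"unexpected signature algorithm: {alg}")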


  When choosing these parameters, an application of HTTP message
  signatures has to ensure that the verifier will have access to all
  required information needed to re-create the signature base.  For
  example, a server behind a reverse proxy would need to know the
  original request URI to make use of the derived component @target-
  uri, even though the apparent target URI would be changed by the
  reverse proxy (see also Section 7.4.3).  Additionally, an application
  using signatures in responses would need to ensure that clients
  receiving signed responses have access to all the signed portions of
  the message, including any portions of the request that were signed
  by the server using the related-response parameter.

<mglt>
I do think that this is the most difficult part of the protocol, and to make it
even harder, I am wondering why there is no normative language with a series of
MUSTs. This is mostly for my curiosity. </mglt>

I agree that this is the hardest part of applying this protocol to real situations. The truth of it is, there is no universal set of components that we can mandate to be signed at all times for all applications. Some applications are going to really need to protect the full URL, some won’t be able to because the path is going to be mucked up by some piece of infrastructure. Some are going to want to protect all headers and reject anything with any unsigned component, some are going to have specific headers injected or removed as a matter of course. This is why the requirement is for an application of this specification to choose required components — which isn’t easy to do, but hopefully the guidance here helps out.

Ultimately, this specification is a tool that’s designed to live at the HTTP layer, and while it’s important that it get used well, the best we could hope to provide is guidance about what to choose.
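
To illustrate, an application profile’s requirement often boils down to a simple covered-components check on the verifier side. Here is a hedged sketch; the specific component set is only an example, not a recommendation from the draft:

  # Hedged sketch: one way an application profile might enforce its chosen set
  # of required covered components. The set below is only an example.
  REQUIRED_COMPONENTS = {"@method", "@target-uri", "content-digest"}

  def check_covered_components(covered: set[str]) -> None:
      """Fail verification if the signature does not cover the required set."""
      missing = REQUIRED_COMPONENTS - covered
      if missing:
          raise ValueError(f"missing required covered components: {sorted(missing)}")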


2.  HTTP Message Components

  In order to allow signers and verifiers to establish which components
  are covered by a signature, this document defines component
  identifiers for components covered by an HTTP Message Signature, a
  set of rules for deriving and canonicalizing the values associated
  with these component identifiers from the HTTP Message, and the means
  for combining these canonicalized values into a signature base.

  The signature context for deriving these values MUST be accessible to
  both the signer and the verifier of the message.  The context MUST be
  the same across all components in a given signature.  For example, it
  would be an error to use the raw query string for the @query
  derived component but combined query and form parameters for the
  @query-param derived component.  For more considerations of the
  message component context, see Section 7.4.3.

  A component identifier is composed of a component name and any
  parameters associated with that name.  Each component name is either
  an HTTP field name (Section 2.1) or a registered derived component
  name (Section 2.2).  The possible parameters for a component
  identifier are dependent on the component identifier, and the HTTP
  Signture
<mglt> Sin"a"ture </mglt>

Fixed, thanks!


Component Parameters registry cataloging all possible
  parameters is defined in Section 6.5.

  Within a single list of covered components, each component identifier
  MUST occur only once.  One component identifier is distinct from
  another if either the component name or its parameters differ.
<mglt>
English is not my native language, but I am wondering if "either ... or"
includes both. I am assuming component names and parameters can both be
different. From the Cambridge dictionary: "used to refer to a situation in which
there is a choice between two different plans of action, but both together are
not possible".
</mglt>

Yes, it is an inclusive “or” — I’m not sure if that needs to be made more explicit but it wouldn’t hurt.
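
As a small illustration of that inclusive reading, two component identifiers only collide when both the name and all of the parameters match; this is just a sketch, not code from the spec:

  # Sketch: model a component identifier as a name plus its parameters; two
  # identifiers are the same only when the name AND the parameters all match,
  # so they are distinct if either one (or both) differs.
  def same_identifier(a: tuple[str, dict], b: tuple[str, dict]) -> bool:
      return a[0] == b[0] and a[1] == b[1]

  # ("@query-param", {"name": "var"}) vs ("@query-param", {"name": "bar"})
  # -> distinct identifiers; both may appear in one covered components list.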


2.1.  HTTP Fields

  The component name for an HTTP field is the lowercased form of its
  field name as defined in Section 5.1 of [HTTP].  While HTTP field
  names are case-insensitive, implementations MUST use lowercased field
  names (e.g., content-type, date, etag) when using them as component
  names.

  The component value for an HTTP field is the field value for the
  named field as defined in Section 5.5 of [HTTP].  The field value
  MUST be taken from the named header field of the target message
  unless this behavior is overridden by additional parameters and
  rules, such as the req and tr flags, below.
  Unless overridden by additional parameters and rules, HTTP field
  values MUST be combined into a single value as defined in Section 5.2
  of [HTTP] to create the component value.  Specifically, HTTP fields
  sent as multiple fields MUST be combined using a single comma (",")
  and a single space (" ") between each item.  Note that intermediaries
  are allowed to combine values of HTTP fields with any amount of
  whitespace between the commas, and if this behavior is not accounted
  for by the verifier, the signature can fail since the signer and
  verifier will be see
<mglt> will see </mglt>

Got it, thank you!


a different component value in their respective
  signature bases.  For robustness, it is RECOMMENDED that signed
  messages include only a single instance of any field covered under
  the signature, particularly with the value for any list-based fields
  serialized using the algorithm below.  This approach increases the
  chances of the field value remaining untouched through
  intermediaries.
<mglt>
For my own information, I am wondering why having a signature over two
instances of the same field, rather than a single instance, increases the
chances of one of the fields being modified on path. </mglt>

When you sign a field, you sign the combined value of that field, no matter if it comes in multiple headers or not. Intermediaries are allowed to combine fields with the same name (with some caveats), but that could potentially lead to inconsistent processing of the values for the purposes of the signature. So what we’re saying here is that when you’re sending a header with two values like:

Foo: bar
Foo: baz

Then if possible you should send it as a pre-combined single value using the combination method referenced, as:

Foo: bar, baz

This helps prevent something like this, where an intermediary combines the field without using the separating whitespace:

Foo: bar,baz

HTTP is messy enough to allow all of this behavior, so the recommendation is to send information in a way that it’s less likely — but not impossible — for it to be chopped up in transit in a way that would break a signature.
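
If it helps, here is a rough, non-normative sketch of the ", " combination step described above:

  # Sketch of turning a possibly multi-instance field into a single component
  # value: lowercase the name, trim each value, and join with ", ".
  def field_component_value(fields: list[tuple[str, str]], name: str) -> str:
      values = [v.strip() for n, v in fields if n.lower() == name.lower()]
      return ", ".join(values)

  assert field_component_value([("Foo", "bar"), ("Foo", "baz")], "foo") == "bar, baz"
  # Resulting signature base line:
  #   "foo": bar, baz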


2.2.4.  Scheme

  The @scheme derived component refers to the scheme of the target URL
  of the HTTP request message.  The component value is the scheme as a
  lowercase string as defined in [HTTP], Section 4.2.  While the scheme
  itself is case-insensitive, it MUST be normalized to lowercase for
  inclusion in the signature base.

  For example, the following request message requested over plain HTTP:

  POST /path?param=value HTTP/1.1
  Host: www.example.com

  Would result in the following @scheme component value:

  http

  And the following signature base line:

  "@scheme": http

<mglt>
For my information, I am wondering how the signer can distinguish the http
scheme from https as it does not appear in the HTTP message. Since we only deal
with HTTP, I am wondering if the https scheme is not replaced by http. </mglt>

For a request (to which this applies), the signer can distinguish it because they know the URL they’re going to send the HTTP message to, including the scheme. For a verifier, it would need to know through some mechanism what the incoming URL would be in full.
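
As a small sketch of the signer side, the value simply falls out of the target URI the client already knows; the verifier has to be told or configured with that URI:

  # Sketch: a signing client derives @scheme from the target URI it is about
  # to use; the verifier must learn that URI through configuration or similar.
  from urllib.parse import urlsplit

  target_uri = "https://www.example.com/path?param=value"  # example value
  scheme = urlsplit(target_uri).scheme.lower()
  signature_base_line = f'"@scheme": {scheme}'   # -> "@scheme": https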


2.2.5.  Request Target

<mglt>
For my information, I am wondering how one can make the target unique.
Typically, do we include the port even when the default port is being used and
could thus be omitted? </mglt>

The target doesn’t always need to be unique, so I might be missing what the question is here. Regardless, adding default ports doesn’t really help this, especially when in the wild the default ports are almost always removed.


7.1.2.  Use of TLS

  The use of HTTP Message Signatures does not negate the need for TLS
  or its equivalent to protect information in transit.  Message
  signatures provide message integrity over the covered message
  components but do not provide any confidentiality for the
  communication between parties.

  TLS provides such confidentiality between the TLS endpoints.  As part
  of this, TLS also protects the signature data itself from being
  captured by an attacker, which is an important step in preventing
  signature replay (Section 7.2.2).

  When TLS is used, it needs to be deployed according to the
  recommendations in [BCP195].

<mglt>
The signature is only focused on authentication and TLS 1.3 always has
encryption, so the overlap remains only in the case where a NULL cipher would be
used. My understanding of OSCORE is that it restricts the protection of the
header and that authentication only is permitted, which makes it potentially a
more interesting protocol in terms of overlap. I have the impression you are
able to achieve, in terms of integrity, similar protection to OSCORE, as the
HTTP signature can include header fields. I am wondering if that would be worth
mentioning.
</mglt>

We chose not to mention OSCORE so as not to confuse readers who wouldn’t be familiar with that. This specification can’t directly be applied to the CORE/COAP/COSE world. I believe it does have similar integrity protection properties, but I don’t think there’s a lot of value in bringing that up specifically here.


7.2.2.  Signature Replay

  Since HTTP Message Signatures allows sub-portions of the HTTP message
  to be signed, it is possible for two different HTTP messages to
  validate against the same signature.  The most extreme form of this
  would be a signature over no message components.  If such a signature
  were intercepted, it could be replayed at will by an attacker,
  attached to any HTTP message.  Even with sufficient component
  coverage, a given signature could be applied to two similar HTTP
  messages, allowing a message to be replayed by an attacker with the
  signature intact.

<mglt>I see this as a repeat of 7.2.1. </mglt>

They are similar but separate issues. 7.2.1 says “cover as much of the message as you can”, and 7.2.2 says “even if you cover everything you think of then someone could replay the signature with a similar-enough message”. This is why 7.2.2 recommends the use of signature parameters such as `nonce` for replay protection.


  To counteract these kinds of attacks, it's first important for the
  signer to cover sufficient portions of the message to differentiate
  it from other messages.  In addition, the signature can use the nonce
  signature parameter to provide a per-message unique value to allow
  the verifier to detect replay of the signature itself if a nonce
  value is repeated.  Furthermore, the signer can provide a timestamp
  for when the signature was created and a time at which the signer
  considers the signature to be expired, limiting the utility of a
  captured signature value.

  If a verifier wants to trigger a new signature from a signer, it can
  send the Accept-Signature header field with a new nonce parameter.
  An attacker that is simply replaying a signature would not be able to
  generate a new signature with the chosen nonce value.

<mglt>I do see two different problems here: 1) Do I have the signature of the
message? and 2) Do I have the signature of the response? I do see the first as
remaining related to 7.2.1 and the cookie solving 2). 1) enables caching and
might be relevant for public data, for the time that public data is valid. In
both cases there is a replay, I agree, but I am wondering if there is any
recommendation regarding the use of the new nonce.</mglt>

The processing of signatures on requests and responses is going to be pretty different, especially in terms of replay protection. I would expect a cached response to return a cached signature over the response as well. The verifier (the client in this case) would need to decide if the nonce repetition was meaningful in this case. If you’re OK with a cached response you probably wouldn’t use or pay much attention to the nonce.
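
For a verifier that does care about replay, a minimal sketch might look like the following; the in-memory nonce store and the freshness window are assumptions on my part, not anything the draft mandates:

  # Hedged sketch of verifier-side replay detection built on the nonce,
  # created, and expires signature parameters.
  import time

  seen_nonces: set[str] = set()

  def check_replay(nonce: str, created: int, expires: int, max_age: int = 300) -> None:
      now = int(time.time())
      if now > expires or now - created > max_age:
          raise ValueError("signature expired or too old")
      if nonce in seen_nonces:
          raise ValueError("repeated nonce: possible signature replay")
      seen_nonces.add(nonce)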


7.2.8.  Message Content

  As discussed in [DIGEST], the value of the Content-Digest field is
  dependent on the content encoding of the message.  If an intermediary
  changes the content encoding, the resulting Content-Digest value
  would change, which would in turn invalidate the signature.  Any
  intermediary performing such an action would need to apply a new
  signature with the updated Content-Digest field value, similar to the
  reverse proxy use case discussed in Section 4.3.

<mglt> This seems to suggest some sort of policy. For my information, I am
wondering whether current implementations are using such policies to configure
the validator, or if that is "left to the implementation", meaning we somehow
trust the signer to perform the correct operation. </mglt>

This is meant to be a warning for implementors — if you change the encoding, then you change the digest, which will change the signature. Understanding this might affect how you deploy your intermediaries in an environment using signatures and digests.
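
A quick way to see the effect, as a sketch using sha-256 and gzip purely for illustration:

  # Sketch: re-encoding the content changes its digest, so a signature that
  # covers Content-Digest would no longer verify after the change.
  import base64, gzip, hashlib

  content = b'{"hello": "world"}'
  digest_plain = base64.b64encode(hashlib.sha256(content).digest())
  digest_gzip = base64.b64encode(hashlib.sha256(gzip.compress(content)).digest())
  assert digest_plain != digest_gzip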


7.3.1.  Cryptography and Signature Collision

  The HTTP Message Signatures specification does not define any of its
  own cryptographic primitives, and instead relies on other
  specifications to define such elements.  If the signature algorithm
  or key used to process the signature base is vulnerable to any
  attacks, the resulting signature will also be susceptible to these
  same attacks.

  A common attack against signature systems is to force a signature
  collision, where the same signature value successfully verifies
  against multiple different inputs.  Since this specification relies
  on reconstruction of the signature base from an HTTP message, and the
  list of components signed is fixed in the signature, it is difficult
  but not impossible for an attacker to effect such a collision.  An
  attacker would need to manipulate the HTTP message and its covered
  message components in order to make the collision effective.

<mglt> Not being familiar enough with HTTP, the attack is especially an issue
if the client is able to predict the HTTP response; an echo server is a good
example. I am wondering what the server could respond in order to differ from
the expected responses. One way to see that is to make any response
unique.</mglt>

That depends on if you’re signing the request or the response (or both), and which part’s under attack. In all cases, use of the “created” and “nonce” parameters can trivially make each signature unique and time-bound.


7.5.5.  Canonicalization Attacks

  Any ambiguity in the generation of the signature base could provide
  an attacker with leverage to substitute or break a signature on a
  message.  Some message component values, particularly HTTP field
  values, are potentially susceptible to broken implementations that
  could lead to unexpected and insecure behavior.  Naive
  implementations of this specification might implement HTTP field
  processing by taking the single value of a field and using it as the
  direct component value without processing it appropriately.

  For example, if the handling of obs-fold field values does not remove
  the internal line folding and whitespace, additional newlines could
  be introduced into the signature base by the signer, providing a
  potential place for an attacker to mount a signature collision
  (Section 7.3.1) attack.  Alternatively, if header fields that appear
  multiple times are not joined into a single string value, as is
  required by this specification, similar attacks can be mounted as a
  signed component value would show up in the signature base more than
  once and could be substituted or otherwise attacked in this way.

  To counter this, the entire field value processing algorithm needs to
  be implemented by all implementations of signers and verifiers.

<mglt>
In the canonicalization process, I would like to know if we have a simple way
to ensure that Canonicalization( HTTP field1: HTTP_value1 ) will never end up
equal to Canonicalization( HTTP fielda: HTTP_valuea \n HTTP fieldb: HTTP_valueb ).

I have the impression this is the main potential source of weakness this
document may introduce. Note that I am not familiar with HTTP, so do not be
upset if the question is straightforward and obvious. I think the reason I have
this in mind is that some separators are replaced during the canonicalization.

</mglt>


Yes, this condition is accounted for in the current requirement for derived component values:

Derived component values MUST be limited to printable characters and spaces and MUST NOT contain any newline characters. Derived component values MUST NOT start or end with whitespace characters.

Field values already can’t contain newlines, and whitespace is stripped before adding to the message.
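
As a rough sketch of what that processing amounts to for field values (illustrative only, not the complete algorithm from the draft):

  # Sketch of the field-value clean-up that keeps newlines out of the
  # signature base: collapse obs-fold line folding into a single space and
  # trim leading/trailing whitespace.
  import re

  def canonicalize_field_value(raw: str) -> str:
      unfolded = re.sub(r"\r?\n[ \t]+", " ", raw)
      return unfolded.strip()

  assert canonicalize_field_value("value1;\r\n  value2 ") == "value1; value2"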

The one holdout is covered in this issue about HTML query parameters (@query-param):

https://github.com/httpwg/http-extensions/issues/2417


At the moment, the authors are looking for advice from the community about the right way to encode these values to protect against this potential problem.

We’ll work up a pull request to address the text issues identified here. If there is any additional follow up discussion, please don’t hesitate to continue the conversation.

Thank you once again for your review,
 — Justin
