
Re: Structured Fields: whitespace in binary content

From: Nick Harper <nharper@google.com>
Date: Wed, 28 Oct 2020 10:59:36 -0700
Message-ID: <CACdeXiK4hVgM6jJSoUEz3_Y8ODaEsQP5W4yJq+rMqV-QWeTymA@mail.gmail.com>
To: Mark Nottingham <mnot@mnot.net>
Cc: HTTP Working Group <ietf-http-wg@w3.org>, Poul-Henning Kamp <phk@varnish-cache.org>
On Wed, Oct 28, 2020 at 1:28 AM Mark Nottingham <mnot@mnot.net> wrote:

> Structured Fields is in AUTH48 and we've addressed everything that's come
> up except for one very late entrant. I know this is very last minute, but
> I'm becoming convinced that this is something we should consider changing
> before shipping.
>
> Background: I've written a script that validates HTTP messages in RFC XML,
> including Structured Fields. See:
>   https://pypi.org/project/rfc-http-validate/
>
> Applying this to our current drafts, I encountered a problem; if a header
> field contains binary data, it's extremely likely that it will need to wrap
> across multiple lines to fit into the RFC. As a reminder, such folded lines
> are required by HTTP to be replaced by one or more spaces in
> <https://httpwg.org/http-core/draft-ietf-httpbis-semantics-latest.html#field-values>.
>
> For example, here is the PR for the Signature draft:
>   https://github.com/httpwg/http-extensions/pull/1319
>
> At first I thought this could be addressed by an editorial note explaining
> that whitespace folding is different in examples. However, things like this
> make that unworkable:
>
> ~~~ http-message
> Signature-Input: sig1=(*request-target *created host date
>     cache-control x-empty-header x-example); keyid="test-key-a";
>     alg=hs2019; created=1402170695; expires=1402170995
> Signature: sig1=:K2qGT5srn2OGbOIDzQ6kYT+ruaycnDAAUpKv+ePFfD0RAxn/1BUe
>     Zx/Kdrq32DrfakQ6bPsvB9aqZqognNT6be4olHROIkeV879RrsrObury8L9SCEibe
>     oHyqU/yCjphSmEdd7WD+zrchK57quskKwRefy2iEC5S2uAH0EPyOZKWlvbKmKu5q4
>     CaB8X/I5/+HLZLGvDiezqi6/7p2Gngf5hwZ0lSdy39vyNMaaAT0tKo6nuVw0S1MVg
>     1Q7MpWYZs0soHjttq0uLIA3DIbQfLiIvK6/l0BdWTU7+2uQj7lBkQAsFZHoA96ZZg
>     FquQrXRlmYOh+Hx5D9fJkXcXe5tmAg==:
> ~~~
>
> As you can see, whitespace in folding is semantically significant in
> Signature-Input (if it's lost, delimitation will be lost too), whereas it
> needs to be removed for Signature to contain valid binary content.
>

Another option is to use a line-folding strategy from RFC 8792 instead of
using the obsolete line folding from section 3.2.4 of RFC 7230.
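
To illustrate, here is a minimal sketch of undoing the single-backslash
folding strategy from RFC 8792. It assumes the informational header line
that RFC 8792 places at the top of a folded block has already been removed,
and the field values in the usage lines are shortened, hypothetical
stand-ins rather than the real Signature example. The function name is my
own.

```python
import re


def unfold_rfc8792(text: str) -> str:
    """Undo single-backslash folding (sketch; RFC 8792 header not handled)."""
    # A fold is a backslash immediately before a line break. Unfolding drops
    # the backslash, the line break, and any leading spaces on the next line.
    return re.sub(r"\\\n *", "", text)


# Binary content: the fold sits inside the base64, so nothing is inserted.
folded_binary = "Signature: sig1=:aGVsbG8g\\\n    d29ybGQ=:"
print(unfold_rfc8792(folded_binary))

# Whitespace-delimited content: the significant space stays before the fold.
folded_list = "Signature-Input: sig1=(host \\\n    date)"
print(unfold_rfc8792(folded_list))
```

Because the fold marker is explicit, whitespace that matters is written
before the backslash and survives unfolding, so the same rule works for both
fields without changing the Structured Fields syntax.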

>
> So, the obvious fix is to allow whitespace inside binary content.
> Delimitation won't be lost, because it's ":" on both ends. The base64
> parsers I checked already swallow whitespace in input (not surprising since
> the motivating use case for base64 was line-wrapped MIME).
>
> The question is whether it's too late to do this. Personally I think it's
> worth it; otherwise we're going to have some pretty confusing specs, and
> that's likely to lead to problems. Also, the delta to the spec and
> implementations is very small. Also, if there's some implementation lag I
> think that's workable, because this is less likely to be seen on the wire,
> and there aren't too many adopters of binary content yet.
>
> What do folks think? I'll start a PR to show what it'd be like, but I
> wanted to get early impressions ASAP.
>
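
For reference, a small sketch of the base64 behaviour mentioned above, using
Python's standard library as one example of a decoder. This is only an
illustration of one implementation, not a survey; the input is a
hypothetical folded value and strict_decode is a name I made up.

```python
import base64
import binascii

# "hello world" in base64, wrapped with a line break and indentation.
wrapped = b"aGVsbG8g\n    d29ybGQ="

# By default (validate=False) characters outside the base64 alphabet,
# including whitespace, are discarded before decoding.
lenient = base64.b64decode(wrapped)
print(lenient)


def strict_decode(data: bytes) -> bytes:
    """Decode base64, rejecting any character outside the alphabet."""
    return base64.b64decode(data, validate=True)


try:
    strict_decode(wrapped)
except binascii.Error as exc:
    print("strict decoder rejected the input:", exc)
```

So whether embedded whitespace is accepted depends on how the decoder is
invoked, which is part of why I'd rather not make it part of the format.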

I don't think it makes sense to modify the format of structured headers to
be compatible with an obsolete line folding format (we already have
language stating that a sender MUST NOT use it). Allowing whitespace
seems only to add complexity and ambiguity. The only benefit I see is for
writing documentation where lines need to be wrapped, and there are better
solutions for that: using RFC 8792 line folding; adding an editorial
comment that whitespace folding differs for specific header fields (e.g.
applying different rules to Signature vs. Signature-Input); or having
tooling detect that certain header fields use structured headers and
automatically apply different folding rules to them.
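
As a rough sketch of the tooling option, a validator could unfold examples
with a per-field rule. Everything here is hypothetical: the BINARY_FIELDS
set, the function name, and the shortened field values are assumptions for
illustration, and this is not how rfc-http-validate works as far as I know.

```python
# Hypothetical: fields whose example values hold binary content, so wrapped
# lines are joined with nothing in between.
BINARY_FIELDS = {"signature"}


def unfold_example(lines: list[str]) -> list[str]:
    """Join indented continuation lines onto the preceding field line."""
    fields: list[str] = []
    for line in lines:
        if line[:1] in (" ", "\t") and fields:
            # Continuation line: choose the joiner from the field name.
            name = fields[-1].split(":", 1)[0].strip().lower()
            joiner = "" if name in BINARY_FIELDS else " "
            fields[-1] += joiner + line.strip()
        else:
            fields.append(line.rstrip())
    return fields


example = [
    'Signature-Input: sig1=(host date',
    '    cache-control); keyid="test-key-a"',
    'Signature: sig1=:aGVsbG8g',
    '    d29ybGQ=:',
]
for field in unfold_example(example):
    print(field)
```

This keeps the delimiting space in Signature-Input while producing
contiguous base64 for Signature, with the special-casing living in the
documentation tooling rather than in the wire format.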

>
> Thanks (and sorry for not seeing this earlier),
>
> --
> Mark Nottingham   https://www.mnot.net/
>
>
>
Received on Wednesday, 28 October 2020 18:00:01 UTC
