
Re: Structured Fields: whitespace in binary content

From: Cory Benfield <cory@lukasa.co.uk>
Date: Wed, 28 Oct 2020 10:12:37 +0000
Message-ID: <CAH_hAJGa+g3fc2tqV_-MeVGPD0fZds6OPDuWOKUL0U35=X4u2Q@mail.gmail.com>
To: Mark Nottingham <mnot@mnot.net>
Cc: HTTP Working Group <ietf-http-wg@w3.org>, Poul-Henning Kamp <phk@varnish-cache.org>

I don't love this change. It's a frustrating additional complexity in
the parser, and it seems unfortunate that we're adding it for reasons
that have nothing to do with whether we want it and everything to do
with limitations of the format we publish the spec in. I think we
shouldn't do it because if we wrote down in the RFC _why_ SP is
allowed, we'd feel fairly silly.

On Wed, 28 Oct 2020 at 08:49, Mark Nottingham <mnot@mnot.net> wrote:
>
> Here's the diff:
>   https://github.com/httpwg/http-extensions/pull/1322/files
>
> There are other ways to do this, of course, and I suspect that many implementations will just feed the SP into their base64 parser, if they're confident it will handle it well.
>
> Cheers,
>
>
> > On 28 Oct 2020, at 7:23 pm, Mark Nottingham <mnot@mnot.net> wrote:
> >
> > Structured Fields is in AUTH48 and we've addressed everything that's come up except for one very late entrant. I know this is very last minute, but I'm becoming convinced that this is something we should consider changing before shipping.
> >
> > Background: I've written a script that validates HTTP messages in RFC XML, including Structured Fields. See:
> >  https://pypi.org/project/rfc-http-validate/
> >
> > Applying this to our current drafts, I encountered a problem; if a header field contains binary data, it's extremely likely that it will need to wrap across multiple lines to fit into the RFC. As a reminder, such folded lines are required by HTTP to be replaced by one or more spaces in <https://httpwg.org/http-core/draft-ietf-httpbis-semantics-latest.html#field-values>.
> >
> > For example, here is the PR for the Signature draft:
> >  https://github.com/httpwg/http-extensions/pull/1319
> >
> > At first I thought this could be addressed by an editorial note explaining that whitespace folding is different in examples. However, things like this make that unworkable:
> >
> > ~~~ http-message
> > Signature-Input: sig1=(*request-target *created host date
> >     cache-control x-empty-header x-example); keyid="test-key-a";
> >     alg=hs2019; created=1402170695; expires=1402170995
> > Signature: sig1=:K2qGT5srn2OGbOIDzQ6kYT+ruaycnDAAUpKv+ePFfD0RAxn/1BUe
> >     Zx/Kdrq32DrfakQ6bPsvB9aqZqognNT6be4olHROIkeV879RrsrObury8L9SCEibe
> >    oHyqU/yCjphSmEdd7WD+zrchK57quskKwRefy2iEC5S2uAH0EPyOZKWlvbKmKu5q4
> >    CaB8X/I5/+HLZLGvDiezqi6/7p2Gngf5hwZ0lSdy39vyNMaaAT0tKo6nuVw0S1MVg
> >    1Q7MpWYZs0soHjttq0uLIA3DIbQfLiIvK6/l0BdWTU7+2uQj7lBkQAsFZHoA96ZZg
> >    FquQrXRlmYOh+Hx5D9fJkXcXe5tmAg==:
> > ~~~
> >
> > As you can see, whitespace in folding is semantically significant in Signature-Input (if it's lost, delimitation will be lost too), whereas it needs to be removed for Signature to contain valid binary content.
> >
> > So, the obvious fix is to allow whitespace inside binary content. Delimitation won't be lost, because it's ":" on both ends. The base64 parsers I checked already swallow whitespace in input (not surprising since the motivating use case for base64 was line-wrapped MIME).
> >
> > The question is whether it's too late to do this. Personally I think it's worth it; otherwise we're going to have some pretty confusing specs, and that's likely to lead to problems. The delta to the spec and implementations is also very small, and if there's some implementation lag I think that's workable, because this is less likely to be seen on the wire and there aren't too many adopters of binary content yet.
> >
> > What do folks think? I'll start a PR to show what it'd be like, but I wanted to get early impressions ASAP.
> >
> > Thanks (and sorry for not seeing this earlier),
> >
> > --
> > Mark Nottingham   https://www.mnot.net/
> >
> >
>
> --
> Mark Nottingham   https://www.mnot.net/
>
>
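
To make the folding problem in the quoted proposal concrete, here is a minimal Python sketch. The names unfold, parse_byte_sequence and allow_sp are hypothetical, invented for illustration; they come neither from the Structured Fields spec nor from rfc-http-validate. The sketch assumes each fold is replaced by a single SP and that the binary item is delimited by ":" on both ends, as described above. It illustrates the proposal under discussion, not any published parsing algorithm.

```python
import base64


def unfold(lines):
    """Join a folded field value, replacing each fold with a single SP."""
    first, *rest = lines
    return " ".join([first.rstrip()] + [line.strip() for line in rest])


def parse_byte_sequence(item, allow_sp=False):
    """Decode a ':'-delimited base64 item.

    allow_sp is a hypothetical switch modelling the proposed change:
    when True, SP inside the delimiters is dropped before decoding.
    """
    if len(item) < 2 or not (item.startswith(":") and item.endswith(":")):
        raise ValueError("byte sequence must be delimited by ':'")
    content = item[1:-1]
    if allow_sp:
        content = content.replace(" ", "")
    # validate=True makes the decoder reject non-alphabet characters;
    # the default (validate=False) silently discards them instead.
    return base64.b64decode(content, validate=True)


# A wrapped binary value, as it might appear in an RFC example.
folded = [":aGVsbG8g", "    d29ybGQ=:"]
value = unfold(folded)  # the fold becomes one SP inside the base64 text
decoded = parse_byte_sequence(value, allow_sp=True)
```

With allow_sp=False the same unfolded value is rejected, because the SP left behind by the fold is not in the base64 alphabet. A parser that calls b64decode with its default validate=False would accept it either way, which is the "just feed the SP into the base64 parser" behaviour mentioned above.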
Received on Wednesday, 28 October 2020 10:13:02 UTC
