
Re: Structured Fields: whitespace in binary content

From: Ian Clelland <iclelland@google.com>
Date: Wed, 28 Oct 2020 08:35:44 -0400
Message-ID: <CAK_TSX+_n4ViPjXJ+t94MzdzOgEB7LOdDChxUHDFwz7CKc1Csg@mail.gmail.com>
To: Cory Benfield <cory@lukasa.co.uk>
Cc: Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>, Poul-Henning Kamp <phk@varnish-cache.org>
On Wed, Oct 28, 2020 at 6:17 AM Cory Benfield <cory@lukasa.co.uk> wrote:

> I don't love this change. It's a frustrating additional complexity in
> the parser, and it seems unfortunate that we're adding it for reasons
> that have nothing to do with whether we want it and everything to do
> with limitations of the format we publish the spec in. I think we
> shouldn't do it because if we wrote down in the RFC _why_ SP is
> allowed, we'd feel fairly silly.
>
> On Wed, 28 Oct 2020 at 08:49, Mark Nottingham <mnot@mnot.net> wrote:
> >
> > Here's the diff:
> >   https://github.com/httpwg/http-extensions/pull/1322/files
> >
> > There are other ways to do this, of course, and I suspect that many
> > implementations will just feed the SP into their base64 parser, if
> > they're confident it will handle it well.
>

I just checked, and Chromium's base64 implementation (
https://source.chromium.org/chromium/chromium/src/+/master:third_party/modp_b64/modp_b64.cc;
originally
http://web.archive.org/web/20060617172031/http://modp.com/release/base64/)
does not accept whitespace in base64-encoded data. Producing or switching
to a decoder that does is not necessarily a huge engineering task, but it
is not trivial either. I'm not as familiar with the Gecko source code, but
I did find at least one decoder in that source tree with the same
limitation.
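
To illustrate the distinction, here is a minimal Python sketch (not
Chromium's modp_b64 code, and not text from the PR). It assumes a strict
decoder behaves like Python's base64.b64decode(..., validate=True), which
rejects any character outside the base64 alphabet. decode_allowing_sp is a
hypothetical helper showing the extra step a parser would need if SP were
allowed inside binary content.

```python
import base64
import binascii


def decode_strict(data: str) -> bytes:
    """Decode base64, rejecting any non-alphabet character (including SP)."""
    return base64.b64decode(data, validate=True)


def decode_allowing_sp(data: str) -> bytes:
    """Hypothetical helper: drop SP characters, then decode strictly."""
    return decode_strict(data.replace(" ", ""))


# "hello world" in base64, with one SP left behind by line unfolding.
wrapped = "aGVsbG8g d29ybGQ="

try:
    decode_strict(wrapped)
    strict_ok = True
except binascii.Error:
    # A strict decoder treats the embedded SP as invalid input.
    strict_ok = False

lenient = decode_allowing_sp(wrapped)
```

Python's default mode (validate=False) silently discards non-alphabet
characters, which is the lenient behaviour Mark describes. A strict decoder
needs the explicit stripping step or a replacement.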


> >
> > Cheers,
> >
> >
> > > On 28 Oct 2020, at 7:23 pm, Mark Nottingham <mnot@mnot.net> wrote:
> > >
> > > Structured Fields is in AUTH48 and we've addressed everything that's
> > > come up except for one very late entrant. I know this is very last
> > > minute, but I'm becoming convinced that this is something we should
> > > consider changing before shipping.
> > >
> > > Background: I've written a script that validates HTTP messages in RFC
> > > XML, including Structured Fields. See:
> > >  https://pypi.org/project/rfc-http-validate/
> > >
> > > Applying this to our current drafts, I encountered a problem; if a
> > > header field contains binary data, it's extremely likely that it will
> > > need to wrap across multiple lines to fit into the RFC. As a reminder,
> > > such folded lines are required by HTTP to be replaced by one or more
> > > spaces in
> > > <https://httpwg.org/http-core/draft-ietf-httpbis-semantics-latest.html#field-values>.
> > >
> > > For example, here is the PR for the Signature draft:
> > >  https://github.com/httpwg/http-extensions/pull/1319
> > >
> > > At first I thought this could be addressed by an editorial note
> > > explaining that whitespace folding is different in examples. However,
> > > things like this make that unworkable:
> > >
> > > ~~~ http-message
> > > Signature-Input: sig1=(*request-target *created host date
> > >     cache-control x-empty-header x-example); keyid="test-key-a";
> > >     alg=hs2019; created=1402170695; expires=1402170995
> > > Signature: sig1=:K2qGT5srn2OGbOIDzQ6kYT+ruaycnDAAUpKv+ePFfD0RAxn/1BUe
> > >     Zx/Kdrq32DrfakQ6bPsvB9aqZqognNT6be4olHROIkeV879RrsrObury8L9SCEibe
> > >    oHyqU/yCjphSmEdd7WD+zrchK57quskKwRefy2iEC5S2uAH0EPyOZKWlvbKmKu5q4
> > >    CaB8X/I5/+HLZLGvDiezqi6/7p2Gngf5hwZ0lSdy39vyNMaaAT0tKo6nuVw0S1MVg
> > >    1Q7MpWYZs0soHjttq0uLIA3DIbQfLiIvK6/l0BdWTU7+2uQj7lBkQAsFZHoA96ZZg
> > >    FquQrXRlmYOh+Hx5D9fJkXcXe5tmAg==:
> > > ~~~
> > >
> > > As you can see, whitespace in folding is semantically significant in
> > > Signature-Input (if it's lost, delimitation will be lost too), whereas
> > > it needs to be removed for Signature to contain valid binary content.
> > >
> > > So, the obvious fix is to allow whitespace inside binary content.
> > > Delimitation won't be lost, because it's ":" on both ends. The base64
> > > parsers I checked already swallow whitespace in input (not surprising
> > > since the motivating use case for base64 was line-wrapped MIME).
> > >
> > > The question is whether it's too late to do this. Personally I think
> > > it's worth it; otherwise we're going to have some pretty confusing
> > > specs, and that's likely to lead to problems. Also, the delta to the
> > > spec and implementations is very small. Also, if there's some
> > > implementation lag I think that's workable, because this is less likely
> > > to be seen on the wire, and there aren't too many adopters of binary
> > > content yet.
> > >
> > > What do folks think? I'll start a PR to show what it'd be like, but I
> > > wanted to get early impressions ASAP.
> > >
> > > Thanks (and sorry for not seeing this earlier),
> > >
> > > --
> > > Mark Nottingham   https://www.mnot.net/
> > >
> > >
> >
> > --
> > Mark Nottingham   https://www.mnot.net/
> >
> >
>
>
Received on Wednesday, 28 October 2020 12:36:12 UTC
