Re: LWS around header names from Alex Rousskov on 2004-03-16 (ietf-http-wg@w3.org from January to March 2004)

From: Alex Rousskov <rousskov@measurement-factory.com>
Date: Mon, 15 Mar 2004 17:33:40 -0700 (MST)
To: Jamie Lokier <jamie@shareable.org>
Cc: ietf-http-wg@w3.org
Message-ID: <Pine.BSF.4.58.0403151638050.17944@measurement-factory.com>

On Mon, 15 Mar 2004, Jamie Lokier wrote:

> Unfortunately, fixing it has introduced new, albeit remote, security
> implications. ...  The new Apache as origin server will treat it as
> a proper Authorization header, and may send a response that is
> inappropriately cached, unknown to the origin server.

Oof! And my old code may even be responsible for the Squid parsing
bug! :-/

> You see?  It's fixed a bug, and removed one obscure security
> implication, but replaced it with a new one.

There is no "new" bug if you consider that there are origin servers
other than Apache out there and that some of them might be compliant
in this parsing aspect.

It looks like, to avoid compatibility problems with broken proxies, an
Apache origin server should not authenticate based on a valid but
"difficult to parse" Authorization header.

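As an illustration of that suggestion, here is a minimal Python sketch
of an origin server that declines to authenticate when the
Authorization header has SP or HT between the field-name and the
colon. The names parse_header and credentials_for_auth are
hypothetical, the input is assumed to be already-unfolded header lines
without CRLF, and nothing here reflects how Apache actually does it.

```python
def parse_header(line):
    """Split one unfolded header line into (name, value, lws_before_colon)."""
    raw_name, sep, value = line.partition(":")
    if not sep:
        raise ValueError("no colon in header line")
    # Whitespace between the field-name and the colon is the
    # "difficult to parse" case discussed in this thread.
    name = raw_name.rstrip(" \t")
    return name, value.strip(" \t"), name != raw_name


def credentials_for_auth(lines):
    """Return the Authorization value, or None if absent or difficult to parse."""
    for line in lines:
        name, value, difficult = parse_header(line)
        if name.lower() == "authorization":
            # Assumed policy: treat the request as unauthenticated rather
            # than trust a header that a broken proxy may not have seen.
            return None if difficult else value
    return None
```

With this policy, "Authorization: Basic abc" yields the credentials,
while "Authorization : Basic abc" is treated as if no credentials were
sent.
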
> Why would the text explicitly mention LWS after the colon but
> nothing about LWS before it

Because that is the way RFC 2616 and most other RFCs are written:
Prose text comments on typical use cases or known compatibility
issues. Formal grammars are supposed to cover everything. This is not
how most RFCs are read and implemented, unfortunately.

> I think this is an example of the text "noting otherwise", even
> though it does not explicitly say that it is noting otherwise.

Please remember this if you happen to write an RFC :-)

> So, I'm saying the standard is ambiguous at that point -- either
> reading is possible, and a clarification would be good.

Agreed: A clarification would be good. Please submit an erratum. A
short erratum may help to engage HTTP authors/gurus who are most
likely ignoring these long messages.

> > >     2. What about LWS before the field-name?
> >
> > Do you mean SP or HT before the field-name? CRLF before the field-name
> > would indicate the end of headers (the field-name would be a part of
> > the body then).
>
> Yes.

IMO, SP or HT before non-first header is line folding and should be
interpreted as such. SP or HT before the first header is a malformed
message that should be rejected.
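
A rough Python sketch of that interpretation, assuming the header block
has already been split into lines without their CRLF and without the
terminating empty line. The function name unfold_headers is
hypothetical. Replacing the fold with a single SP relies on the RFC
2616 allowance that a recipient may replace linear white space with a
single SP.

```python
def unfold_headers(raw_lines):
    """Join continuation lines (leading SP or HT) onto the previous header."""
    headers = []
    for line in raw_lines:
        if line[:1] in (" ", "\t"):
            if not headers:
                # SP or HT before the first header: malformed, reject.
                raise ValueError("whitespace before first header")
            # SP or HT before a non-first header: line folding.
            headers[-1] += " " + line.lstrip(" \t")
        else:
            headers.append(line)
    return headers
```

For example, the lines "X-Long: a" and "\tb" become the single header
"X-Long: a b", while a message whose first header line starts with SP
raises an error.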

> Both of these behaviours are bugs, but worse than that: they're both
> security holes.  The same kind of hole which motivated your patch to
> Apache, but through a slightly different route.

Indeed!

> Why?  Because passing them along, in either direction, enables the
> exact remote security exploits which motivated the patch to Apache
> to allow LWS before the colon.

I think I agree. At least, I am glad that there is a security issue
(albeit a remote one) that illustrates why implementing specs by
looking at examples is dangerous.

> Is it not better to reject messages which are clearly out of spec?

In an ideal world, yes. The common interpretation of the IETF
philosophy (convert garbage input into compliant output) and the
related real-world demands do not allow that kind of purity, at least
not for HTTP.

> I think it's reasonable for the RFC to suggest implementation SHOULD
> reject such headers,

I am sure that new restrictions will break some old implementations.
HTTP has been around for too long. Since the real issue here is
security, perhaps the erratum should simply explain why using
Authorization header-based authentication is a bad idea, regardless of
the header value. We now have SSL/TLS instead (with their own security
flaws :).

Alex.

Received on Monday, 15 March 2004 19:33:43 UTC