Re: CRLF requirement from Zhong Yu on 2014-07-01 (ietf-http-wg@w3.org from July to September 2014)

From: Zhong Yu <zhong.j.yu@gmail.com>
Date: Tue, 1 Jul 2014 09:36:01 -0500
To: Willy Tarreau <w@1wt.eu>
Cc: Anne van Kesteren <annevk@annevk.nl>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CACuKZqHRZLdKsQLpDiLLC_qqrW2wpJFkC4yDdZTXB0Gf=E9scg@mail.gmail.com>

On Mon, Jun 30, 2014 at 6:35 AM, Willy Tarreau <w@1wt.eu> wrote:
> Hi Anne,
>
> On Mon, Jun 30, 2014 at 01:23:50PM +0200, Anne van Kesteren wrote:
>> Why does https://tools.ietf.org/html/rfc7230 still require CRLF while
>> most implementations handle either CR or LF fine? As far as I can tell
>> line parsing in HTTP is interoperable with text/plain, text/html,
>> text/css, etc. It's newline = CR / LF / CRLF.
>
> Please take a look at 3.5. "Message Parsing Robustness" :
>
>   "Although the line terminator for the start-line and header fields is
>    the sequence CRLF, a recipient MAY recognize a single LF as a line
>    terminator and ignore any preceding CR."

The corresponding text in RFC 2616, section 19.3, was
http://tools.ietf.org/html/rfc2616#section-19.3
> we recommend that applications, when parsing such headers, recognize a single LF as a line terminator and ignore **the** leading CR.

The change from [CR] LF to *CR LF is dangerous, IMO. Previously,
CRCRLF had to be rejected; now, in RFC 7230, it MAY be accepted as one
line break. (And then Anne wanted it to be accepted as two line
breaks.)
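
To make the difference concrete, here is a minimal Python sketch of the
three readings applied to the same bytes. The grammar names, the
split_lines helper, and the sample header lines are illustrative
assumptions, and the regular expressions are only one way to express the
three rules discussed above; none of this is text from either RFC.

```python
import re

# Hypothetical names for the three line-terminator rules in this thread.
TERMINATORS = {
    "opt_cr_lf": re.compile(rb"\r?\n"),            # [CR] LF
    "any_cr_lf": re.compile(rb"\r*\n"),            # *CR LF
    "cr_or_lf_or_crlf": re.compile(rb"\r\n|\r|\n"),  # CR / LF / CRLF
}


def split_lines(data, grammar):
    """Split raw bytes into lines using the named terminator rule."""
    return TERMINATORS[grammar].split(data)


# Two made-up header lines separated by CR CR LF.
SAMPLE = b"A: 1\r\r\nB: 2"

if __name__ == "__main__":
    for name in TERMINATORS:
        print(name, split_lines(SAMPLE, name))
    # opt_cr_lf        -> [b'A: 1\r', b'B: 2']     stray CR left in the line
    # any_cr_lf        -> [b'A: 1', b'B: 2']       one line break
    # cr_or_lf_or_crlf -> [b'A: 1', b'', b'B: 2']  two line breaks
```

Under the first rule the stray CR stays inside the field line, which a
strict parser would reject. Under the third rule an empty line appears,
and since an empty line ends the header section, "B: 2" would be read as
the start of the body.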

It's always troubling if different agents MAY interpret the same
message differently. We cannot add any more leniency to parsing. If
any new software generates anything but CRLF in 2014, it's just sloppy
programming.

Zhong Yu
bayou.io

Received on Tuesday, 1 July 2014 14:36:28 UTC