- From: Jamie Lokier <jamie@shareable.org>
- Date: Wed, 23 Jun 2004 19:49:34 +0100
- To: Robert Olofsson <robo@khelekore.org>
- Cc: ietf-http-wg@w3.org
Robert Olofsson wrote:
> Sort of, many servers are broken and output:
> Content-Type: text/html\n\n<resource data>
>
> Instead of the correct one:
> Content-Type: text/html\r\n\r\n<resource data>
>
> There are _many_ perl scripts (and other) that do this.
> So being strict for CRLF parsing is a bad idea.

That shouldn't be possible if the Perl script is run through a decent CGI-supporting HTTP server -- as the HTTP server should canonicalise the line endings. However, obviously many are not, not to mention the Perl scripts which are servers in and of themselves.

Fortunately, the HTTP standards warn of this and suggest just looking for \n and skipping a preceding \r.

That's what I'd hope for: a document which contains all of the well-known suggestions for contemporary interoperation.

For example, I have read that Mozilla, IE and Netscape browsers will all accept \r\n\r\r\n as the "blank line" which ends headers -- and that they must do so. I'm not sure if this is true; I have simply read it. It's not hard to imagine what kind of code generates \r\n\r\r\n.

I read the header parsing of Mozilla, and it accepts headers with no colon in the line -- effectively it ignores them (though it actually stores them verbatim). I presume that's because of buggy IIS servers which, according to the code in Squid, occasionally send "200 Blah Blah" lines in the middle of the headers.

See how much exciting historical information is hidden in these code bases!

I worry about little things. Some clients/servers/proxies accept lines with embedded \r's (not followed by \n); some treat those as line endings, others do not, and yet others reject such lines.

Some skip spaces before the header name, others do not, yet others correctly treat them as continuation lines. Some treat \r as a space for this, others do not.

Some skip spaces after the header name, others do not. Some that do skip spaces after the header name treat \r as a non-space in that place but treat \r as a space in other parts of the same line; some others treat \r as a space everywhere.

Some treat a first header line which begins with a space as an error, others as a header beginning with a space, even though they concatenate later such lines as continuations.

Some treat a line containing only spaces as the "blank line" separator, others do not.

Think of the delightful security possibilities, knowing that some proxies see some headers and other proxies/clients/servers see other headers in the same text!

And that's just basic isolating of headers. We haven't even touched on the _values_ of headers.

-- Jamie
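
To make the lenient behaviours described above concrete, here is a minimal sketch in Python. It is not taken from Mozilla, Squid, or any other code base mentioned in the message; the function names and the `strip_all_cr` switch are hypothetical, chosen only to illustrate how two tolerant parsers can disagree about where the headers end.

```python
from typing import List, Tuple


def split_header_lines(raw: bytes, strip_all_cr: bool = False) -> List[bytes]:
    """Split on LF and drop a preceding CR (or, optionally, every trailing CR)."""
    lines = []
    for line in raw.split(b"\n"):
        if strip_all_cr:
            line = line.rstrip(b"\r")   # assumption: treats \r\r\n like \r\n
        elif line.endswith(b"\r"):
            line = line[:-1]            # skip exactly one preceding CR
        lines.append(line)
    return lines


def parse_headers(raw: bytes, strip_all_cr: bool = False) -> List[Tuple[bytes, bytes]]:
    """Return (name, value) pairs up to the first blank line."""
    headers: List[Tuple[bytes, bytes]] = []
    for line in split_header_lines(raw, strip_all_cr):
        if line == b"":
            break                       # blank line ends the headers
        if line[:1] in (b" ", b"\t"):
            # Leading whitespace: treat as a continuation of the previous header.
            extra = line.strip(b" \t")
            if headers and extra:
                name, value = headers[-1]
                headers[-1] = (name, value + b" " + extra)
            continue
        name, sep, value = line.partition(b":")
        if not sep:
            continue                    # no colon (e.g. a stray status line): ignore
        headers.append((name.strip(b" \t"), value.strip(b" \t")))
    return headers


raw = b"A: b\r\n\r\r\nX: y\r\n\r\n"
print(parse_headers(raw))                     # [(b'A', b'b'), (b'X', b'y')]
print(parse_headers(raw, strip_all_cr=True))  # [(b'A', b'b')]
```

The two calls at the end parse the same bytes and return different header sets: one parser treats \r\n\r\r\n as the end of the headers and the other does not. That is exactly the kind of disagreement between hops that the message warns about.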
Received on Wednesday, 23 June 2004 14:49:36 UTC