Re: Implied LWS questions

On 6 Jun 2008, at 15:18, Julian Reschke wrote:

> "The version of an HTTP message is indicated by an HTTP-Version  
> field in the first line of the message. HTTP-Version is case- 
> sensitive.
>
>  HTTP-Version   = "HTTP" "/" 1*DIGIT "." 1*DIGIT"
>
> So, do "HTTP" and "/" qualify as instances of quoted-string?
>
> What about 1*DIGIT? That's definitively not a quoted string, but it  
> could be parsed as token.
>
> So, after applying the implied LWS rule, what is the ABNF for HTTP- 
> Version?
>
> 1) HTTP-Version = "HTTP" *LWS "/" *LWS 1*DIGIT *LWS "." *LWS 1*DIGIT
>
> or
>
> 2) HTTP-Version = "HTTP" *LWS "/" 1*DIGIT "." 1*DIGIT

Going strictly by RFC 2616, I'd say Dan's 3).

However, in the real world everything is rather crazy (this was tested
with the comparatively sane 2), using a single space character). Saf/Mac
treats it as an HTTP/0.9 response (i.e., all the data sent by the
server is the response). Firefox goes crazy (the status code is given
as 200, and the status text is "200 OK"), but the rest is as it would
be without any *LWS (it seems to just split on the first two SP
characters). Opera copes with a single SP there per 1) or 2). IE is
even crazier, with a status code of -9 and a status text of "200 OK",
though everything after the response line is fine. Apache seems to
cope fine. HTTP.sys (in IIS/7.0) returns 400 Bad Request. I expect
things get even madder once you use CRLF SP as that *LWS (such as
stuff being treated as headers and the like), or even just multiple
spaces.
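
To make the inputs concrete, here is a rough Python sketch of the two
quoted candidate grammars as regular expressions, next to a form with
no implied LWS at all. The names ANY_LWS, GRAMMARS and accepted_by are
made up for illustration, and the LWS definition is the one from RFC
2616 section 2.2; none of this is taken from any browser or server.

```python
import re

# *LWS, where RFC 2616 section 2.2 defines LWS = [CRLF] 1*( SP | HT ).
# Written as "any run of SP/HT, each optionally preceded by CRLF", which
# matches the same strings without nested quantifiers.
ANY_LWS = r"(?:(?:\r\n)?[ \t])*"

# Hypothetical names; the keys refer to the numbered candidates quoted above.
GRAMMARS = {
    # 1) *LWS allowed between every element of HTTP-Version.
    "1)": re.compile(
        rf"HTTP{ANY_LWS}/{ANY_LWS}[0-9]+{ANY_LWS}\.{ANY_LWS}[0-9]+"
    ),
    # 2) *LWS allowed only between "HTTP" and "/".
    "2)": re.compile(rf"HTTP{ANY_LWS}/[0-9]+\.[0-9]+"),
    # No implied LWS anywhere.
    "no LWS": re.compile(r"HTTP/[0-9]+\.[0-9]+"),
}


def accepted_by(version: str) -> list[str]:
    """Return the names of the grammars that accept the whole string."""
    return [
        name for name, pattern in GRAMMARS.items() if pattern.fullmatch(version)
    ]


print(accepted_by("HTTP /1.1"))   # single SP after "HTTP"
print(accepted_by("HTTP/1 .1"))   # SP inside the version number
print(accepted_by("HTTP/1.1"))    # the usual form
```

A single SP between "HTTP" and "/" is accepted by both 1) and 2) but
not by the form without implied LWS, so it is the mildest input that
still exercises the rule.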

I think it's safe to say that no specific behaviour is needed in the
real world, which is why I'd lean towards following Saf/Mac and
HTTP.sys: either outright rejecting it as invalid, or falling back to
treating it as HTTP/0.9 (this is, FWIW, what I do in my
ultra-early-draft HTTP parsing spec).
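
As a minimal sketch of those two options (and not the text of the
draft spec mentioned above), a parser could accept only the exact
Status-Line shape from RFC 2616 section 6.1 and then either reject
anything else or treat it as HTTP/0.9. The names StatusLine,
parse_status_line and fallback_to_09 are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
# (RFC 2616 section 6.1), with no implied LWS inside HTTP-Version.
STATUS_LINE = re.compile(
    rb"HTTP/([0-9]+)\.([0-9]+) ([0-9]{3}) ([^\r\n]*)\r\n"
)


class StatusLine(NamedTuple):
    major: int
    minor: int
    status: int
    reason: str


def parse_status_line(
    data: bytes, fallback_to_09: bool = False
) -> Optional[StatusLine]:
    """Parse the start of a response.

    Returns a StatusLine on an exact match. Otherwise either returns None,
    meaning "treat all of data as an HTTP/0.9 response body" (when
    fallback_to_09 is set), or raises ValueError to reject the response.
    """
    match = STATUS_LINE.match(data)
    if match:
        major, minor, status, reason = match.groups()
        return StatusLine(
            int(major), int(minor), int(status), reason.decode("latin-1")
        )
    if fallback_to_09:
        return None
    raise ValueError("malformed Status-Line")


print(parse_status_line(b"HTTP/1.1 200 OK\r\n\r\nbody"))
print(parse_status_line(b"HTTP /1.1 200 OK\r\n\r\nbody", fallback_to_09=True))
```

With fallback_to_09 left off, the second input raises ValueError
instead, which is the "outright rejecting" choice.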

As for http-url, I'd say implied LWS isn't allowed anywhere in it.


--
Geoffrey Sneddon
<http://gsnedders.com/>

Received on Friday, 6 June 2008 17:08:56 UTC