Re: Implied LWS questions

Julian Reschke wrote:
>   HTTP-Version   = "HTTP" "/" 1*DIGIT "." 1*DIGIT
> 
> So, do "HTTP" and "/" qualify as instances of quoted-string?

I think the grammar is eating your brain :-). "quoted-string" means that
the *result* is quoted, not the *rule*. The strings generated by those
rules don't have quotes around them, so no, they're not quoted strings.
But "HTTP" is a token, and "/" is a separator, so yes, *LWS is allowed
between them.

> So, after applying the implied LWS rule, what is the ABNF for HTTP-Version?
> 
> 1) HTTP-Version = "HTTP" *LWS "/" *LWS 1*DIGIT *LWS "." *LWS 1*DIGIT
> 
> or
> 
> 2) HTTP-Version = "HTTP" *LWS "/" 1*DIGIT "." 1*DIGIT

I would say

  3) HTTP-Version = "HTTP" *LWS "/" *LWS 1*DIGIT "." 1*DIGIT

Because "/" is a separator and "HTTP" and 1*DIGIT are tokens, but "." is
neither a separator nor a token.
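
To make reading 3 concrete, here is a minimal Python sketch of it as a regular expression. The names OPT_LWS, HTTP_VERSION_3 and matches_reading_3 are made up for illustration, and this only encodes my reading above, not anything RFC 2616 states outright.

```python
import re

# *LWS, where LWS = [CRLF] 1*( SP | HT ). Written as a repetition of
# "optional CRLF followed by one SP/HT", which accepts the same strings
# as *LWS without nesting one quantifier inside another.
OPT_LWS = r"(?:(?:\r\n)?[ \t])*"

# Reading 3: implied *LWS on both sides of the "/" separator,
# none on either side of ".".
HTTP_VERSION_3 = re.compile(rf"HTTP{OPT_LWS}/{OPT_LWS}[0-9]+\.[0-9]+")


def matches_reading_3(s: str) -> bool:
    """True if s is an HTTP-Version under reading 3."""
    return HTTP_VERSION_3.fullmatch(s) is not None


print(matches_reading_3("HTTP/1.1"))      # True
print(matches_reading_3("HTTP / 1.1"))    # True
print(matches_reading_3("HTTP/1 . 1"))    # False
```

Under reading 1 the last string would also be accepted; under reading 2 the second would be rejected.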

> So, does http-URL allow *LWS anywhere? It's certainly not supposed to,
> but I think the rules allow it between "http:" and "//".

As I read it, you can't put *LWS between "http:" and "//", because
neither of those strings is a token, a quoted string, or a separator.
(I'm assuming you're not allowed to treat the string "http:" as though
it were the same as "http" ":".) However, you could have *LWS around the
":" separating host and port, or the "?" separating path and query.

But I'd totally believe you if you claimed that RFC 2616 never meant for
implied *LWS to apply inside rules imported from other specs.


Request-Line has the same problem as http-URL with *LWS around "?".
Given that there are already rules in the running text stating that you
can't have CR or LF in the Request-Line or Status-Line (meaning that any
implied *LWS appearing there is not really LWS anyway...), we should
probably just say that the implied *LWS rule was also not meant to apply
to the Request-Line and Status-Line. (This is already implied by the
"Tolerant Applications" appendix; there would be no need to say that
apps "SHOULD accept any amount of SP or HTAB characters between fields"
if implied *LWS was in effect there.)
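
For what it's worth, the tolerance that appendix asks for is easy to sketch without any implied *LWS at all. The function name split_request_line is hypothetical, and I'm assuming the trailing CRLF has already been stripped from the line.

```python
import re


def split_request_line(line: str):
    """Split a Request-Line into (Method, Request-URI, HTTP-Version).

    Accepts any run of SP or HT between fields, but never CR or LF,
    so this is plain whitespace tolerance rather than LWS.
    """
    parts = re.split(r"[ \t]+", line)
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected Method, Request-URI and HTTP-Version")
    return tuple(parts)


print(split_request_line("GET  /index.html\tHTTP/1.1"))
# ('GET', '/index.html', 'HTTP/1.1')
```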


Another major oddity of implied *LWS is this:

    chunk          = chunk-size [ chunk-extension ] CRLF
                     chunk-data CRLF
    chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] )

Since chunk-size can be parsed as a token, and chunk-extension starts
with a separator, you can in theory put *LWS between them. Which makes
the parsing ambiguous because there's no way to distinguish a
chunk-extension preceded by CR LF SP from a chunk-data that just happens
to look like a chunk-extension.
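
Here is a small Python sketch of that ambiguity. The names STRICT, LENIENT and first_chunk are invented for this example, and the chunk-extension pattern is deliberately simplified to ";" followed by anything up to the next CRLF. The same bytes parse cleanly both ways, with different chunk-data.

```python
import re

# Chunk header with no implied LWS: chunk-size [ chunk-extension ] CRLF
STRICT = re.compile(rb"([0-9A-Fa-f]+)((?:;[^\r\n]*)?)\r\n")

# Same header, but allowing implied *LWS between chunk-size and the ";"
# separator. LWS may itself begin with CRLF, which is the problem.
LENIENT = re.compile(
    rb"([0-9A-Fa-f]+)(?:(?:\r\n)?[ \t])*((?:;[^\r\n]*)?)\r\n"
)


def first_chunk(buf: bytes, implied_lws: bool):
    """Return (chunk_extension, chunk_data) for the first chunk in buf."""
    m = (LENIENT if implied_lws else STRICT).match(buf)
    if m is None:
        raise ValueError("no chunk header found")
    size = int(m.group(1), 16)
    start = m.end()
    data = buf[start:start + size]
    # chunk-data must be followed by CRLF for the chunk to be well formed.
    if buf[start + size:start + size + 2] != b"\r\n":
        raise ValueError("chunk-data not followed by CRLF")
    return m.group(2), data


wire = b"5\r\n ;x=y\r\nhello\r\n"
print(first_chunk(wire, implied_lws=False))  # (b'', b' ;x=y')
print(first_chunk(wire, implied_lws=True))   # (b';x=y', b'hello')
```

Both calls succeed, so nothing in the bytes themselves tells a parser which reading the sender intended.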

-- Dan

Received on Friday, 6 June 2008 16:19:06 UTC