Re: Implied LWS questions

From: Henrik Nordstrom <henrik@henriknordstrom.net>
Date: Sun, 08 Jun 2008 11:59:59 +0200
To: Dan Winship <dan.winship@gmail.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <1212919199.22518.14.camel@henriknordstrom.net>

On Fri, 2008-06-06 at 12:17 -0400, Dan Winship wrote:
> Julian Reschke wrote:
> >   HTTP-Version   = "HTTP" "/" 1*DIGIT "." 1*DIGIT
> > 
> > So, do "HTTP" and "/" qualify as instances of quoted-string?
> 
> I think the grammar is eating your brain :-). "quoted-string" means that
> the *result* is quoted, not the *rule*. The strings generated by those
> rules don't have quotes around them, so no, they're not quoted strings.
> But "HTTP" is a token, and "/" is a separator, so yes, *LWS is allowed
> between them.

But it is also a typical case where LWS SHOULD NOT be allowed, as many
implementations will fail badly if there is LWS here.
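
To illustrate what "no LWS here" means for a parser, here is a minimal
sketch in Python. The name parse_http_version is made up for this
example and is not taken from any existing implementation; it accepts
the HTTP-Version production exactly as written, with no whitespace
between the elements.

```python
import re

# Hypothetical helper, for illustration only.
# Matches "HTTP" "/" 1*DIGIT "." 1*DIGIT with no LWS between elements.
# [0-9] is used instead of \d so that only ASCII digits are accepted.
_HTTP_VERSION = re.compile(r"HTTP/([0-9]+)\.([0-9]+)")


def parse_http_version(text):
    """Return (major, minor), or None if text is not a strict HTTP-Version."""
    m = _HTTP_VERSION.fullmatch(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))
```

With this, parse_http_version("HTTP/1.1") yields (1, 1), while
"HTTP / 1.1" and "HTTP/1. 1" are both rejected.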

> As I read it, you can't put *LWS between "http:" and "//", because
> neither of those strings is a token, a quoted string, or a separator.
> (I'm assuming you're not allowed to treat the string "http:" as though
> it was the same as "http" ":".) However, you could have *LWS around the
> ":" separating host and port, or the "?" separating path and query.

No. Space is explicitly disallowed in HTTP URLs, and LWS is parsed as a
single space, not "nothing".

There are special rules for URLs in textual contexts such as email
bodies, where LWS (including folding) is ignored. But this does not
apply to HTTP.

> Another major oddity of implied *LWS is this:
> 
>     chunk          = chunk-size [ chunk-extension ] CRLF
>                      chunk-data CRLF
>     chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
>
> Since chunk-size can be parsed as a token, and chunk-extension starts
> with a separator, you can in theory put *LWS between them. Which makes
> the parsing ambiguous because there's no way to distinguish a
> chunk-extension preceded by CR LF SP from a chunk-data that just happens
> to look like a chunk-extension.

Ouch..

This is a typical case where we need to explicitly forbid implied LWS,
and at most allow *(SP|HT) around the ; and chunk-size.

    chunk          = *(SP|HT) chunk-size *(SP|HT) [ chunk-extension ] CRLF
                     chunk-data CRLF
    chunk-extension= *( ";" *(SP|HT) chunk-ext-name [ "=" chunk-ext-val ] *(SP|HT))

I am sure there are better productions for this...
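
A rough sketch of a parser for the chunk header line under the
productions above, in Python. Everything here is an assumption made for
illustration: the name parse_chunk_header, the choice to take the line
with its trailing CRLF already removed, and the simplified
quoted-string pattern. The point is only that with *(SP|HT) in place of
implied LWS, a CR or LF can never appear inside the header line, so the
ambiguity with chunk-data goes away.

```python
import re

WS = r"[ \t]*"                                # *(SP|HT), no line folding
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"       # token: no CTLs or separators
QUOTED = r'"(?:[^"\\]|\\.)*"'                 # simplified quoted-string

_SIZE = re.compile(WS + r"([0-9A-Fa-f]+)" + WS)
_EXT = re.compile(
    r";" + WS + "(" + TOKEN + ")(?:=(" + TOKEN + "|" + QUOTED + "))?" + WS
)


def parse_chunk_header(line):
    """Parse a chunk header line (without its CRLF).

    Returns (size, [(ext_name, ext_val_or_None), ...]).
    Raises ValueError on anything the productions do not allow.
    """
    m = _SIZE.match(line)
    if m is None:
        raise ValueError("missing chunk-size")
    size = int(m.group(1), 16)
    pos = m.end()
    extensions = []
    while pos < len(line):
        m = _EXT.match(line, pos)
        if m is None:
            raise ValueError("bad chunk-extension at offset %d" % pos)
        extensions.append((m.group(1), m.group(2)))
        pos = m.end()
    return size, extensions
```

For example, parse_chunk_header("1a ; name=val") gives
(26, [("name", "val")]), while a line containing CR LF SP before the
";" raises ValueError instead of being treated as folded whitespace.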


Regards
Henrik
Received on Sunday, 8 June 2008 10:00:41 GMT
