Re: #173: CR and LF in chunk extension values

From: Mark Nottingham <mnot@mnot.net>
Date: Mon, 24 Aug 2009 14:45:40 +1000
Cc: Henrik Nordstrom <henrik@henriknordstrom.net>, Bjoern Hoehrmann <derhoermi@gmx.net>
Message-Id: <09372E1C-07B4-4DEF-9C35-6AEE0A3F9DA1@mnot.net>
To: HTTP Working Group <ietf-http-wg@w3.org>

That leaves us at:

1) Replace OWS in qdtext with space and tab, and
2) Remove obs-text from qdtext, and
3) Restrict quoted-text to VCHAR.

Milestone assigned for -08; barring any other discussion, we'll see  
what the editors come up with in that revision.
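
For illustration only, here is a minimal Python sketch of how a receiver
might check a quoted-string if all three changes above were adopted. The
names (QDTEXT, QUOTED_PAIR, is_valid_quoted_string) are hypothetical, and
the character classes are one reading of the proposal, not the ABNF the
editors will actually produce.

```python
import re

# Assumed reading of the three proposed changes (not final ABNF):
#   qdtext      = SP / HTAB / %x21 / %x23-5B / %x5D-7E   (no OWS, no obs-text)
#   quoted-text = VCHAR (%x21-7E)
#   quoted-pair = "\" quoted-text
QDTEXT = r"[ \t\x21\x23-\x5B\x5D-\x7E]"
QUOTED_PAIR = r"\\[\x21-\x7E]"
QUOTED_STRING = re.compile(r'"(?:%s|%s)*"' % (QDTEXT, QUOTED_PAIR))


def is_valid_quoted_string(s: str) -> bool:
    """Return True if s is a complete quoted-string under the assumed grammar."""
    return QUOTED_STRING.fullmatch(s) is not None
```

Under this reading, a bare or backslash-escaped CR or LF is rejected, as is
any octet in %x80-FF. Note that "\" followed by a space is also rejected,
because VCHAR does not include SP; whether that is intended is up to the
editors.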


On 12/08/2009, at 4:43 PM, Mark Nottingham wrote:

> Right now, it's defined as:
>
>> A string of text is parsed as a single word if it is quoted using
>> double-quote marks.
>>
>>  quoted-string  = DQUOTE *( qdtext / quoted-pair ) DQUOTE
>>  qdtext         = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
>>                 ; OWS / <VCHAR except DQUOTE and "\"> / obs-text
>>  obs-text       = %x80-FF
>>
>> The backslash character ("\") MAY be used as a single-character
>> quoting mechanism only within quoted-string and comment constructs.
>>
>>  quoted-text    = %x01-09 /
>>                   %x0B-0C /
>>                   %x0E-FF ; Characters excluding NUL, CR and LF
>>  quoted-pair    = "\" quoted-text
>
> So it seems like we need to:
>
> 1) Consider removing OWS from qdtext, replacing it with space and  
> tab only. While we could use BWS here, receivers are required to  
> accept it, which I don't think is the desired effect. And,
>
> 2) Consider removing obs-text from qdtext, as it's a hole that a  
> truck can drive through. Otherwise, modify it to explicitly disallow  
> CTLs. And,
>
> 3) Restrict the allowable set of characters in quoted-text to  
> disallow CTLs. VCHAR?
>
>
>
> On 11/08/2009, at 8:50 AM, Henrik Nordstrom wrote:
>
>> Tue 2009-08-11 at 05:31 +1000, Mark Nottingham wrote:
>>> This was discussed in Stockholm, and there was agreement in the room
>>> that the proper way to address this is to disallow CR and LF in *any*
>>> quoted-string.
>>>
>>> Comments?
>>
>> Escaped newlines or \0 characters in the form of quoted-pair are very
>> likely to cause many parsers to fail no matter where they are seen. I
>> have always understood this as a mechanism intended for quoting special
>> characters like ", ( and ), and not including CTLs.
>>
>> Regarding chunked encoding, allowing any newlines there is a very, very
>> bad idea. Folding is not supported there, and no one expects to see
>> newlines in the middle of a chunk header, quoted or not.
>>
>> I would propose changing quoted-pair to restrict the allowable set to
>> non-CTLs, to match most expectations of what values may be seen, rather
>> than only excluding CR and LF.
>>
>>   quoted-pair  = "\" <any CHAR except CTLs>
>>
>> instead of
>>
>>   quoted-pair  = "\" CHAR
>>
>> Regards
>> Henrik
>>
>
>
> --
> Mark Nottingham     http://www.mnot.net/
>
>


--
Mark Nottingham     http://www.mnot.net/
Received on Monday, 24 August 2009 04:46:21 GMT
