Re: Change Proposal for ISSUE-126

On 19.01.2011 12:35, Philip Taylor wrote:
> Julian Reschke wrote:
>> 6.
>> Process the next character as follows:
>>
>> If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022
>> QUOTATION MARK ('"') (NOT immediately following an U+005C REVERSE
>> SOLIDUS ("\") character) in s
>> If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027
>> APOSTROPHE ("'") in s
>> Return the encoding corresponding to the backslash-unescaped string
>> between this character and the next earliest occurrence of this
>> character.
>
> Say I have {charset="foo\\"bar"}. The {"} before {b} is preceded by a
> {\}, so it won't match this case - surely it should do, else there's no
> way for quoted-string to safely quote a string ending in {\} because the
> closing {"} will never match?

Good catch.
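
To illustrate the point, here is a minimal Python sketch of one way to find the closing quote: scan left to right and skip whatever character follows a backslash, instead of checking whether the quote is immediately preceded by one. The function name and signature are made up for this example and are not part of the proposal text.

```python
def find_closing_quote(s, start):
    """Return the index of the first unescaped '"' at or after `start`.

    Hypothetical helper: a backslash consumes the character after it,
    so in "foo\\\\" + '"' the quote is NOT treated as escaped.
    Returns None if no unescaped quote is found.
    """
    i = start
    while i < len(s):
        c = s[i]
        if c == "\\":
            # Skip the backslash and the character it escapes.
            i += 2
            continue
        if c == '"':
            return i
        i += 1
    return None


# The example from the quoted mail: charset="foo\\"bar"
value = '"foo\\\\"bar"'           # the characters: "foo\\"bar"
end = find_closing_quote(value, 1)
raw = value[1:end]                # the still-escaped content: foo\\
```

With this scan, the {"} before {b} closes the string, and an escaped {\"} inside the value does not.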

> In this case the final {"} will match but then the "next earliest
> occurrence of this character" is the {"} before {b}, so this will return
> {foo\\} - shouldn't it collect all characters up to the non-escaped {"}
> instead? Otherwise quoted-string can't safely quote any string
> containing {"}.
>
>> "backslash-unescaping" a string replaces each sequence of U+005C
>> REVERSE SOLIDUS ("\") and the following character by just that
>> character. If the last character of the string is a U+005C REVERSE
>> SOLIDUS ("\"), the algorithm returns nothing.
>
> The last character of the string before unescaping, or after? Either
> way, why shouldn't I be able to quote a string like {foo\}?

Before. (And yes, there's another error here).

You can quote the string

   foo\

using

   "foo\\"

A single trailing backslash is invalid, and as far as I recall, the
algorithm already treated certain malformed sequences this way, so I
thought it would be OK to do so here as well.
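
As a rough sketch of the unescaping step as described above (not normative text; the function name is invented for this example), in Python:

```python
def backslash_unescape(raw):
    """Replace each backslash + following character by that character.

    Hypothetical helper. Returns None ("nothing") when a backslash has
    no following character, i.e. the input ends in a lone backslash.
    """
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\":
            if i + 1 == len(raw):
                # Dangling backslash: treat the whole value as malformed.
                return None
            out.append(raw[i + 1])
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)


ok = backslash_unescape("foo\\\\")   # input foo\\  gives foo\
bad = backslash_unescape("foo\\")    # input foo\   gives None
```

Because pairs are consumed left to right, the check for a dangling backslash applies to the string before unescaping, and {foo\\} is the valid way to express a value ending in {\}.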

> (By the way, is the RFC2616 grammar ambiguous? It says
>
> quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
> qdtext = <any TEXT except <">>
> quoted-pair = "\" CHAR
>
> so a string like {"\\"} could be parsed as
> <"> qdtext qdtext <">
> or as
> <"> quoted-pair <">
> and it's not clear whether the {\\} is meant to be interpreted as a
> quoted pair or as two separate characters. I'm assuming that it should
> be a pair but don't see that defined anywhere.)

Yes, that's a known issue in 2616 that we fixed a long time ago, see
<http://trac.tools.ietf.org/wg/httpbis/trac/ticket/31>.
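
For illustration only, and assuming the fix amounts to excluding the backslash from qdtext (the pattern below is a simplified stand-in, not the actual ABNF), a Python regular expression shows why the parse then becomes unambiguous:

```python
import re

# Simplified assumption: qdtext is any character except '"' and '\',
# and quoted-pair is a backslash followed by any single character.
QUOTED_STRING = re.compile(r'"((?:[^"\\]|\\.)*)"', re.DOTALL)


def match_quoted_string(s):
    """Return the still-escaped content of a quoted-string, or None."""
    m = QUOTED_STRING.fullmatch(s)
    return m.group(1) if m else None


pair = match_quoted_string('"\\\\"')   # input "\\" : one quoted-pair
broken = match_quoted_string('"\\"')   # input "\"  : closing quote is escaped
```

Since a backslash can no longer be qdtext, {"\\"} can only be read as a single quoted-pair, and {"\"} is not a complete quoted-string.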

Best regards, Julian

Received on Wednesday, 19 January 2011 12:10:06 UTC