- From: Philip Taylor <pjt47@cam.ac.uk>
- Date: Wed, 19 Jan 2011 11:35:56 +0000
- To: Julian Reschke <julian.reschke@gmx.de>
- CC: "public-html@w3.org" <public-html@w3.org>
Julian Reschke wrote:
> 6.
> Process the next character as follows:
>
> If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022
> QUOTATION MARK ('"') (NOT immediately following an U+005C REVERSE
> SOLIDUS ("\") character) in s
> If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027
> APOSTROPHE ("'") in s
> Return the encoding corresponding to the backslash-unescaped
> string between this characters and the next earliest occurrence of this
> character.
Say I have {charset="foo\\"bar"}. The {"} before {b} is preceded by a
{\}, so it won't match this case. Surely it should, since otherwise
quoted-string has no way to safely quote a string ending in {\}: the
closing {"} will never match.
In this example the final {"} will match, but then the "next earliest
occurrence of this character" is the {"} before {b}, so this will return
{foo\\}. Shouldn't it collect all characters up to the first non-escaped
{"} instead? Otherwise quoted-string can't safely quote any string
containing {"}.
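
A minimal Python sketch of "collect all characters up to the non-escaped
quote" follows. It is an illustration only, not text from the draft; the
name extract_quoted and the choice to return None on failure are
assumptions.

```python
def extract_quoted(s, start=0):
    """Return the unescaped content of the quoted-string opening at s[start].

    Hypothetical helper: scans left to right, treating a backslash plus the
    following character as one escaped character, and stops at the first
    double quote that is not part of such a pair. Returns None if no
    closing quote is found.
    """
    if start >= len(s) or s[start] != '"':
        return None
    out = []
    i = start + 1
    while i < len(s):
        c = s[i]
        if c == "\\":
            if i + 1 >= len(s):
                return None  # backslash with nothing after it
            out.append(s[i + 1])  # keep only the escaped character
            i += 2
        elif c == '"':
            return "".join(out)  # first non-escaped quote closes the string
        else:
            out.append(c)
            i += 1
    return None  # ran off the end without a closing quote
```

With this reading, the value {"foo\\"bar"} closes at the {"} before {b}
and yields {foo\}, and {"a\"b"} yields {a"b}.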
> "backslash-unescaping" a string replaces each sequence of U+005C REVERSE
> SOLIDUS ("\") and the following character by just that character. If the
> last character of the string is a U+005C REVERSE SOLIDUS ("\"), the
> algorithm returns nothing.
The last character of the string before unescaping, or after? Either
way, why shouldn't I be able to quote a string like {foo\}?
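
To make the question concrete, here is a small Python sketch of one
possible reading, in which only a backslash with no following character
causes the algorithm to return nothing. This is an assumption for
illustration, not what the quoted text says; the function name
backslash_unescape is hypothetical.

```python
def backslash_unescape(s):
    """Replace each backslash-plus-character pair by just that character.

    Assumed reading: pairs are consumed left to right, and None is returned
    only when a backslash is left over with nothing to escape.
    """
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\":
            if i + 1 == len(s):
                return None  # unpaired trailing backslash
            out.append(s[i + 1])
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)


escaped = "foo\\\\"  # the five characters f o o \ \
result = backslash_unescape(escaped)  # the four characters f o o \

# Both literal readings of "the last character is a backslash" reject it:
rejected_before = escaped.endswith("\\")
rejected_after = result.endswith("\\")
```

Under this sketch {foo\\} unescapes to {foo\}, whereas checking the last
character either before or after unescaping would return nothing for it.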
(By the way, is the RFC 2616 grammar ambiguous? It says
    quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
    qdtext        = <any TEXT except <">>
    quoted-pair   = "\" CHAR
so a string like {"\\"} could be parsed as
    <"> qdtext qdtext <">
or as
    <"> quoted-pair <">
and it's not clear whether the {\\} is meant to be interpreted as a
quoted pair or as two separate characters. I'm assuming that it should
be a pair but don't see that defined anywhere.)
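
The ambiguity can be checked mechanically with a short Python sketch
that enumerates every derivation of the text between the quotes. It
simplifies the grammar (qdtext is taken as any single character except
{"}, and CHAR as any single character), and the function name parses is
hypothetical.

```python
def parses(inner):
    """List every way to split `inner` into qdtext and quoted-pair tokens.

    `inner` is the text between the opening and closing double quotes.
    Simplified grammar (assumption): qdtext is one character other than
    the double quote; quoted-pair is a backslash plus any one character.
    """
    if inner == "":
        return [[]]
    results = []
    # Alternative 1: consume one character as qdtext.
    if inner[0] != '"':
        for rest in parses(inner[1:]):
            results.append([("qdtext", inner[0])] + rest)
    # Alternative 2: consume two characters as a quoted-pair.
    if inner[0] == "\\" and len(inner) >= 2:
        for rest in parses(inner[2:]):
            results.append([("quoted-pair", inner[:2])] + rest)
    return results


# Text between the quotes of {"\\"}: two backslash characters.
for derivation in parses("\\\\"):
    print([kind for kind, _ in derivation])
```

For the two-backslash input this lists two derivations, one with two
qdtext tokens and one with a single quoted-pair, which is the ambiguity
described above.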
--
Philip Taylor
pjt47@cam.ac.uk
Received on Wednesday, 19 January 2011 11:36:29 UTC