- From: Glenn Adams <glenn@skynav.com>
- Date: Sun, 7 Oct 2012 09:56:03 +0800
- To: Simon Sapin <simon.sapin@kozea.fr>
- Cc: WWW Style <www-style@w3.org>
- Message-ID: <CACQ=j+dKV0kz70g_xQmH03iGC4gDCf2FGXtCgtK=uPVCa9SWAA@mail.gmail.com>
On Sat, Oct 6, 2012 at 6:32 PM, Simon Sapin <simon.sapin@kozea.fr> wrote:

> On 06/10/2012 05:58, Glenn Adams wrote:
>
>> The current tokenizer syntax [1] specifies:
>>
>> escape {unicode}|\\[^\r\n\f0-9a-f]
>> badstring1 \"([^\n\r\f\\"]|\\{nl}|{escape})*\\?
>>
>> Given the following input string:
>>
>> < U+0022 (QUOTATION MARK), U+005C (REVERSE SOLIDUS), U+0000 (NULL) >
>>
>> Does the < U+005C, U+0000 > match escape or does it match the final \\? ?
>> That is, should U+0000 be treated as an escapable character or as EOF
>> (EOS)? The above grammar suggests the former.
>>
>> [1] http://www.w3.org/TR/CSS2/grammar.html
>
> The closest spec text I could find is in §4.1.3:
>
>> (It is undefined in CSS 2.1 what happens if a style sheet does
>> contain a character with Unicode codepoint zero.)
>
> Although it is in a paragraph about hexadecimal escapes, I guess it could
> apply to your example too.

OK, but as the escape non-terminal is currently written, it will definitely match an escaped NULL. I would have preferred to see NULL excluded from escaping, i.e., always treated as EOF/EOS for the purpose of defining normative tokenization processing.
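
The claim about the escape non-terminal can be checked by transcribing the two productions into a Python regular expression. This is only a rough sketch under stated assumptions: the definitions of {unicode} and {nl} are taken from the CSS 2.1 grammar in [1] rather than from this thread, Python's `re` is a backtracking engine rather than a lex-style longest-match scanner (the two agree on this input), and the `exclude_nul` flag is a hypothetical variant that illustrates the preferred alternative, not anything in the spec.

```python
import re


def badstring1(exclude_nul: bool = False) -> re.Pattern:
    """Build a regex for the CSS 2.1 badstring1 production.

    exclude_nul=True is a hypothetical variant that removes U+0000
    from the set of characters that may follow a backslash in escape.
    """
    h = r"[0-9a-f]"
    # unicode: \\{h}{1,6}(\r\n|[ \t\r\n\f])?
    unicode_ = r"\\" + h + r"{1,6}(?:\r\n|[ \t\r\n\f])?"
    # escape: {unicode}|\\[^\r\n\f0-9a-f]
    excluded = r"\r\n\f0-9a-f" + (r"\x00" if exclude_nul else "")
    escape = "(?:" + unicode_ + r"|\\[^" + excluded + "])"
    # nl: \n|\r\n|\r|\f
    nl = r"(?:\n|\r\n|\r|\f)"
    # badstring1: \"([^\n\r\f\\"]|\\{nl}|{escape})*\\?
    body = r'(?:[^\n\r\f\\"]|\\' + nl + "|" + escape + ")*"
    # The CSS 2.1 scanner is case-insensitive, hence IGNORECASE.
    return re.compile('"' + body + r"\\?", re.IGNORECASE)


# < U+0022 (QUOTATION MARK), U+005C (REVERSE SOLIDUS), U+0000 (NULL) >
sample = '"\\\x00'

as_written = badstring1().match(sample)
nul_excluded = badstring1(exclude_nul=True).match(sample)

# As written, escape consumes the backslash and the NULL together.
print(as_written.end())    # 3
# With NULL excluded, only the trailing \\? consumes the backslash.
print(nul_excluded.end())  # 2
```

With the grammar as written, the match covers all three characters, so < U+005C, U+0000 > is consumed by escape. In the hypothetical variant the match stops after the backslash, which is then taken by the final \\? and leaves the NULL outside the token.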
Received on Sunday, 7 October 2012 01:56:58 UTC