Re: [css2.1] tokenizer syntax - handling escaped null in badstring

From: Simon Sapin <simon.sapin@kozea.fr>
Date: Sun, 07 Oct 2012 08:06:51 +0200
Message-ID: <50711BFB.9050405@kozea.fr>
To: www-style@w3.org
On 07/10/2012 06:30, Glenn Adams wrote:
> I'm referring to what the spec would have one do, as opposed to what UAs
> actually do. Do you agree the tokenizer rule as specified would consume
> an escaped NULL (whether or not a UA actually allows a NULL to get that
> far)?

Yes, this is my understanding of the regexps that define the tokenizer.
U+0000 matches the [^\n\r\f0-9a-f] part of the 'escape' macro and thus
can be escaped with a backslash. Unescaped, it is a normal character
inside a quoted string, or a DELIM token outside one.
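
A minimal sketch of that reading in Python, assuming the 'unicode' and
'escape' macros are transcribed faithfully from the CSS 2.1 grammar. The
names UNICODE, ESCAPE, ESCAPE_RE and is_escape are made up for
illustration and come from neither the spec nor any UA:

```python
import re

# CSS 2.1 tokenizer macros, transcribed by hand from the grammar.
# All names here are illustrative only.
UNICODE = r"\\[0-9a-f]{1,6}(?:\r\n|[ \n\r\t\f])?"
ESCAPE = r"(?:" + UNICODE + r"|\\[^\n\r\f0-9a-f])"

# The CSS 2.1 grammar is case-insensitive, hence re.IGNORECASE.
ESCAPE_RE = re.compile(ESCAPE, re.IGNORECASE)


def is_escape(text):
    """Return True if `text` as a whole matches the 'escape' macro."""
    return ESCAPE_RE.fullmatch(text) is not None


print(is_escape("\\\x00"))  # backslash followed by U+0000
print(is_escape("\\\x01"))  # backslash followed by U+0001
print(is_escape("\\\n"))    # backslash followed by a newline
```

This prints True, True, False: the escaped U+0000 is accepted exactly
like the escaped U+0001, and only the characters listed in the negated
class (such as a newline) are rejected.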

If we ignore the "undefined" part, U+0000 in CSS behaves just like 
U+0001 and many other code points. And I think it should. Zero as a 
string terminator is not universal; it is only an implementation detail 
of some systems. Sure, we can accommodate such systems by allowing them 
to use U+FFFD or something, but I see no reason to make U+0000 be a 
terminator on systems that are perfectly fine with a null byte or 
code point in the middle of a string.

In any case, any change (from undefined) in this area will probably go 
in css3-syntax rather than CSS 2.1.

Simon Sapin
Received on Sunday, 7 October 2012 06:08:29 UTC
