Re: [CSS21] Question on character escapes from smontagu@smontagu.org on 2005-11-15 (www-style@w3.org from November 2005)

From: <smontagu@smontagu.org>
Date: Tue, 15 Nov 2005 11:12:53 -0800 (PST)
To: "Ian Hickson" <ian@hixie.ch>
Cc: "Boris Zbarsky" <bzbarsky@mit.edu>, "www-style Mailing List" <www-style@w3.org>
Message-ID: <12693.195.212.29.92.1132081973.squirrel@webmail.smontagu.org>

>> I'm wondering what happens when \nnnnnnn escapes (backslash followed by
>> numbers) are used and the resulting character is invalid (eg it's a high
>> or low surrogate, or is above 0x00110000).  Should the escape be treated
>> as U+FFFD?  Or should this be considered an error and error recovery
>> (skipping a declaration or whatever needs to happen at that point in
>> parsing) happen?  Or something else?
>
> The spec doesn't say. It also doesn't say what should happen with \0
> (indeed it calls that one out explicitly). I suggest treating them all as
> U+FFFD, and only dropping the rule if U+FFFD would cause the rule to be
> dropped at that point. (The idea is that a literal reading of 2.1 suggests
> that no codepoints can be invalid except 0, and so they should be treated
> the same way valid-but-unknown characters would be.)

Values above 0x10FFFF (that is, 0x110000 and up) are not "code points" at
all, at least by the Unicode definition D4b in
http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf
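
To make the cases concrete, here is a minimal sketch of the "treat them all
as U+FFFD" suggestion quoted above. It is only an illustration, not something
the spec mandates: the function name decode_css_escape and the choice to
raise on malformed input are my own assumptions, and a real tokenizer would
also have to handle the optional whitespace after the escape.

```python
import string

REPLACEMENT = "\uFFFD"      # U+FFFD REPLACEMENT CHARACTER
MAX_CODE_POINT = 0x10FFFF   # last value in the Unicode codespace


def decode_css_escape(hex_digits: str) -> str:
    """Hypothetical helper: map the hex digits of a CSS escape to a character.

    Follows the suggestion in this thread (an assumption, not spec text):
    zero, surrogates, and out-of-range values all become U+FFFD.
    """
    # A CSS 2.1 escape carries 1 to 6 hex digits, so the value can reach 0xFFFFFF.
    if not 1 <= len(hex_digits) <= 6:
        raise ValueError("expected 1 to 6 hex digits")
    if not all(c in string.hexdigits for c in hex_digits):
        raise ValueError("non-hex digit in escape")

    value = int(hex_digits, 16)

    is_zero = value == 0
    is_surrogate = 0xD800 <= value <= 0xDFFF   # high or low surrogate
    is_out_of_range = value > MAX_CODE_POINT   # not a code point at all

    if is_zero or is_surrogate or is_out_of_range:
        return REPLACEMENT
    return chr(value)
```

With this sketch, "41" decodes to "A", while "0", "D800" and "110000" all
decode to U+FFFD. Whether the rule is then dropped depends only on how
U+FFFD itself is handled at that point in parsing.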

Received on Tuesday, 15 November 2005 19:13:03 UTC