Re: [css3-syntax] CSS escape sequences

On 12/01/2012 11:35, Mathias Bynens wrote:
> There seems to be another way to escape these characters, namely by
> breaking them up in UTF-16 code units: `\d834\df06 `. All browsers
> except Gecko (https://bugzilla.mozilla.org/show_bug.cgi?id=717529)
> seem to support this, even though this isn’t mentioned in the spec.

Hi,

Isn’t this an accident of using UCS-2 internally (a fixed-width 16-bit
encoding) and pretending it is UTF-16? (Or the reverse...)

The CSS syntax is defined in terms of Unicode/ISO 10646 code points.
UTF-16 surrogate pairs such as 0xd834 0xdf06 only exist when code points
are serialized to UTF-16.
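
To illustrate where the pair in the quoted escape comes from, here is a
minimal sketch of the standard UTF-16 arithmetic for a code point above
U+FFFF. The function name to_surrogate_pair is made up for this example;
it is not part of any CSS or Python API.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point into UTF-16 (high, low) surrogates.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use a surrogate pair")
    offset = code_point - 0x10000           # 20-bit value
    high = 0xD800 + (offset >> 10)          # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x1D306)
print(hex(high), hex(low))  # 0xd834 0xdf06
```

The two values are an artifact of the encoding step; the character itself
is still the single code point U+1D306.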

For example, the fact that len(u'\U0001d306') is 2 on some builds of
Python is a bug in Python (it should be 1), not a reality of how Unicode
works. (I use Python syntax for the example, but the same bug exists on
many other platforms.)
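
As a rough sketch of the distinction, the snippet below counts code points
and UTF-16 code units separately. It assumes Python 3.3 or later, where
len() on a str always counts code points; count_units is a hypothetical
name used only here.

```python
def count_units(text):
    """Return (code points, UTF-16 code units) for a string.

    Assumes Python 3.3+; hypothetical helper for illustration only.
    """
    code_points = len(text)
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" emits no byte order mark.
    utf16_units = len(text.encode("utf-16-le")) // 2
    return code_points, utf16_units


print(count_units("\U0001d306"))  # (1, 2)
```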

Regards,
-- 
Simon Sapin

Received on Thursday, 12 January 2012 13:22:42 UTC