- From: Simon Sapin <simon.sapin@kozea.fr>
- Date: Thu, 12 Jan 2012 14:16:21 +0100
- To: www-style@w3.org
On 12/01/2012 11:35, Mathias Bynens wrote:
> There seems to be another way to escape these characters, namely by
> breaking them up in UTF-16 code units: `\d834\df06 `. All browsers
> except Gecko (https://bugzilla.mozilla.org/show_bug.cgi?id=717529)
> seem to support this, even though this isn’t mentioned in the spec.

Hi,

Isn’t this an accident due to using UCS-2 internally (a fixed 16-bit encoding) and pretending it is UTF-16? (Or the reverse...)

The CSS syntax is defined in terms of Unicode/ISO 10646 code points. UTF-16 surrogate pairs like 0xd834-0xdf06 only exist when serializing code points to UTF-16 bytes.

For example, the fact that len(u'\U0001d306') is 2 on some builds of Python is a bug in Python (it should be 1), not a reality of how Unicode works. (I use Python syntax for the example, but the same bug exists on many other platforms.)

Regards,
-- 
Simon Sapin
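P.S. To make the code point versus code unit distinction concrete, here is a minimal sketch, assuming Python 3.3 or later (where strings are sequences of code points regardless of build). The helper name `to_surrogates` is hypothetical and only illustrates the standard UTF-16 arithmetic; it is not part of CSS or of any library.

```python
def to_surrogates(code_point):
    """Split a supplementary-plane code point into its UTF-16 surrogate pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


symbol = "\U0001D306"

# One code point: this is what the CSS syntax is defined in terms of.
code_point_count = len(symbol)

# The surrogate pair only appears once the code point is serialized to UTF-16.
utf16_bytes = symbol.encode("utf-16-be")
high, low = to_surrogates(ord(symbol))

print(code_point_count)            # 1
print(utf16_bytes.hex())           # d834df06
print(hex(high), hex(low))         # 0xd834 0xdf06
```

The values 0xd834 and 0xdf06 from the quoted escape show up only in the encoded bytes, never in the string itself.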
Received on Thursday, 12 January 2012 13:22:42 UTC