- From: Tab Atkins Jr. <jackalmage@gmail.com>
- Date: Thu, 12 Apr 2012 09:28:19 -0700
- To: Simon Sapin <simon.sapin@kozea.fr>
- Cc: www-style list <www-style@w3.org>
On Thu, Apr 12, 2012 at 9:08 AM, Simon Sapin <simon.sapin@kozea.fr> wrote:
> On 12/04/2012 17:22, Tab Atkins Jr. wrote:
>>> I also suggest making the supported range implementation-dependent.
>>> The current highest Unicode codepoint is 0x10ffff, but some "broken"
>>> platforms only support up to 0xffff (i.e. only inside the BMP).
>>
>> CSS doesn't currently allow platforms to not support all of Unicode.
>> Do you have specific examples of platforms in use that are broken in
>> this way that we should support?
>
> CPython before 3.3 has a compile-time switch to make the internal storage
> for codepoints UCS-4 instead of UCS-2. The sys.maxunicode constant
> reflects that. (It is either 1114111 or 65535.) Calling chr(x) with
> x > sys.maxunicode raises an exception.
>
> Decoding a non-BMP character from bytes on a UCS-2 build creates two
> codepoints for the surrogate pair. This is wrong (e.g. slicing can split
> the pairs) but kind of works out when encoding back to bytes.
>
> Although I'm not as familiar with the details, I think that Java and
> JavaScript have similar issues. (Due to pretending that all of Unicode is
> still 16 bits and UTF-16 is the same as UCS-2.)

JavaScript (and, I assume, Python and Java) just need extra work to handle
this correctly. It's an inconvenience, not a fundamental limitation.

> Depending on what is done with the parsed stylesheet, decoding a single
> hex escape to a surrogate pair of codepoints might "work" (as in, use the
> right glyph if displayed on a screen eventually). Is this behavior
> acceptable? (Maybe it does not matter for CSS?)

Whether or not it "works" depends on the exact details of what you're
doing, but it will at least have predictable behavior - both halves will
fall into the "non-ASCII character" bucket and get processed fairly
normally. In JS, emitting a string with a surrogate pair will work
correctly.

~TJ
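
For reference, the surrogate-pair arithmetic discussed above can be sketched in Python. This is only an illustration, not code from the thread: the function names and the `narrow` flag are hypothetical, and the sketch covers only the UTF-16 split, not the rest of CSS escape handling (zero, lone surrogates, out-of-range values).

```python
def to_surrogate_pair(cp):
    """Split a code point outside the BMP into a UTF-16 (high, low) pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in the range U+10000..U+10FFFF")
    offset = cp - 0x10000                # 20 bits remain after the subtraction
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def decode_hex_escape(hex_digits, narrow=False):
    """Return the code units for the hex digits of an escape.

    `narrow` (hypothetical flag) models a platform whose strings hold
    16-bit units, such as a UCS-2 CPython build or a JavaScript string.
    """
    cp = int(hex_digits, 16)
    if narrow and cp > 0xFFFF:
        return list(to_surrogate_pair(cp))
    return [cp]


if __name__ == "__main__":
    print([hex(u) for u in decode_hex_escape("1F600", narrow=True)])
    print([hex(u) for u in decode_hex_escape("1F600", narrow=False)])
```

On a narrow platform the escape yields two units, both above 0x7F, which is why each half still lands in the "non-ASCII character" bucket.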
Received on Thursday, 12 April 2012 16:29:12 UTC