- From: Simon Sapin <simon.sapin@exyr.org>
- Date: Mon, 12 Aug 2013 19:15:04 +0100
- To: "Tab Atkins Jr." <jackalmage@gmail.com>
- CC: www-style list <www-style@w3.org>
On 12/08/2013 18:36, Tab Atkins Jr. wrote:
> On Mon, Aug 12, 2013 at 9:59 AM, Simon Sapin <simon.sapin@exyr.org> wrote:
>> On 12/08/2013 17:25, Zack Weinberg wrote:
>>> On Mon, Aug 12, 2013 at 7:35 AM, Simon Sapin <simon.sapin@exyr.org> wrote:
>>>>
>>>> data:text/html,<style>body:before{}</style><script>document.styleSheets[0].cssRules[0].style.content="'-\ud834\udd1e-'"</script>
>>>
>>> That JavaScript strings expose surrogate pairs to the programmer is a
>>> (unfixable due to backward compatibility) specification bug in
>>> JavaScript, which should not infect CSS; the behavior on our side
>>> should IMHO be as-if the surrogate pair is converted to the
>>> corresponding code point before tokenization, such that the modified
>>> style sheet is indistinguishable from the one produced by
>>>
>>> data:text/html,<style>body:before{content:'-\01d11e -'}</style>
>>
>> Yes. That’s fine: surrogate pairs are how you’re supposed to do non-BMP
>> codepoints in Javascript. The trouble is with unpaired surrogates:
>>
>> data:text/html,<style>body:before{}</style><script>document.styleSheets[0].cssRules[0].style.content="'-\ud834-\udd1e-'"</script>
>
> If implementations are willing to change, I'm fine with specifying
> that unpaired surrogates get transformed into U+FFFD at CSS parse
> time.

Actually, none of the character encodings[1] allow unpaired surrogates,
so the only way to get them into the CSS parser is through CSSOM.

[1] http://encoding.spec.whatwg.org/

Equivalently, we can specify that JS strings from CSSOM are decoded as
UTF-16, so that an unpaired surrogate becomes U+FFFD just as it would in
a UTF-16 encoded style sheet.

-- 
Simon Sapin
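
To make the proposed transformation concrete, here is a minimal sketch of
replacing unpaired surrogates with U+FFFD in a JavaScript string before it
would reach the CSS parser. The name replaceLoneSurrogates is hypothetical;
it is not part of CSSOM or of any spec, and an implementation would
presumably do the equivalent internally rather than in script.

```javascript
// Hypothetical helper (an illustration only, not a CSSOM or spec API):
// returns a copy of `input` in which every unpaired surrogate code unit
// is replaced by U+FFFD, while well-formed surrogate pairs are kept.
function replaceLoneSurrogates(input) {
  let output = "";
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // Lead (high) surrogate: valid only if a trail surrogate follows.
      const next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        output += input[i] + input[i + 1];
        i++; // skip the trail surrogate we just consumed
      } else {
        output += "\ufffd";
      }
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Trail (low) surrogate with no preceding lead surrogate.
      output += "\ufffd";
    } else {
      output += input[i];
    }
  }
  return output;
}
```

Applied to the two strings from this thread, the paired case
"'-\ud834\udd1e-'" comes back unchanged, and the unpaired case
"'-\ud834-\udd1e-'" becomes "'-\ufffd-\ufffd-'".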
Received on Monday, 12 August 2013 18:15:28 UTC