- From: Tavmjong Bah <tav.w3c@gmail.com>
- Date: Fri, 21 Sep 2018 15:19:33 +0200
- To: behdad@behdad.org, www-svg <www-svg@w3.org>
- Message-ID: <CAHfwxgpnPcM9Zhusv9fYcDrhMjdb2JBJ=bxz-pX3q9d4i9M1Eg@mail.gmail.com>
Hi Behdad,

I'm working on an issue for the SVG working group.(1) Your name came up in our discussion of the issue, and I'm hoping that you can give us some guidance.

As I'm sure you know, SVG allows characters to be shifted/rotated using the 'x', 'y', 'dx', 'dy', and 'rotate' attributes.(2) SVG 1.1 maps the numbers in these attributes to characters by Unicode code point (although it's not stated as clearly as it could be). So:

  <text x="10 20 30">ABC</text>

would position the 'A' at x=10, 'B' at x=20, and 'C' at x=30.

SVG 2 originally changed this mapping to use UTF-16 code units, so that a character outside the Basic Multilingual Plane would consume two numbers. I think this was done to match how the DOM SVGTextContentElement methods count characters.(3) Browser tests show that only Firefox uses UTF-16 code units. The consensus of the working group is not to use UTF-16 code units and to revert to the SVG 1.1 behavior, which is seen as more user friendly.

Then came the issue of how to treat precomposed and decomposed characters. Amelia's tests(4) show that browsers treat 'u' + combining character as follows:

* Firefox: consumes two numbers, translates/rotates as a unit.
* Chrome and Safari: consume two numbers, translate/rotate separately.
* Edge: consumes one number (even in cases where there is no precomposed equivalent).

Everyone agrees that Chrome and Safari's behavior is wrong. There is some disagreement about whether Firefox's or Edge's behavior is better: the first uses Unicode code points, the second uses Extended Grapheme Clusters (UAX#29).

My concern with Edge's behavior is that it requires an SVG renderer/editor to know what the Extended Grapheme Clusters are, which I imagine could be quite complex for some scripts. Pango appears to have this built in. Is this something that is exposed to a user? Is this something that is well defined for all (most) scripts? The advantage of using Unicode code points (besides being backwards compatible) is that it is well defined and predictable.
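To make the three counting schemes concrete, here is a minimal Python sketch of how many attribute values a string would consume under each one. The function names are hypothetical, and the third function is only a rough stand-in for Extended Grapheme Clusters: it attaches characters in the Unicode "Mark" categories to the preceding character and does not implement the full UAX#29 rules (no Hangul, emoji, or regional-indicator handling).

```python
import unicodedata


def count_code_points(text: str) -> int:
    """Values consumed when each Unicode code point takes one value (SVG 1.1 style)."""
    return len(text)


def count_utf16_code_units(text: str) -> int:
    """Values consumed when each UTF-16 code unit takes one value."""
    # Each UTF-16 code unit is two bytes; "utf-16-le" avoids emitting a BOM.
    return len(text.encode("utf-16-le")) // 2


def count_base_plus_marks(text: str) -> int:
    """Rough approximation of grapheme clusters (NOT full UAX#29).

    Assumption: a cluster is one character followed by any run of
    characters whose general category is Mn, Mc, or Me.
    """
    count = 0
    for index, char in enumerate(text):
        is_mark = unicodedata.category(char).startswith("M")
        # A mark at the very start has nothing to attach to, so it counts alone.
        if index == 0 or not is_mark:
            count += 1
    return count


if __name__ == "__main__":
    samples = {
        "ABC": "ABC",
        "u + COMBINING DIAERESIS": "u\u0308",
        "MUSICAL SYMBOL G CLEF (outside the BMP)": "\U0001D11E",
    }
    for label, text in samples.items():
        print(
            label,
            count_code_points(text),
            count_utf16_code_units(text),
            count_base_plus_marks(text),
        )
```

Under these assumptions, 'u' + combining diaeresis consumes two values by code point or UTF-16 code unit but one by cluster, while a single character outside the Basic Multilingual Plane consumes two values only under UTF-16 counting.
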
We would appreciate any thoughts you have about this.

Thanks,

Tav

1. https://github.com/w3c/svgwg/issues/537
2. https://www.w3.org/TR/SVG2/text.html#TSpanNotes
3. https://www.w3.org/TR/SVG2/text.html#InterfaceSVGTextContentElement
4. https://github.com/w3c/svgwg/issues/537#issuecomment-417823060
Received on Friday, 21 September 2018 13:20:07 UTC