Re: [svgwg] Character counting in text 'x', 'y', 'dx', 'dy', and 'rotate' attributes. (#537)

The SVG Working Group just discussed `Character counting`, and agreed to the following:

* `RESOLUTION: Count using real Unicode code points (not UTF-16 blocks). Ignore/combine attribute values that are assigned to code points that are clustered with a previous character`
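
To illustrate the first half of the resolution, here is a minimal Python sketch contrasting the two counting schemes. The function names are hypothetical and are not taken from the SVG specification or any browser engine; the sketch only assumes that "UTF16 blocks" refers to UTF-16 code units.

```python
def count_code_points(text: str) -> int:
    # A Python str is a sequence of Unicode code points,
    # so len() already counts code points.
    return len(text)


def count_utf16_code_units(text: str) -> int:
    # Each UTF-16 code unit is 2 bytes; characters outside the
    # Basic Multilingual Plane take two code units (a surrogate pair).
    return len(text.encode("utf-16-le")) // 2


# U+1F600 lies outside the BMP: one code point, two UTF-16 code units.
sample = "a\U0001F600b"
code_points = count_code_points(sample)
code_units = count_utf16_code_units(sample)
```

Under code-point counting, the third attribute value would apply to `b`; under UTF-16 counting it would land on the second half of the surrogate pair.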

<details><summary>The full IRC log of that discussion</summary>
&lt;krit> topic: Character counting<br>
&lt;krit> github: https://github.com/w3c/svgwg/issues/537<br>
&lt;myles> hi<br>
&lt;krit> myles: I think the issue has 3 proposals<br>
&lt;krit> myles: either use code units, grapheme clusters or a complicated thing that takes properties into account<br>
&lt;krit> AmeliaBR: 1st option is UTF16 blocks or code points<br>
&lt;krit> AmeliaBR: no one seems to implement blocks<br>
&lt;krit> AmeliaBR: emojis would count as 2 for instance<br>
&lt;krit> myles: emojis are the most compelling case for grapheme clusters<br>
&lt;krit> AmeliaBR: none of the proposals were about breaking up complex glyphs for layout<br>
&lt;krit> chris: blocks are not ideal but I thought browsers use them<br>
&lt;krit> AmeliaBR: some do<br>
&lt;krit> myles: char position is not a block or grapheme cluster or anything. So it is not the best way to describe the issue<br>
&lt;myles> dx="3 4 5 6"<br>
&lt;myles> "hi❤️k"<br>
&lt;krit> myles: if I got a string (typing above)<br>
&lt;krit> myles: heart is 2 code points<br>
&lt;krit> myles: 8 is going with 3 is what you're saying?<br>
&lt;krit> chris: 5 is heart<br>
&lt;krit> chris: 7 would affect the k<br>
&lt;krit> AmeliaBR: if it is 2 code points it would accumulate into the next set of characters<br>
&lt;krit> myles: how do you know it is 2 items in the list? Because it has 2 code points?<br>
&lt;krit> myles: how do you know what a unit is (like the heart)<br>
&lt;krit> chris: I know what a grapheme cluster is when I see it but technically it might mean different things.<br>
&lt;krit> chris: Tav for instance was scared that it might mean different things depending on the properties of the char or font<br>
&lt;krit> AmeliaBR: Especially ligatures that are font specific make it more difficult<br>
&lt;krit> myles: a ligature in the font would still be 2 grapheme clusters with multiple code points but that affects the rendering only<br>
&lt;krit> chris: If you have a ffi it would be one cluster<br>
&lt;krit> myles: it would force it to break the ligature<br>
&lt;krit> chris: I see<br>
&lt;krit> AmeliaBR: we have different rules in SVG for ligatures<br>
&lt;krit> myles: for things like Arabic there is no straightforward, natural way to do it. The best we can do is try to do what CSS does for hyphenation. You break the text at the best place, but the break location is shaped as if it was not broken. Next to the hyphenation you get the medial form. We should use the same mechanism<br>
&lt;krit> chris: Does WebKit have access to it from the CoreText engine?<br>
&lt;krit> myles: we have access to it and think we should do it<br>
&lt;krit> AmeliaBR: I'd like to see the proposal written but it is less important than counting<br>
&lt;krit> AmeliaBR: if breaking a ligature differs between implementations, then just that glyph is off<br>
&lt;krit> AmeliaBR: but if counting differs the entire text might look different<br>
&lt;krit> myles: with the proposal to ignore / lump in items in the list when it conforms to the first code point in the grapheme cluster... In that proposal the exact boundaries of the cluster might not count much<br>
&lt;krit> myles: so next char might get into the same place<br>
&lt;krit> AmeliaBR: that is what Unicode code points make different<br>
&lt;krit> myles: so they would be local to where the engine chops off the text<br>
&lt;krit> myles: so heart would be at 5 and k would be at the same place still<br>
&lt;krit> AmeliaBR: if a browser does not recognize a given emoji sequence as a char, it would still do the counting consistently after the char<br>
&lt;krit> proposed RESOLUTION: Count using real Unicode code points (not UTF-16 blocks)<br>
&lt;krit> proposed RESOLUTION: Count using real Unicode code points (not UTF-16 blocks). Ignore code points that are not part of a cluster group<br>
&lt;myles> proposed RESOLUTION: Count using real Unicode code points (not UTF-16 blocks). Ignore code points that are not the first item in a cluster group<br>
&lt;myles> proposed RESOLUTION: Count using real Unicode code points (not UTF-16 blocks). Ignore attribute values that are assigned to code points that are clustered with a previous code point<br>
&lt;AmeliaBR> proposed RESOLUTION: Count using real Unicode code points (not UTF-16 blocks). Ignore/combine attribute values that are assigned to code points that are clustered with a previous character<br>
&lt;krit> RESOLUTION: Count using real Unicode code points (not UTF-16 blocks). Ignore/combine attribute values that are assigned to code points that are clustered with a previous character<br>
&lt;krit> AmeliaBR: we should use a separate issue to talk about the rendering proposed by myles<br>
&lt;krit> myles: I'll start the new issue<br>
&lt;krit> chair: krit<br>
&lt;krit> trackbot, end telcon<br>
</details>
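
The `dx="3 4 5 6"` / `"hi❤️k"` example from the log can be sketched in Python as below. This is only an illustration under stated assumptions, not the specified algorithm: the helper names are hypothetical, "clustered with a previous character" is approximated as "is a combining mark (Unicode general category M*)" rather than full grapheme cluster segmentation, and the "combine" option is read here as summing the values, which the resolution does not spell out.

```python
import unicodedata


def is_clustered_with_previous(ch: str) -> bool:
    # Simplifying assumption: only combining marks (categories Mn, Mc, Me)
    # cluster with the preceding code point. U+FE0F (variation selector-16)
    # is Mn, so it is covered. Real segmentation has more rules than this.
    return unicodedata.category(ch).startswith("M")


def assign_values(text, values, combine=False):
    """Index attribute values by code point; fold clustered code points
    into the previous cluster, ignoring (or summing) their values."""
    clusters = []  # each entry is [cluster_text, value_or_None]
    for index, ch in enumerate(text):
        value = values[index] if index < len(values) else None
        if clusters and is_clustered_with_previous(ch):
            clusters[-1][0] += ch
            if combine and value is not None:
                clusters[-1][1] = (clusters[-1][1] or 0) + value
            continue  # the value on a clustered code point never starts a new slot
        clusters.append([ch, value])
    return [(cluster_text, value) for cluster_text, value in clusters]


text = "hi\u2764\ufe0fk"  # h, i, U+2764 HEAVY BLACK HEART, U+FE0F, k
ignored = assign_values(text, [3, 4, 5, 6])
combined = assign_values(text, [3, 4, 5, 6], combine=True)
```

In this sketch the value `6` is indexed to U+FE0F, so under the "ignore" reading it is dropped and `k` receives no value of its own; the heart keeps `5` regardless of whether an engine renders the sequence as a single emoji.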


-- 
GitHub Notification of comment by css-meeting-bot
Please view or discuss this issue at https://github.com/w3c/svgwg/issues/537#issuecomment-452092266 using your GitHub account

Received on Monday, 7 January 2019 21:42:49 UTC