Re: [css-houdini-drafts] [font-metrics-api] Revised proposal of font metrics for each character

The Houdini Task Force just discussed `FontMetrics`.

<details><summary>The full IRC log of that discussion</summary>
&lt;TabAtkins> Topic: FontMetrics<br>
&lt;TabAtkins> koji: This is about issue 828<br>
&lt;skk> https://github.com/w3c/css-houdini-drafts/issues/828<br>
&lt;emilio> GitHub: https://github.com/w3c/css-houdini-drafts/issues/828<br>
&lt;Rossen> github: https://github.com/w3c/css-houdini-drafts/issues/828<br>
&lt;TabAtkins> koji: Request from authors that want character advance information for each character of a string<br>
&lt;TabAtkins> koji: In canvas API we once tried it, but it had lots of feedback, so this is a revised proposal.<br>
&lt;TabAtkins> koji: use-case is an author with a string, they want to know caret position for drawing between each pair of characters<br>
&lt;TabAtkins> koji: Or decorations on specific characters<br>
&lt;TabAtkins> koji: This revised proposal has a .character FrozenArray with TextMetrics, a new interface we're defining.<br>
&lt;TabAtkins> koji: Each TextMetric has metrics for one grapheme cluster<br>
&lt;TabAtkins> koji: And has an index into the original string, by code unit<br>
&lt;TabAtkins> koji: Has advance, and a boolean indicating rtl vs ltr<br>
&lt;TabAtkins> koji: We've gotten some feedback on it already.<br>
&lt;TabAtkins> koji: From Myles:<br>
&lt;TabAtkins> myles_: When I read this I thought it was about caret positions, not grapheme clusters.<br>
&lt;TabAtkins> myles_: So is it one entry per grapheme cluster, or one per caret position?<br>
&lt;TabAtkins> koji: Ambiguous. Authors I talked to didn't understand the differences.<br>
&lt;TabAtkins> koji: You're right there's some subtle differences between those.<br>
&lt;TabAtkins> koji: As far as I understand the request, they often want the character, to draw decorations.<br>
&lt;TabAtkins> koji: Most webapps handle grapheme cluster as the minimum unit to apply stuff to.<br>
&lt;TabAtkins> myles_: It should be noted that there's a jquery plugin to get caret positions.<br>
&lt;TabAtkins> myles_: It inserts spans into the content and then gets client rects.<br>
&lt;TabAtkins> koji: I think in the long run, authors might want different distinctions for those two. But for initial level, start with caret position.<br>
&lt;TabAtkins> myles_: So why not specify how ligatures work?<br>
&lt;TabAtkins> myles_: You say "the UA *may* produce one cluster for a ligature"<br>
&lt;TabAtkins> koji: Good point.<br>
&lt;TabAtkins> koji: Intention was, as far as we could tell, impls do slightly different things with caret positions there.<br>
&lt;TabAtkins> koji: There was feedback from someone else preferring us to say that it should match UA behavior.<br>
&lt;TabAtkins> myles_: I think that's reasonable.<br>
&lt;TabAtkins> myles_: If intention is to draw a background on a string, and it has an ffi<br>
&lt;TabAtkins> myles_: And if they want to draw it just behind the f...<br>
&lt;TabAtkins> dbaron: It seems like every UA has some behavior for carets in the middle of a ligature.<br>
&lt;TabAtkins> dbaron: I hope there's no UAs that totally put it off on one side.<br>
&lt;TabAtkins> dbaron: But even if the behavior is different, it still seems we could expose what UAs do in that case.<br>
&lt;TabAtkins> dbaron: So you'd have a more interoperable behavior for the number of *entries* the dev would see in the array.<br>
&lt;TabAtkins> myles_: Right. We shouldn't *fully* specify because in some situations we divide ligature evenly by number of grapheme clusters, but that's not great. We do have a native API to give us correct boundaries, we're just not using it. I'd like the flexibility to get that.<br>
&lt;TabAtkins> koji: Right, like I said earlier, the return value should match what the UA does.<br>
&lt;TabAtkins> dbaron: Right. If you do "fix", I don't want some UAs to return two entries, and others to get three, just because some don't provide inter-ligature information.<br>
&lt;TabAtkins> fremy: If you have emojis that are composed of multiple chars, this API then doesn't work.<br>
&lt;TabAtkins> myles_: If your string is "e&lt;family-emoji>", the result is two entries. First is the letter "e", second is the multi-char family emoji.<br>
&lt;TabAtkins> dbaron: And same for e + combining-acute-accent. Those aren't treated like ligatures.<br>
&lt;TabAtkins> myles_: The number of entries in the result is not font-dependent, is what's important here.<br>
&lt;TabAtkins> TabAtkins: What about regional-indicators? (flags)<br>
&lt;TabAtkins> myles_: If your font doesn't support flags, you get one grapheme cluster, it just looks like a pair of characters.<br>
&lt;koji> The next feedback to discuss is "Since the clusters will be in visual order, we should determine if it’s in the direction of the base direction or if it’s always LTR. (Internally, WebKit always uses LTR, and if the base direction is RTL we do some processing to flip it around so our internal visual-order data structures are always LTR.) I’m not arguing for one or the other; just that we need to specify which way it is."<br>
&lt;TabAtkins> myles_: This should be visual order, right?<br>
&lt;TabAtkins> koji: Request is to determine advance of source string.<br>
&lt;TabAtkins> koji: So in my proposal the char is in logical order, not visual.<br>
&lt;TabAtkins> myles_: I thought you said a use-case was putting a background behind part of the string, how do you do that if it's in logical order?<br>
&lt;TabAtkins> koji: By making chars in logical order, author can determine where each character is, then the author can process it themselves.<br>
&lt;TabAtkins> myles_: Does that mean at a fragment boundary you could get a really big negative advance?<br>
&lt;TabAtkins> fremy: This was part of my feedback as well.<br>
&lt;TabAtkins> fremy: Is the advance negative in that case?<br>
&lt;TabAtkins> fremy: If you want logical order, this needs to break across bidi, or use visual order.<br>
&lt;TabAtkins> myles_: The JS i18n APIs would probably be interested in adding some APIs for this if you really want it in logical order. But I think it would be best in visual order.<br>
&lt;TabAtkins> koji: This interface has ltr vs rtl, so author can control this somewhat themselves.<br>
&lt;heycam> q+ once these current issues are finished<br>
&lt;heycam> q+ to say something once these current issues are finished<br>
&lt;TabAtkins> TabAtkins: You need to know how many chars you're formatting to do visual order, right?<br>
&lt;TabAtkins> myles_: Right, you can only reasonably call this *after* line-breaking.<br>
&lt;TabAtkins> koji: So the consensus is to use visual order, and add number of code units for each TextMetric unit.<br>
&lt;TabAtkins> myles_: You may want both "codeUnitIndex" *and* "lengthOfCluster", since it's in visual order.<br>
&lt;heycam> with https://drafts.css-houdini.org/font-metrics-api/#measure-api<br>
&lt;TabAtkins> myles_: Most important question I have is how you associate this call with a font.<br>
&lt;TabAtkins> fremy: There is the measureText function in the canvas api<br>
&lt;TabAtkins> myles_: Is this a new thing, or a replacement?<br>
&lt;TabAtkins> koji: We want to sync this with the canvas api.<br>
&lt;TabAtkins> koji: We'll port this to canvas api once we agree on it.<br>
&lt;TabAtkins> heycam: So the FontMetrics API spec has a new, separate measureText function.<br>
&lt;TabAtkins> koji: Proposal is to add .characters to both FontMetrics and Canvas API.<br>
&lt;TabAtkins> fremy: So this is a mixin that will be used in both interfaces?<br>
&lt;TabAtkins> koji: Yes.<br>
&lt;Rossen> q?<br>
&lt;TabAtkins> myles_: Next feedback - unsure if this makes sense to run on an arbitrary element, since arbitrary elems can have children?<br>
&lt;TabAtkins> heycam: That's my question, yeah - what does the index count into? What about display:none? etc<br>
&lt;Rossen> ack heycam<br>
&lt;Zakim> heycam, you wanted to say something once these current issues are finished<br>
&lt;TabAtkins> heycam: I think there are similar index issues with the string API. You have whitespace collapsing/trimming. Need precise definition of what indexes are used.<br>
&lt;TabAtkins> koji: I understand that part isn't defined here. If we applied this to element.measureText(), we have to define that.<br>
&lt;TabAtkins> koji: Currently the proposal only covers measuring a string.<br>
&lt;TabAtkins> koji: I'll work on a proposal to define the element case.<br>
&lt;krit> q+<br>
&lt;TabAtkins> heycam: In SVG we have a silly character-positioning API, and it's pretty annoying.<br>
&lt;TabAtkins> heycam: If there was a way to avoid all that and just stick to strings, that would be nice.<br>
&lt;TabAtkins> myles_: So I'd like to propose removing the measureElement function. Just keep it to strings for now.<br>
&lt;TabAtkins> myles_: There's more complications, like letter-spacing and such.<br>
&lt;TabAtkins> heycam: And text-transform - one character suddenly becomes two grapheme clusters, etc.<br>
&lt;TabAtkins> myles_: Another way to do it is not take StyleMap, but just a small set of properties you want to handle, like font-family and font-weight. That's what the canvas api does.<br>
&lt;TabAtkins> myles_: You can't specify font-variation, etc.<br>
&lt;TabAtkins> heycam: Ultimately it depends on the use-case.<br>
&lt;TabAtkins> heycam: If they want to measure stuff in the DOM, but they can't measure everything, maybe not useful.<br>
&lt;TabAtkins> eae: Majority of use-cases we've observed are for out-of-dom measurements.<br>
&lt;TabAtkins> Rossen: So many things we could resolve on, lot of feedback.<br>
&lt;TabAtkins> Rossen: I see a request to remove measureElement().<br>
&lt;TabAtkins> Rossen: We need to change order to visual.<br>
&lt;TabAtkins> Rossen: Add .lengthOfCluster<br>
&lt;TabAtkins> heycam: Define how to handle whitespace collapsing, text-transform, etc. that cause difficult mappings between characters and clusters.<br>
&lt;TabAtkins> myles_: And change how ligatures and metrics interact.<br>
&lt;TabAtkins> krit: SVGWG is also looking at this problem for the counting part. At the moment svg1.1 says we should use unicode codepoints, that's not very consistent. In our investigation we found grapheme clusters aren't well-specified.<br>
&lt;TabAtkins> krit: There might be bigger issues.<br>
&lt;TabAtkins> myles_: Unicode *tries* to specify what grapheme clusters are. If that's insufficient, we have larger problems.<br>
&lt;TabAtkins> krit: We want there to be alignment between fontmetrics and SVG glyph counting.<br>
&lt;TabAtkins> myles_: We've been talking about a number of things that need to change, but in general this is a good direction to go.<br>
&lt;TabAtkins> krit: Agree, very useful.<br>
&lt;krit> ack krit<br>
&lt;TabAtkins> Rossen: So please take feedback and reflect it into the proposal, we can discuss it over the issue in the future.<br>
&lt;TabAtkins> myles_: One more -<br>
&lt;TabAtkins> myles_: It seems totally reasonable for an author to want to use this api for things like caret positions as well as grapheme clusters.<br>
&lt;TabAtkins> myles_: I imagine this'll be extended to other segmenters in the future, so keep that in mind.<br>
&lt;TabAtkins> koji: Yeah, looking for opinions on that.<br>
&lt;TabAtkins> koji: Currently the proposal is to add an attribute, and if you want to add different segmenters, maybe make it a function?<br>
&lt;TabAtkins> koji: Or add other attributes that segment differently.<br>
&lt;TabAtkins> myles_: Will need time to think about it.<br>
&lt;iank_> ScribeNick: iank_<br>
&lt;myles_> koji: I really like how this doesn’t expose the concept of a glyph<br>
&lt;iank_> glazou: Thanks to greg for starting document.<br>
</details>
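
As a rough illustration of the direction discussed in the log (per-cluster entries in visual order, each carrying `codeUnitIndex` and `lengthOfCluster`), the sketch below shows how an author might compute where to draw a background behind a range of the source string. This is only a sketch under stated assumptions: the `ClusterMetrics` interface, the `advance` and `isRTL` field names, and the `spansForRange` helper are hypothetical and are not taken from any spec text. Only `codeUnitIndex` and `lengthOfCluster` are names mentioned in the discussion.

```typescript
// Hypothetical shape of one entry in the proposed `.characters` array.
// `codeUnitIndex` and `lengthOfCluster` are names from the discussion;
// `advance` and `isRTL` are assumed names for the advance and direction flag.
interface ClusterMetrics {
  codeUnitIndex: number;   // start of the cluster in the source string (UTF-16 code units)
  lengthOfCluster: number; // number of code units the cluster covers
  advance: number;         // horizontal advance of the cluster
  isRTL: boolean;          // true when the cluster belongs to a right-to-left run
}

// A horizontal extent measured from the left edge of the line.
interface Span {
  left: number;
  right: number;
}

// Given clusters in visual (left-to-right) order, return the horizontal spans
// that cover the source code units in [start, end). A contiguous logical range
// can map to several disjoint visual spans when the text mixes directions.
function spansForRange(
  clusters: readonly ClusterMetrics[],
  start: number,
  end: number
): Span[] {
  const spans: Span[] = [];
  let x = 0; // running pen position, accumulated in visual order
  for (const c of clusters) {
    const left = x;
    x += c.advance;
    const clusterEnd = c.codeUnitIndex + c.lengthOfCluster;
    const selected = c.codeUnitIndex < end && clusterEnd > start;
    if (!selected) continue;
    const last = spans[spans.length - 1];
    if (last !== undefined && last.right === left) {
      last.right = x; // extend the previous span when visually adjacent
    } else {
      spans.push({ left, right: x });
    }
  }
  return spans;
}

// Example data: source string of 6 code units where units 2..4 form an RTL run,
// so their clusters appear reversed in visual order. Every advance is 10.
const exampleClusters: ClusterMetrics[] = [
  { codeUnitIndex: 0, lengthOfCluster: 1, advance: 10, isRTL: false },
  { codeUnitIndex: 1, lengthOfCluster: 1, advance: 10, isRTL: false },
  { codeUnitIndex: 4, lengthOfCluster: 1, advance: 10, isRTL: true },
  { codeUnitIndex: 3, lengthOfCluster: 1, advance: 10, isRTL: true },
  { codeUnitIndex: 2, lengthOfCluster: 1, advance: 10, isRTL: true },
  { codeUnitIndex: 5, lengthOfCluster: 1, advance: 10, isRTL: false },
];
```

With this example data, `spansForRange(exampleClusters, 1, 3)` yields two separate spans, because code unit 1 sits in the LTR run and code unit 2 is drawn at the far end of the reversed RTL run. This is one reason the discussion leaned toward visual order plus an explicit cluster length: the author can walk the array once, accumulating advances, without ever seeing a negative advance at a bidi boundary.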


-- 
GitHub Notification of comment by css-meeting-bot
Please view or discuss this issue at https://github.com/w3c/css-houdini-drafts/issues/828#issuecomment-433021031 using your GitHub account

Received on Thursday, 25 October 2018 11:51:49 UTC