Re: [csswg-drafts] [css-inline-3] top metrics for non-Western non-CJK writing systems with obvious top edge (#5244) from Mike Bremford via GitHub on 2020-07-28 (public-css-archive@w3.org from July 2020)

From: Mike Bremford via GitHub <sysbot+gh@w3.org>
Date: Tue, 28 Jul 2020 16:03:05 +0000
To: public-css-archive@w3.org
Message-ID: <issue_comment.created-665127983-1595952184-sysbot+gh@w3.org>

I don't think we can realistically expect the baseline table to provide this information, now or in the immediate future. Even if baselines for the world's scripts were added in the next OpenType revision, every font would need updating before they could be used. I also don't think we should be itemising every one of these baselines as a list of idents that can be set in `initial-letter-align`.

Pulling the metrics from the ink bounds of a representative glyph for the script seems like the best option - this is already proposed for Hebrew. I did some testing - first, here are the results of using "cap-height" and the alphabetic baseline for the 16 Hebrew fonts at fonts.google.com, plus Noto Sans and Noto Sans Serif. This is what we'll get if we get rid of the "hebrew" keyword and fall back to the default:

![image](https://user-images.githubusercontent.com/989243/88670871-866f1c80-d0dd-11ea-8817-11f78a5fd2bb.png)

Awful. Lots of glyphs have big gaps at the top, which (in our implementation) currently causes the first line to run flush to the margin. Next, the top alignment point is taken from the horizontal center of the ink-outline of U+05BE (Maqaf) as suggested in the spec:

![image](https://user-images.githubusercontent.com/989243/88685331-bde5c500-d0ed-11ea-813e-fd9eb40d1a7b.png)

Better, but not so great. But using the ink-top in the horizontal center of U+05D4 works really well:

![image](https://user-images.githubusercontent.com/989243/88685641-0ef5b900-d0ee-11ea-9cfc-f33b5d0ffe68.png)

So I think in general the idea of pulling alignment points from glyph outlines is a good one. We're already making use of glyph outlines in `initial-letter` anyway. So long as each script uses the same mechanism (i.e. choosing the point in the horizontal (or vertical) center at the appropriate edge of the glyph outline), then adding new scripts is no more than determining which glyph is representative. After the initial implementation, that should be fairly low cost both for developers, and for anyone wanting to propose a new script.
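
As a rough illustration of the mechanism described above (not our actual implementation), here is a minimal Python sketch of "ink-top at the horizontal center". It assumes the glyph outline has already been flattened into closed polygons in font units with y pointing up; the function name `ink_top_at_center` and the contour representation are hypothetical, chosen only for this example.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Contour = List[Point]  # closed polygon; the last point joins back to the first


def ink_top_at_center(contours: List[Contour]) -> Optional[float]:
    """Return the highest y at which the vertical line through the
    horizontal center of the ink bounding box crosses the outline.

    Assumption: curves were flattened to straight segments beforehand.
    """
    xs = [x for contour in contours for x, _ in contour]
    if not xs:
        return None  # empty glyph (e.g. a space): no ink to measure
    cx = (min(xs) + max(xs)) / 2.0

    top: Optional[float] = None
    for contour in contours:
        n = len(contour)
        for i in range(n):
            (x0, y0), (x1, y1) = contour[i], contour[(i + 1) % n]
            if min(x0, x1) <= cx <= max(x0, x1):
                if x0 == x1:
                    # Vertical edge lying exactly on the center line.
                    y = max(y0, y1)
                else:
                    # Linear interpolation along the edge to x == cx.
                    t = (cx - x0) / (x1 - x0)
                    y = y0 + t * (y1 - y0)
                if top is None or y > top:
                    top = y
    return top
```

Note that this is deliberately not the same as the top of the ink bounding box: for a glyph with a dip in the middle, it returns the height of the dip rather than the height of the tallest stroke. The bottom edge would be the same loop taking the minimum instead of the maximum.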

To that end I'd suggest we consider something like `initial-letter-align: auto`, which would determine the Unicode script from the first non-common character of the text following the initial letter, exactly as we're doing now for the script _inside_ the initial-letter. We can then detail the alignment points for each script, e.g.

* Latn: top=cap-height, bottom=alphabetic baseline.
* Hebr: top=U+05D4, bottom=alphabetic baseline.
* Deva: top=BASE.hang or U+0915 if not defined, bottom=alphabetic baseline.
* Beng: top=BASE.hang or U+0995 if not defined, bottom=alphabetic baseline.
* Hans, Hant: top=BASE.icft or U+6C38 if not defined, bottom=BASE.icfb or U+6C38 if not defined.
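
The lookup above could be sketched roughly as follows. This is only an illustration under stated assumptions: Python's standard library has no Unicode Script property, so `SCRIPT_RANGES` is a hand-written, incomplete range table covering just the scripts listed, and "common" characters are approximated as anything that is not a letter. The names `Edge`, `ALIGNMENT`, `first_script` and `alignment_for` are hypothetical. Hans and Hant cannot be told apart from a single character, so both are folded into one `Hani` entry here.

```python
import unicodedata
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Edge:
    metric: Optional[str]          # font metric or BASE tag to try first
    fallback_glyph: Optional[int]  # code point whose ink edge is used otherwise


# (top, bottom) per script, transcribed from the list above.
ALIGNMENT: Dict[str, Tuple[Edge, Edge]] = {
    "Latn": (Edge("cap-height", None), Edge("alphabetic", None)),
    "Hebr": (Edge(None, 0x05D4), Edge("alphabetic", None)),
    "Deva": (Edge("hang", 0x0915), Edge("alphabetic", None)),
    "Beng": (Edge("hang", 0x0995), Edge("alphabetic", None)),
    "Hani": (Edge("icft", 0x6C38), Edge("icfb", 0x6C38)),
}

# Assumption: a tiny, incomplete stand-in for the Unicode Script property.
SCRIPT_RANGES: List[Tuple[int, int, str]] = [
    (0x0041, 0x005A, "Latn"), (0x0061, 0x007A, "Latn"), (0x00C0, 0x024F, "Latn"),
    (0x0590, 0x05FF, "Hebr"),
    (0x0900, 0x097F, "Deva"),
    (0x0980, 0x09FF, "Beng"),
    (0x3400, 0x4DBF, "Hani"), (0x4E00, 0x9FFF, "Hani"),
]


def first_script(text: str) -> Optional[str]:
    """Script of the first letter in text, skipping digits, spaces and
    punctuation as a rough approximation of Script=Common."""
    for ch in text:
        if not unicodedata.category(ch).startswith("L"):
            continue
        cp = ord(ch)
        for lo, hi, script in SCRIPT_RANGES:
            if lo <= cp <= hi:
                return script
        return "Zzzz"  # a letter from a script this sketch does not know
    return None


def alignment_for(text: str) -> Optional[Tuple[Edge, Edge]]:
    """Alignment points for the text following the initial letter, or
    None when no rule applies and the default should be used."""
    script = first_script(text)
    return ALIGNMENT.get(script) if script else None
```

Adding a script is then a matter of adding one table row, which is the low cost argued for above. A real implementation would use the Unicode Script property from its text stack rather than a range table.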

If further control over which baseline to select is required (and, the more I think about this, the more I doubt it is) then perhaps something like `initial-letter-align: [alphabetic | hanging | ideographic] || [<string> <string>?]` - to let you select a baseline pair as we do now, and/or specify a glyph (or top and bottom glyphs) directly in case the baseline isn't available.
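
To make that grammar concrete, here is a small Python sketch of a parser for it. It is an assumption-laden illustration, not a proposal for how a CSS engine should tokenize: strings are matched with a simple regex without escape handling, and `parse_initial_letter_align` and `InitialLetterAlign` are hypothetical names. Following the usual meaning of `||`, at least one of the two groups must be present and they may appear in either order.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

BASELINES = {"alphabetic", "hanging", "ideographic"}
# One token: a double-quoted string, a single-quoted string, or an identifier.
TOKEN = re.compile(r'\s*(?:"([^"]*)"|\'([^\']*)\'|([A-Za-z-]+))')


class InitialLetterAlign(NamedTuple):
    baseline: Optional[str]
    glyphs: Tuple[str, ...]  # (), (top_and_bottom,) or (top, bottom)


def parse_initial_letter_align(value: str) -> InitialLetterAlign:
    value = value.strip()
    tokens: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(value):
        m = TOKEN.match(value, pos)
        if not m:
            raise ValueError(f"unexpected input at offset {pos}")
        dq, sq, ident = m.groups()
        if ident is not None:
            tokens.append(("ident", ident))
        else:
            tokens.append(("string", dq if dq is not None else sq))
        pos = m.end()

    kinds = [kind for kind, _ in tokens]
    idents = [text for kind, text in tokens if kind == "ident"]
    strings = [text for kind, text in tokens if kind == "string"]

    if not tokens or len(idents) > 1 or len(strings) > 2:
        raise ValueError("expected a baseline keyword and/or one or two strings")
    if idents and idents[0] not in BASELINES:
        raise ValueError(f"unknown baseline keyword: {idents[0]}")
    if kinds == ["string", "ident", "string"]:
        # The two strings form one group, so the keyword cannot split them.
        raise ValueError("the glyph strings must be adjacent")

    return InitialLetterAlign(idents[0] if idents else None, tuple(strings))
```

For example, `hanging "क"` would select the hanging baseline with U+0915 as the fallback glyph, while `"ה"` alone would specify only the glyph.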

--
GitHub Notification of comment by faceless2
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/5244#issuecomment-665127983 using your GitHub account

Received on Tuesday, 28 July 2020 16:04:18 UTC