Re: [csswg-drafts] [css-text] Writing System prose is currently unimplementable on ICU (#4445)

Contrary to the title of this issue, I think it's actually possible for a UA that wants to rely on ICU line-breaking to implement this part of the spec pretty well, even without ICU being enhanced to respect the script subtag. I believe something along these lines should give the desired result:

* check if the locale code from the `lang` attribute includes a script subtag
   - if not, it's fine to use the locale code as-is to control the line-breaker
   - example: `ja` or `ja-JP` has no script subtag, just use it
   - example: `ja-Latn` does include a script subtag, so we go to the next step

* call `minimizeSubtags()` on the locale code
   - if this removes the script subtag, then it was the "expected" script and it's fine to use the code as-is
   - example: `en-Latn` or `zh-Hans` will minimize to `en` or `zh` respectively, so are safe to use
   - example: `ja-Latn` will be unchanged (or return `ja-Latn-JP`? I'm not sure, but either is OK); the script subtag is still present, so we go to the next step

* there's a potential mismatch: the `lang` attribute is specifying a script that is not the "normal" writing system for the given language, and ICU line-breaking would behave incorrectly
   - generate a new locale code by passing `"und-<script>"` to `addLikelySubtags()` and use the resulting code instead of the `lang` from the page to control the line-breaker
   - example: replace `ja-Latn` with `addLikelySubtags("und-Latn")` which will be `en-Latn-US`, which will result in Latin-appropriate line breaking
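
To make the three steps concrete, here is a rough, self-contained sketch in Python. It does not call ICU: `add_likely_subtags` and `minimize_subtags` are simplified stand-ins for ICU's `addLikelySubtags` / `minimizeSubtags`, driven by a four-entry table hand-copied from the CLDR likely-subtags data, and `Tag`, `parse` and `line_break_locale` are hypothetical names made up for illustration. The parser only understands language, script and two-letter region subtags, so treat this as a way to check the logic rather than as production code.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    lang: str
    script: Optional[str] = None
    region: Optional[str] = None

    def __str__(self) -> str:
        return "-".join(part for part in self if part)


# Tiny excerpt of likely-subtags data, hard-coded for illustration only.
# A real UA would get this from ICU/CLDR instead.
LIKELY = {
    "en": Tag("en", "Latn", "US"),
    "ja": Tag("ja", "Jpan", "JP"),
    "zh": Tag("zh", "Hans", "CN"),
    "und-Latn": Tag("en", "Latn", "US"),
}


def parse(code: str) -> Tag:
    """Very simplified parser: language, optional script, optional region."""
    parts = code.split("-")
    lang, script, region = parts[0].lower(), None, None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()
        elif len(part) == 2 and part.isalpha():
            region = part.upper()
    return Tag(lang, script, region)


def add_likely_subtags(tag: Tag) -> Tag:
    """Stand-in for ICU addLikelySubtags: fill in missing script/region."""
    keys = [tag.lang]
    if tag.script:
        keys = [f"{tag.lang}-{tag.script}", tag.lang, f"und-{tag.script}"]
    for key in keys:
        match = LIKELY.get(key)
        if match:
            lang = match.lang if tag.lang == "und" else tag.lang
            return Tag(lang, tag.script or match.script, tag.region or match.region)
    return tag


def minimize_subtags(tag: Tag) -> Tag:
    """Stand-in for ICU minimizeSubtags: drop subtags that are implied."""
    full = add_likely_subtags(tag)
    trials = (
        Tag(full.lang),
        Tag(full.lang, None, full.region),
        Tag(full.lang, full.script),
    )
    for trial in trials:
        if add_likely_subtags(trial) == full:
            return trial
    return full


def line_break_locale(lang_attr: str) -> str:
    """Pick the locale code to hand to the line-breaker."""
    tag = parse(lang_attr)
    # Step 1: no script subtag, use the code as-is.
    if tag.script is None:
        return lang_attr
    # Step 2: the script is the "expected" one if minimizing removes it.
    if minimize_subtags(tag).script is None:
        return lang_attr
    # Step 3: mismatch, so substitute the likely locale for the script alone.
    return str(add_likely_subtags(Tag("und", tag.script)))


for code in ("ja", "ja-JP", "en-Latn", "zh-Hans", "ja-Latn"):
    print(code, "->", line_break_locale(code))
```

With this toy table, `ja`, `ja-JP`, `en-Latn` and `zh-Hans` come back unchanged, and only `ja-Latn` is replaced (by `en-Latn-US`), matching the examples in the list above.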


-- 
GitHub Notification of comment by jfkthame
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/4445#issuecomment-578354230 using your GitHub account

Received on Saturday, 25 January 2020 00:33:25 UTC