Re: [csswg-drafts] [css-text] Line breaking should fallback to UAX14 in absence of other rules (#5068)

This wasn't present when the mail thread you referred to was open, but now https://drafts.csswg.org/css-text-3/#text-encoding says:

> CSS is built on [UNICODE]. UAs that support Unicode must adhere to all normative requirements of the Unicode Core Standard, except where explicitly overridden by CSS.

This means that UAX14 is to be followed as much as any other part of Unicode. Most parts of UAX14 are tailorable, which is effectively the same as a "SHOULD" requirement. I don't think we should go any further than that, for the reasons argued by fantasai in the mail thread (note that the spec already goes further than what she was willing to accept back then).

In other words, at a normative level, the spec already does what you are requesting.

This should be sufficient grounds to write tests for wpt (with the "should" flag) and to file bugs against browsers when they fail those tests, although we should expect that some of these bugs may be closed as WONTFIX when browsers have a good reason to deviate from UAX14.

That said, when doing this, I would recommend focusing on situations which are known to be problematic, rather than writing exhaustive checklists for all code points. This recognizes Koji's point in the thread about web compat, and fantasai's point about UAX14 being only a baseline that in many cases ought to be tailored. Deviating from UAX14 can be justified for web-compat reasons or because of desirable tailorings, but since browser code was historically not based on UAX14, not all differences are documented, and finding out whether a difference is accidental or intentional can be expensive. Bearing that cost in cases known to be problematic is justified, but starting from the assumption that all divergences are bad is likely to result in a lot more work than is actually desirable.

On an editorial level, maybe we can make things a little more obvious. How about rephrasing the note at the end of https://drafts.csswg.org/css-text-3/#line-breaking from:

> Further information on line breaking conventions can be found in [JLREQ] and [JIS4051] for Japanese, [CLREQ] and [ZHMARK] for Chinese, and in [UAX14] for all scripts in Unicode. See also the Internationalization Working Group’s Typography Index [TYPOGRAPHY] which includes more information on additional languages.

to

> [UAX14] defines a baseline behavior for line breaking for all scripts in Unicode, which is expected to be further tailored. Further information on line breaking conventions can be found in [JLREQ] and [JIS4051] for Japanese, [CLREQ] and [ZHMARK] for Chinese. See also the Internationalization Working Group’s Typography Index [TYPOGRAPHY] which includes more information on additional languages.


-- 
GitHub Notification of comment by frivoal
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/5068#issuecomment-633466971 using your GitHub account

Received on Monday, 25 May 2020 09:07:04 UTC