- From: Koji Ishii <kojiishi@gmail.com>
- Date: Thu, 15 Oct 2015 01:33:28 +0900
- To: "www-style@w3.org" <www-style@w3.org>
- Cc: CJK discussion <public-i18n-cjk@w3.org>
Several months ago, Blink changed its implementation of "word-break: break-all"[1] to match what the spec defines:

  may break between any two typographic letter units

This value is, as written in the spec, designed to be easy to implement without sacrificing CJK line breaking rules, since we believed its primary use is in CJK.

However, since our change, I have heard from authors of Latin and other non-CJK content, such as Arabic, that it does not work as expected, and that Blink is the only browser that is broken. The examples I've got expect a break to be allowed anywhere in "AT&T" or "*****", and Trident/Gecko/WebKit all break these strings.

So I'd like to propose changing the spec so that it can serve both CJK and non-CJK usages, and is more interoperable with existing implementations.

I checked the behavior for ASCII code points here[2]; in short:

Trident/Edge:
  Breaks almost anywhere, except before closing parenthesis, period, etc. "&" and "*" in the examples above can break before and after.
Gecko/WebKit:
  Breaks anywhere.

Since what Gecko/WebKit does is quite unfortunate for CJK, I'm thinking of making the spec similar to what Trident/Edge does. As far as I can see from the ASCII code range, the rules are:

* Do not break before !"'),./:;?]}
* Do not break after "$'(-[\{

Translating them to UAX#14 Line Breaking classes, the rules would be:

* Do not break before EX, QU, CP, IS, SY
* Do not break after QU, PR, OP, HY

I think I'll need to check side effects and Trident/Edge behavior in a little more detail, but would appreciate opinions/feedback if any.

[1] https://drafts.csswg.org/css-text-3/#valdef-word-break-break-all
[2] http://kojiishi.github.io/playgrounds/line-break-matrix/?word-break=break-all

/koji
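
For illustration, the ASCII-only rules listed in the message can be sketched as a small Python function. This is only a rough model of the proposal, not what any browser implements: the function name `break_opportunities` and the two set names are made up for this example, and it looks only at the two characters on either side of each position, ignoring every other line-breaking rule (spaces, UAX#14 pair rules, non-ASCII text).

```python
# Hypothetical sketch of the proposed ASCII rules for "word-break: break-all".
# The names below are illustrative and do not come from any spec or engine.

# Characters before which a break is NOT allowed: !"'),./:;?]}
NO_BREAK_BEFORE = set("!\"'),./:;?]}")

# Characters after which a break is NOT allowed: "$'(-[\{
NO_BREAK_AFTER = set("\"$'(-[\\{")


def break_opportunities(text):
    """Return the indices i where a break is allowed between text[i-1] and text[i]."""
    return [
        i
        for i in range(1, len(text))
        if text[i - 1] not in NO_BREAK_AFTER and text[i] not in NO_BREAK_BEFORE
    ]


if __name__ == "__main__":
    for sample in ["AT&T", "*****", "(a)", "a.b"]:
        print(repr(sample), break_opportunities(sample))
```

Under these assumptions, "AT&T" and "*****" can break between any two characters, as the non-CJK authors expect, while "(a)" offers no break at all, because the opening parenthesis forbids a break after it and the closing parenthesis forbids one before it.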
Received on Wednesday, 14 October 2015 16:34:17 UTC