Re: [csswg-drafts] [css-text] Clarify whether soft breaks exist at boundaries of an inline element with `word-break:break-all` (#3897) from jfkthame via GitHub on 2019-05-09 (public-css-archive@w3.org from May 2019)

From: jfkthame via GitHub <sysbot+gh@w3.org>
Date: Thu, 09 May 2019 18:14:53 +0000
To: public-css-archive@w3.org
Message-ID: <issue_comment.created-491010426-1557425691-sysbot+gh@w3.org>

>     * Afaict @jfkthame's intuition is that `word-break` is applied to characters not to break opportunities, e.g. it effectively reclassifies AL as ID or vice versa, and behavior at boundaries falls out of that reclassification. This seems like a reasonable alternate way of interpreting `word-break`.

Well, it's not just "a way of interpreting"; it's literally how the text explains `break-all`: "any typographic character units resolving to the NU [...], AL [...], or SA [...] line breaking classes [UAX14] are instead treated as ID". It doesn't directly explain `keep-all` in the same terms, i.e. as revised line breaking classes, but that seems the obvious interpretation.
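To make the reclassification reading concrete, here is a minimal Python sketch. The `BASE_CLASS` table and the `effective_class` helper are hypothetical names for illustration only. The `break-all` branch follows the quoted spec text; the `keep-all` branch is just the assumed "vice versa" reading, not something the spec states in these terms.

```python
# Hypothetical mini-table of UAX #14 classes for a few sample characters.
# A real implementation would consult the full Unicode LineBreak property data.
BASE_CLASS = {
    "a": "AL",   # Latin letter
    "7": "NU",   # digit
    "ก": "SA",   # Thai letter
    "中": "ID",  # CJK ideograph
}


def effective_class(ch: str, word_break: str = "normal") -> str:
    """Return the line breaking class of one character after applying `word-break`."""
    cls = BASE_CLASS[ch]
    if word_break == "break-all" and cls in {"NU", "AL", "SA"}:
        # The reclassification the spec text spells out for break-all.
        return "ID"
    if word_break == "keep-all" and cls == "ID":
        # Assumption: the "vice versa" reading; not wording taken from the spec.
        return "AL"
    return cls
```

With this, `effective_class("a", "break-all")` yields `"ID"`, while `effective_class("a")` stays `"AL"`.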

>     * However, per-character logic is not one that transfers well to `line-break`, 

I'm not sure why not? AFAICS, the kinds of differences the spec describes between `strict`, `normal` and `loose` can similarly be handled by overriding the line breaking classes of the characters concerned. (As can `anywhere`, although that may be more easily handled by short-circuiting the usual line breaking mechanism altogether.)
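As one illustration of such an override, take the rule that breaks before Japanese small kana (UAX #14 class CJ) are forbidden under `strict` but allowed under `normal` and `loose`. UAX #14 itself describes CJ as resolving to either NS or ID, so a per-character sketch might look like the following. The `resolve_cj` helper is a hypothetical name, and it covers only this one difference, not the full set of `line-break` rules.

```python
def resolve_cj(cls: str, line_break: str) -> str:
    """Resolve the conditional class CJ according to the `line-break` value.

    Only strict/normal/loose are modelled; `auto` and `anywhere` are out of scope.
    """
    if line_break not in {"strict", "normal", "loose"}:
        raise ValueError(f"unsupported line-break value: {line_break}")
    if cls != "CJ":
        return cls  # other classes are left alone in this sketch
    # NS (nonstarter) forbids a break before the character; ID allows it.
    return "NS" if line_break == "strict" else "ID"
```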

> and is definitely not something we can apply to `white-space`, consider `<nobr>中文</nobr><nobr>中文</nobr>`

ISTM `white-space` isn't really comparable to `word-break` and `line-break`. It's about white space processing (collapse, preserve, ...) rather than about identifying soft break opportunities between characters.

>     * It's important to me that behavior be symmetric. WebKit's behavior (interpreting the boundary one way at the start, and a different way at the end, of the inline) is imho unacceptable.

Agreed, that doesn't seem like a good result from any point of view.

I guess my unease is with that spec text Florian quoted, "For soft wrap opportunities defined by the boundary between two characters, the properties on nearest common ancestor of the two characters controls breaking." In my understanding, a soft wrap opportunity isn't "defined by the boundary between two characters". It *occurs* at the boundary but is *defined* as a consequence of the line-breaking classes of the characters on each side of the boundary. Those line-breaking classes are provided primarily by the Unicode standard, potentially tailored by any browser-specific customizations (many of the Unicode LB property values are informative rather than normative), and adjusted by the `word-break` and `line-break` styling of the characters.
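A rough Python sketch of that model, to show how the boundary behavior falls out of it: each character gets its class from its own `word-break` styling, and a break opportunity is then decided purely from the pair of classes on either side. The names (`NO_BREAK_PAIRS`, `styled_class`, `break_opportunities`) are hypothetical, and the pair table is a drastic simplification of UAX #14 that only covers AL, NU and ID.

```python
# Simplified pair table: among AL, NU and ID, UAX #14 forbids a break only
# between letters and digits (rules LB23, LB25, LB28); other pairs may break.
NO_BREAK_PAIRS = {("AL", "AL"), ("AL", "NU"), ("NU", "AL"), ("NU", "NU")}


def styled_class(base_class: str, word_break: str) -> str:
    """Apply the break-all reclassification to a single character's class."""
    if word_break == "break-all" and base_class in {"NU", "AL", "SA"}:
        return "ID"
    return base_class


def break_opportunities(chars: list[tuple[str, str]]) -> list[bool]:
    """chars holds (base_class, word_break) per character, in text order.

    Returns one flag per boundary between adjacent characters.
    """
    classes = [styled_class(cls, wb) for cls, wb in chars]
    return [(a, b) not in NO_BREAK_PAIRS for a, b in zip(classes, classes[1:])]


# ab<span style="word-break: break-all">cd</span>ef
text = [
    ("AL", "normal"), ("AL", "normal"),
    ("AL", "break-all"), ("AL", "break-all"),
    ("AL", "normal"), ("AL", "normal"),
]
result = break_opportunities(text)
```

Under these assumptions `result` is `[False, True, True, True, False]`: the boundaries at the start and at the end of the inline come out the same, because each is just an AL/ID pair in one order or the other.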

-- 
GitHub Notification of comment by jfkthame
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/3897#issuecomment-491010426 using your GitHub account

Received on Thursday, 9 May 2019 18:14:54 UTC