Re: [csswg-drafts] [css-text] Need additional value of word-break for Korean (#4285)

### Implementation Concern

It is concerning that no line-breaking utility library implements this kind of line breaking. It's difficult to believe that CSS is the first place where software engineers have ever wanted this behavior. I'd like to discuss this with the ICU maintainers to get their thoughts.

### Proposal

Adding a new Hangul-specific keyword seems like the wrong design to solve this problem because the values don't stack. It's unlikely that Korean is the only language with two `normal`-style behaviors. Instead, if we wanted to add script-specific behaviors for languages with two `normal`-style behaviors, we probably would want to do it with language-specific customizations.

Perhaps something like:

```css
word-break: normal customization(Kore, keep-all)
```

which would mean "`Kore` content uses `keep-all` but everything else uses `normal`." The first argument would be an ISO 15924 script code, _not_ a `lang` tag, because this information has to be determined from the raw characters, rather than from an out-of-band annotation like `lang`.

This way, the customizations are stackable: for languages with multiple `normal`-style behaviors, an author can say "I want normal2 for Korean and normal4 for this other language" when we get around to adding support for customizing that other language.
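To make the stacking concrete, here is a minimal sketch of how such a value might be resolved. The grammar it parses, the function names, the `Xxxx` script code, and the `normal4` keyword are all placeholders for illustration; none of them exist in CSS or in any line-breaking library.

```python
import re

# Hypothetical grammar: a base keyword followed by zero or more
# customization(<script code>, <mode>) entries. Not real CSS.
_CUSTOMIZATION = re.compile(
    r"customization\(\s*([A-Za-z]{4})\s*,\s*([a-z0-9-]+)\s*\)"
)


def parse_word_break(value):
    """Split a value into (base mode, {script code: mode})."""
    base = value.split()[0]
    overrides = {script: mode for script, mode in _CUSTOMIZATION.findall(value)}
    return base, overrides


def mode_for_script(value, script):
    """Return the mode that applies to text in the given script.

    Scripts without a customization entry fall back to the base mode.
    """
    base, overrides = parse_word_break(value)
    return overrides.get(script, base)


# "Xxxx" and "normal4" are placeholders for some future script and keyword.
stacked = "normal customization(Kore, keep-all) customization(Xxxx, normal4)"
```

With this sketch, `mode_for_script(stacked, "Kore")` gives `keep-all`, while a script with no entry, such as `Latn`, falls back to `normal`. Each entry is independent, so adding one for another script does not disturb the Korean one.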

### Limiting Expressiveness

The intent of this proposal is only to select which of the `normal`-style values should be applied for scripts that have multiple `normal`-style values. It isn't to select arbitrary line breaking behavior for arbitrary scripts. Therefore, it's important to limit the expressiveness of this proposal to just the cases that actually make sense. To do so, either browsers or the spec could list the set of scripts that are accepted here, and this set would initially contain just a single item. If we limit the expressiveness in this way, ICU and other line breaking utilities have the flexibility to implement this feature in any way, and browsers don't need lots of custom line breaking code.
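One way to picture that restriction is as an allowlist check at parse time. This is only a sketch: the table name, the function name, and the assumption that `Kore` would accept exactly `normal` and `keep-all` are mine, not something the spec or any browser defines.

```python
# Assumption: the accepted set starts with a single script, and each script
# lists only its own normal-style modes.
ACCEPTED_CUSTOMIZATIONS = {
    "Kore": frozenset({"normal", "keep-all"}),
}


def is_accepted(script, mode):
    """Return True only for (script, mode) pairs on the allowlist."""
    return mode in ACCEPTED_CUSTOMIZATIONS.get(script, frozenset())
```

Under this sketch, `customization(Kore, keep-all)` would be accepted, while `customization(Latn, keep-all)` or `customization(Kore, break-all)` would be rejected as invalid, which keeps the feature from turning into arbitrary per-script line breaking.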

### Alternative Considered

This proposal could also use Unicode blocks instead of Unicode scripts, though that would require something like `customization(HANGUL_JAMO, ...) customization(HANGUL_COMPATIBILITY_JAMO, ...) customization(HANGUL_SYLLABLES, ...) customization(HANGUL_JAMO_EXTENDED_A, ...) customization(HANGUL_JAMO_EXTENDED_B, ...)` in order to get all of Korean.
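For a sense of what the block-based variant involves, the sketch below classifies a character into one of those five blocks from its code point alone. The code point ranges are the Unicode block ranges; the table and function names are made up for illustration, and the block identifiers simply mirror the spelling used above.

```python
# Unicode block ranges (inclusive) for the five Hangul blocks listed above.
HANGUL_BLOCKS = {
    "HANGUL_JAMO": (0x1100, 0x11FF),
    "HANGUL_COMPATIBILITY_JAMO": (0x3130, 0x318F),
    "HANGUL_JAMO_EXTENDED_A": (0xA960, 0xA97F),
    "HANGUL_SYLLABLES": (0xAC00, 0xD7AF),
    "HANGUL_JAMO_EXTENDED_B": (0xD7B0, 0xD7FF),
}


def hangul_block(ch):
    """Return the Hangul block name for a single character, or None."""
    cp = ord(ch)
    for name, (low, high) in HANGUL_BLOCKS.items():
        if low <= cp <= high:
            return name
    return None
```

An author would need one entry per block to get the same coverage that a single script-based entry gives, which is the verbosity this alternative trades for.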

-- 
GitHub Notification of comment by litherum
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/4285#issuecomment-542809137 using your GitHub account

Received on Wednesday, 16 October 2019 17:31:32 UTC