[csswg-drafts] [css-text] Allow alias for language hyphenation (#5270)

sujato has just created a new issue for https://github.com/w3c/csswg-drafts:

== [css-text] Allow alias for language hyphenation ==
The CSS spec provides for hyphenation of text, leaving the choice of language up to the UA:

https://www.w3.org/TR/css-text-4/#hyphenation

Currently Firefox offers the best support, but even it supports only a fairly small subset of the world's languages.

https://developer.mozilla.org/en-US/docs/Web/CSS/hyphens

The thing is, it is sometimes better to have imperfect hyphenation than none at all. No hyphenation can result in a broken UI and unreadable text, whereas imperfect hyphenation might work fine, or at worst be merely inelegant.

I work with texts in Pali and Sanskrit, which can have very long words formed by compounding. There is no browser support for hyphenation for these, nor is there likely to be. Surely these are not the only languages affected. Here is a typical example, rendered in Firefox:

![Screenshot from 2020-06-30 09-26-16](https://user-images.githubusercontent.com/6112010/86071515-ccc36280-bac2-11ea-8338-b6cee8657cdf.png)

It is possible to hack around this by activating hyphens and setting `lang='la'`:

![Screenshot from 2020-06-30 09-25-58](https://user-images.githubusercontent.com/6112010/86071483-b6b5a200-bac2-11ea-906e-5b711bc018b8.png)

This is identical to the result that a proper Pali hyphenation would produce. Note that in traditional Indic orthography, there is no concept of a correct breakpoint; scribes merely wrote to the end of the line and continued on the next line. Thus the traditional practice would agree with the idea that sometimes *any* breakpoint is better than none.

However, it's obviously not a good idea to deliberately set a false language. Hence my proposal:

**Allow the CSS to declare a language alias for hyphenation**.

So the text language is unaffected, and the HTML does not change. But the user can declare via CSS something like:

```
hyphenate-alias-languages: pli, la;
```

Meaning: "for the purpose of hyphenation, Latin and Pali may be substituted for one another."

Such substitution would apply only if explicit support for that language is missing. So if `lang='pli'` is set on the HTML and a UA supports Pali hyphenation, that is used; if not, the UA looks for Latin support instead.
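
To make the intended fallback order concrete, here is a rough sketch in Python of how a UA might pick the hyphenation language. Everything in it is an assumption for illustration: the function name `resolve_hyphenation_language`, the idea that the alias list is tried in declared order, and the rule that the list applies only when the content language appears in it. None of this is existing CSS or browser behaviour.

```python
from typing import Optional, Sequence, Set


def resolve_hyphenation_language(
    content_lang: str,
    alias_list: Sequence[str],
    supported: Set[str],
) -> Optional[str]:
    """Pick which hyphenation dictionary to use for content_lang.

    content_lang: the language declared in the markup, e.g. "pli".
    alias_list:   the hypothetical hyphenate-alias-languages value,
                  e.g. ["pli", "la"].
    supported:    languages the UA ships hyphenation data for.
    Returns the language to hyphenate with, or None for no hyphenation.
    """
    # Native support always wins; aliases are only a fallback.
    if content_lang in supported:
        return content_lang
    # Assumption: the alias list is ignored unless it names the content language.
    if content_lang not in alias_list:
        return None
    # Assumption: candidates are tried in the order they were declared.
    for candidate in alias_list:
        if candidate in supported:
            return candidate
    return None
```

With `hyphenate-alias-languages: pli, la`, a UA that has Latin but not Pali data would hyphenate `lang='pli'` text with the Latin rules, a UA that has Pali data would use it directly, and a UA with neither would not hyphenate at all. The declared `lang` of the text is never changed.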


Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/5270 using your GitHub account

Received on Tuesday, 30 June 2020 01:34:30 UTC