- From: fantasai <fantasai.lists@inkedblade.net>
- Date: Wed, 20 Apr 2011 20:31:52 -0700
- To: verdy_p@wanadoo.fr
- CC: CE Whitehead <cewcathar@hotmail.com>, public-i18n-core@w3.org, public-i18n-indic@w3.org, public-i18n-cjk@w3.org, unicode@unicode.org
On 04/20/2011 07:58 PM, Philippe Verdy wrote:
> I disagree, because it breaks the inherent nature of the script. Joins
> in Arabic are mandatory, and create "super grapheme clusters".

Joins in Arabic are mandatory, and they are also broken across lines for
hyphenation.

> When you say that « it does not consider morphemic, syllabic, or other
> boundaries », this is already wrong because it already considers the
> default grapheme cluster boundaries. Note that the default grapheme
> boundaries were designed only to be locale neutral. But here we are
> speaking about localization where the language and its script will
> matter, including in its fundamental properties. Joining types in
> Arabic are key parts of the script.

Which is why the joining behavior is preserved even though it is broken
across lines.

> But in the previous part of the specification, nothing speaks about
> them, and all what is left on the upper levels where trying to find
> language-correct boundaries will fail. After this level, there should
> still be a level related to the script itself (independently of the
> language), before trying the last-chance "emergency" breaks. This
> intermediate level can still be prioritized, just as it was in the
> previous steps.

CSS does not prohibit such steps, but I do not think it should prescribe
them in this case. That's not what this feature is for.

> And yes, even in that case you could still insert the hyphenation
> symbol to show that the word was effectively broken (it is common
> practice to insert it, even in the Latin script and even if this is
> not the preferred syllabic or morphemic break position, which can only
> be inferred by language specific rules and a lookup dictionary for
> handling many exception cases).

"word-break: break-word" does not insert hyphens. Hyphenation is a
different feature.

> The hyphenation symbol is generally very narrow, and if needed, it
> can still overflow a bit in the margin.

Note that overflowing even "a bit" still produces scrollbars.

> The choice of the hyphenation symbol is also a property of the script.
> In many East and South-East Asian scripts, there's not even any symbol
> for that, because break can occur between all grapheme clusters.

If you've got a pointer to resources indicating the correct hyphenation
symbol for various scripts or languages, I'd be interested in linking
that from the hyphenation section. :)

> Note: in Indic scripts, the danda or double-danda punctuations should
> be treated like the commas and stops in your spec and preferably not
> left alone on the next line, even if it falls within the margin (you
> showed cases for East-Asian scripts only : Han, Hiragana, Katakana,
> Hangul, Bopomofo, Yi, Mongolian...)

Are you talking about the rules for 'hanging-punctuation' or 'line-break'
or something else?

~fantasai

Received on Thursday, 21 April 2011 03:32:36 UTC