Re: [css3-text] line break opportunities are based on *syllable* boundaries?

From: Ambrose LI <ambrose.li@gmail.com>
Date: Sat, 29 Jan 2011 15:23:43 -0500
Message-ID: <AANLkTi=HHuVZeRRScwizuGn5Wf7L-dhWEbfOMLydzpZm@mail.gmail.com>
To: John Cowan <cowan@mercury.ccil.org>
Cc: CE Whitehead <cewcathar@hotmail.com>, kojiishi@gluesoft.co.jp, addison@lab126.com, kennyluck@w3.org, www-style@w3.org, www-international@w3.org

2011/1/29 John Cowan <cowan@mercury.ccil.org>:
> CE Whitehead scripsit:
>
>> I believe it is correct to say "word" here (not "syllable"), but don't
>> know what to do about languages that do not use word delimiters,
>> and can provide no references for Korean, Japanese, or Chinese (though
>> yes a lexical resource seems best).
>
> Korean uses spaces, so it's not an issue.
>
> Chinese and Japanese don't have a problem breaking words up (as I have
> posted, the notion of "word" in Chinese is a technical linguistic one
> rather than something autonomous) and work at the character (grapheme
> cluster) level.  So line breaks are not a problem there either.

I agree in principle, but not completely. It is not entirely true that
we can break between just about any two Chinese characters, and I
have written a bit about it at http://goo.gl/aZqxG

(Sorry about the shortened link, as I'll be changing hosts soon. And
sorry about the incoherent writing that you'll find on that page.)
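
To illustrate with one well-known case (not necessarily the cases
discussed on the page linked above): a line normally should not start
with closing punctuation such as 。 or 」, nor end with an opening
bracket such as 「. Below is a minimal Python sketch. The two character
sets and the function name break_opportunities are my own hypothetical
choices for illustration; a real implementation would follow the full
Unicode line breaking rules instead of a hand-picked list.

```python
# Hypothetical, hand-picked sets for illustration only; these are an
# assumption, not the complete rule set of any standard.
NO_LINE_START = set("。，、；：？！）」』》")  # must not begin a line
NO_LINE_END = set("（「『《")                  # must not end a line


def break_opportunities(text):
    """Return each index i where a break between text[i-1] and text[i] is allowed."""
    allowed = []
    for i in range(1, len(text)):
        if text[i] in NO_LINE_START:
            continue  # the next line would start with closing punctuation
        if text[i - 1] in NO_LINE_END:
            continue  # this line would end with an opening bracket
        allowed.append(i)
    return allowed


print(break_opportunities("他說「好」。"))
```

So even in this tiny sentence only some of the gaps between
characters are usable as break points.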

> The Unicode Standard has extensive discussions of all this.




-- 
cheers,
-ambrose

www.xanga.com/little_potato | twitter.com/little_potato
Received on Saturday, 29 January 2011 20:25:18 GMT