- From: Ambrose LI <ambrose.li@gmail.com>
- Date: Sat, 29 Jan 2011 15:23:43 -0500
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: CE Whitehead <cewcathar@hotmail.com>, kojiishi@gluesoft.co.jp, addison@lab126.com, kennyluck@w3.org, www-style@w3.org, www-international@w3.org
2011/1/29 John Cowan <cowan@mercury.ccil.org>:
> CE Whitehead scripsit:
>
>> I believe it is correct to say "word" here (not "syllable"), but don't
>> know what to do about languages that do not use word delimiters,
>> and can provide no references for Korean, Japanese, or Chinese (though
>> yes a lexical resource seems best).
>
> Korean uses spaces, so it's not an issue.
>
> Chinese and Japanese don't have a problem breaking words up (as I have
> posted, the notion of "word" in Chinese is a technical linguistic one
> rather than something autonomous) and work at the character (grapheme
> cluster) level. So line breaks are not a problem there either.

I agree in principle, but not completely. It is not entirely true that
we can break between just any two Chinese characters, and I have
written a bit about it at http://goo.gl/aZqxG

(Sorry about the shortened link; I'll be changing hosts soon. And sorry
about the incoherent writing that you'll find on that page.)

> The Unicode Standard has extensive discussions of all this.

--
cheers,
-ambrose

www.xanga.com/little_potato | twitter.com/little_potato
Received on Saturday, 29 January 2011 20:25:18 UTC