
RE: Names for line-wrapping rules in CJK

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Tue, 1 Feb 2011 06:18:45 -0500
To: "shen@cse.ust.hk" <shen@cse.ust.hk>, "Kang-Hao (Kenny) Lu" <kennyluck@w3.org>
CC: Richard Ishida <ishida@w3.org>, CJK discussion <public-i18n-cjk@w3.org>
Message-ID: <A592E245B36A8949BDB0A302B375FB4E0AAF00A1DD@MAILR001.mail.lan>
This is off-topic from what Richard asked, but to answer your question: Word relies on two simple tables:
* Characters that cannot appear at the start of a line
* Characters that cannot appear at the end of a line

Five sets of tables are built in: two (normal/strict) for Japanese, and one each for Korean, Simplified Chinese, and Traditional Chinese.
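
To make the table-driven approach concrete, here is a minimal sketch in Python. It is not Word's implementation, and the two character sets are small illustrative samples I picked, not the actual built-in tables. The names NO_LINE_START, NO_LINE_END, can_break_between and wrap are made up for this example.

```python
# Illustrative samples only; these are NOT Word's actual built-in tables.
NO_LINE_START = set("、。，．）」』】！？")  # e.g. closing punctuation
NO_LINE_END = set("（「『【")                # e.g. opening punctuation


def can_break_between(prev_char: str, next_char: str) -> bool:
    """True if a line may end with prev_char and the next line start with next_char."""
    return prev_char not in NO_LINE_END and next_char not in NO_LINE_START


def wrap(text: str, width: int) -> list[str]:
    """Greedy wrap to at most `width` characters per line.

    A break is moved to the left until both tables allow it. If no
    allowed position exists in the line, fall back to a hard cut.
    """
    lines = []
    start = 0
    while len(text) - start > width:
        cut = start + width
        # Back up while the candidate break violates either table.
        while cut > start and not can_break_between(text[cut - 1], text[cut]):
            cut -= 1
        if cut == start:
            cut = start + width  # no allowed break found; hard cut
        lines.append(text[start:cut])
        start = cut
    lines.append(text[start:])
    return lines


if __name__ == "__main__":
    for line in wrap("他说「你好」。我们走吧", 3):
        print(line)
```

With a width of 3, the sketch never leaves 「 at the end of a line and never starts a line with 」 or 。. It knows nothing about term boundaries, so it can still split a multi-character word.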

That leads me to a hint for Richard's original question. This is the help page on line-wrapping rules for the Japanese version of Office:

Click the second [+]; in the second bullet of item 2, I see [行頭禁則文字] (characters prohibited at line start) and [行末禁則文字] (characters prohibited at line end).

For Simplified Chinese, replace "ja-jp" in the URL with "zh-cn":

I guess "后置标点" (literally "post-positioned punctuation") and "前置标点" (literally "pre-positioned punctuation") are the ones Richard is looking for.

For Traditional Chinese, use "zh-tw".

I guess [不能置於行首的字元] (characters that cannot be placed at line start) and [不能置於行尾的字元] (characters that cannot be placed at line end) are the ones.

-----Original Message-----
From: public-i18n-cjk-request@w3.org [mailto:public-i18n-cjk-request@w3.org] On Behalf Of Vincent Shen
Sent: Tuesday, February 01, 2011 4:32 PM
To: Kang-Hao (Kenny) Lu
Cc: Richard Ishida; CJK discussion
Subject: Re: Names for line-wrapping rules in CJK

Can we access the rules that Microsoft Word uses for line breaks?
They seem satisfactory for most Chinese documents. They do not break right before a punctuation mark.

The Chinese term (word) normally has two or three characters, sometimes even four. It looks odd if a term is broken. But unless some parsing is done, one cannot tell the term boundaries.


> (11/02/01 2:45), Richard Ishida wrote:
>> Kinsoku shori is used to refer to line-break rules in Japanese text.
>> I believe the Korean equivalent is geumchik rules.
>> I never did know how to refer to these rules in Chinese.
> Me neither. I don't think there is a formal name defined for this, 
> partly because we don't have a document as detailed as JIS X 4051. 
> Both the literal translation of "rules for line-break" 
> (duan4han2guey1tse2) and "principles for line-break" 
> (duan4han2wuan2tse2) work for me. I can check with the Chinese speaking community (public-html-ig-zh) about this.
> However, there is a jargon, which only the publishers know, for the 
> particular part about forbidding line breaks before punctuations (part 
> of 'line-break: loose'[1]). It's called "bi4tou2dien3" (dots avoiding 
> line starts).
> [1] http://dev.w3.org/csswg/css3-text/#line-break

> Cheers,
> Kenny

Received on Tuesday, 1 February 2011 11:17:52 UTC
