
RE: Revising text wrapping, line breaking, and white space properties in CSS3 (CSS3 Text: 6 and 7)

From: Richard Ishida <ishida@w3.org>
Date: Tue, 5 Oct 2004 09:45:41 +0100
To: <www-style@w3.org>, <w3c-css-wg@w3.org>, <w3c-i18n-wg@w3.org>
Message-Id: <20041005084540.98F574F050@homer.w3.org>

Hello Fantasai,

See my first set of notes below.  These have not yet been reviewed by the i18n group.


============
Richard Ishida
W3C

contact info:
http://www.w3.org/People/Ishida/ 

W3C Internationalization:
http://www.w3.org/International/ 

Publication blog:
http://people.w3.org/rishida/blog/
 
 

> -----Original Message-----
> From: w3c-i18n-wg-request@w3.org 
> [mailto:w3c-i18n-wg-request@w3.org] On Behalf Of fantasai
> Sent: 04 October 2004 17:05
> To: w3c-css-wg@w3.org; www-style@w3.org; w3c-i18n-wg@w3.org
> Subject: Revising text wrapping, line breaking, and white space properties in CSS3 (CSS3 Text: 6 and 7)
> 
> 
<snip/>
> Part I: Breaking Lines
> ======================
> 
> CSS3 Text defines the following properties to affect line breaking behavior:
> 
> Property                  Origin     Controls
> --------                  ------     --------
> line-break                WinIE      Japanese line break rules: strict vs. loose
> word-break-cjk                       Allowing breaks within CJK and/or non-CJK
> word-break-inside                    Hyphenation
> word-break (shorthand)    WinIE      (WinIE's name for word-break-cjk)
> wrap-option               XSL:FO     Text wrapping
> 
> Line Breaking Rules
> -------------------
> 
> CSS3 Text:
>    line-break: normal | strict
>    word-break-cjk: keep-all | normal | break-all
> 
> XSL:
>    n/a
> 
> WinIE:
>    line-break: normal | strict
>    word-break: keep-all | normal | break-all
> 
> Proposed:
>    word-break: keep-all | strict | normal | break-all
> 
> Justification:
>    Practically-speaking, there's only one scale of strictness.
>                    strictest <----------------------> loosest
>    line-break     | irrelevant | strict | normal | normal    |
>    word-break-cjk | keep-all   | normal | normal | break-all |
> 
>    * normal vs. strict line-breaking is irrelevant when keep-all takes effect.
>    * The combination of strict and break-all makes little sense. (Why
>      would you allow breaks in scripts like Latin, where breaking words
>      in random places is wrong, but disallow breaks before small kana,
>      where breaking is merely discouraged?)

I disagree with your recommendations here.  I don't see there being one scale of strictness at all, and think that your approach is too biased towards the Western view.  

You say that breaking Latin words in random places is 'wrong', but an Asian person could say exactly the same thing about not breaking CJK when using the keep-all value. 

The word-break-cjk alternatives are not, to my mind, points on a scale of breakability. They relate to the application of character vs. word wrapping paradigms to runs of CJK vs. non-CJK text in circumstances where there is a small amount of one embedded in the other. Most of the time you will just use 'normal', but occasionally you may want to extend the line breaking approach of the surrounding text into the embedded text.


                             CJK run breaks    CJK run doesn't break

non-CJK run doesn't break    normal            keep-all

non-CJK run breaks           break-all         -
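
To make the table concrete, here is a minimal sketch in Python. The function name and its two boolean parameters are hypothetical, made up for illustration; nothing like them appears in the CSS3 Text draft. "Breaks" here means the run may be broken on a character basis.

```python
from typing import Optional


def word_break_cjk_value(cjk_run_breaks: bool, non_cjk_run_breaks: bool) -> Optional[str]:
    """Map the two yes/no choices in the table to a word-break-cjk value.

    Hypothetical helper for illustration only. Returns None for the
    combination that has no corresponding value in the table.
    """
    if cjk_run_breaks and not non_cjk_run_breaks:
        return "normal"      # CJK wraps by character, non-CJK by word
    if not cjk_run_breaks and not non_cjk_run_breaks:
        return "keep-all"    # word wrapping extended into the CJK run
    if cjk_run_breaks and non_cjk_run_breaks:
        return "break-all"   # character wrapping extended into the non-CJK run
    return None              # the "-" cell: no value covers this case
```

The point of writing it this way is that the three values fall out of two independent choices rather than one ordered scale.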


The effects of the line-break property are a different issue, not part of the same scale of breakability: where CJK text breaks on a character basis, do we apply the kinsoku/geumchik/other rules, and if so, to what extent?  If you look on break-all as meaning "Break embedded non-CJK text by character", you may indeed still want to distinguish between applying and not applying kinsoku-type rules to the Asian text. It's just that such rules don't apply to the non-CJK text, because they are based on specific Asian characters.
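
A rough sketch of how the two properties could act as separate steps, again in Python. Everything here is an assumption for illustration: the function names, the crude code-point test for CJK, and the two tiny character sets, which are samples and not the real kinsoku tables. It also treats any pair involving a CJK character as CJK context, which is a simplification.

```python
# Illustrative samples only, NOT the real kinsoku tables.
NO_LINE_START = set("、。」』")                      # assumed restricted in both modes
STRICT_ONLY_NO_LINE_START = set("ぁぃぅぇぉっゃゅょ")  # small kana: restricted only in 'strict'


def is_cjk(ch: str) -> bool:
    """Crude test: CJK punctuation, kana, and CJK Unified Ideographs."""
    cp = ord(ch)
    return 0x3000 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF


def can_break_between(prev: str, nxt: str,
                      word_break: str = "normal",
                      line_break: str = "normal") -> bool:
    """May a line break fall between the adjacent characters prev and nxt?"""
    if prev == " ":
        return True   # ordinary word wrapping after a space
    if nxt == " ":
        return False  # simplification: the break goes after the space

    # Step 1 (word-break-cjk): is character-based breaking in effect here?
    if is_cjk(prev) or is_cjk(nxt):
        char_breaking = word_break != "keep-all"
    else:
        char_breaking = word_break == "break-all"
    if not char_breaking:
        return False

    # Step 2 (line-break): kinsoku-type restrictions. They are keyed to
    # specific Asian characters, so they never affect non-CJK text.
    if nxt in NO_LINE_START:
        return False
    if line_break == "strict" and nxt in STRICT_ONLY_NO_LINE_START:
        return False
    return True
```

With word_break="break-all" and line_break="strict", this allows a break between "a" and "b" but not between "き" and "ゃ", which is the combination I am arguing can still make sense.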


I *do* have the following issues with the current CSS text:

[1] I think the names of the properties could be improved.  I think cjk-line-break would be a more informative name than line-break, since this relates, I believe, solely to CJK text. I think that word-break-cjk would be better called simply word-break.

[2] re line-break: Kinsoku rules cover much more than just breaks before small kana.  I don't think this is clearly described in the text.  Nor is it clear that the property applies to Chinese and Korean text as well as Japanese, or that the kana question is not relevant to Chinese or Korean text.

[3] I think there ought to be a third alternative for line-break: none.  This would turn off line breaking restrictions.  There are occasions where I might want to do that.

[4] I think it would be easier to understand the properties on an initial read if word-break-cjk were introduced before line-break, since that provides a more top-down approach. I also think that line-break is to CJK what word-break-inside is to some non-CJK scripts, so it makes sense to explain those alongside each other.



> 
> Compatibility:
>    wrt XSL -
>      n/a
>    wrt WinIE -
>      The proposal uses the WinIE property name and values. It also adds
>      a new value, 'strict', which will be ignored in WinIE. A style sheet
>      can cause IE to recognize the same effect by also specifying
>      "line-break: strict".
> 
> Hyphenation
> -----------
> 
> CSS3 Text:
>    word-break-inside: none | hyphenate
> 
> XSL:
>    a slew of separate hyphenation controls, see spec
> 
> WinIE:
>    n/a
> 
> Proposed:
>    hyphenation: none | auto
> 
> Justification:
> 
>    I agree with CSS3 Text in that a simple switch is enough for CSS3.
>    However, I think the name 'word-break-inside' is really obscure,
>    and there's no need to create a shorthand for combining it with
>    word-break. So let's just call it 'hyphenation'. This also presents
>    a nicer framework for adding hyphenation limits later on.
> 


I agree with this.


Hope that helps.

RI
Received on Tuesday, 5 October 2004 08:45:42 GMT
