
RE: [css3-text] definition of 'word-break: break-all'

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Thu, 10 May 2012 10:55:32 -0400
To: "Kang-Hao (Kenny) Lu" <kennyluck@csail.mit.edu>
CC: WWW Style <www-style@w3.org>
Message-ID: <A592E245B36A8949BDB0A302B375FB4E0D3C3C832E@MAILR001.mail.lan>
> "Yeah, I" breaks as "Y·e·a·h·,·I"
> "I didn't say" breaks as "I·d·i·d·n·'·t·s·a·y"
> while IE9 breaks
> "Yeah, I" as "Y·e·a·h,·I"
> "I didn't say" as "I·d·i·d·n't·s·a·y"
> I have to say I like IE9 better for these rather common cases, provided that these are
> embedded in CJK context.

I like IE9 too.

My idea was to allow the breaks in the first place, while leaving commas and similar characters to the line-break property, so that the definition still covers the IE9 behavior (assuming the UA has a correct set of line-break rules). That carries a risk of interoperability problems, though, because the line-break rules are UA-dependent.
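To make the two behaviors quoted above concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function name mark_breaks and the two character sets are hypothetical, not taken from any UA or from the spec, and real line-break rules are far richer. It marks every break opportunity with "·", optionally letting punctuation rules suppress some of them.

```python
# Hypothetical, highly simplified punctuation rules (assumption, not UAX#14).
NO_BREAK_BEFORE = set(",.;:!?)'")   # never break right before these
NO_BREAK_AFTER = set("('")          # never break right after these


def mark_breaks(text, respect_punctuation):
    """Return text with '·' at each break opportunity under break-all.

    Spaces are always break opportunities and are shown as '·'.
    With respect_punctuation=True, the simplified rules above veto
    breaks next to punctuation; otherwise every letter pair may break.
    """
    out = []
    prev = None  # previous non-space character in the current run
    for ch in text:
        if ch == " ":
            if out and out[-1] != "·":
                out.append("·")
            prev = None
            continue
        if prev is not None:
            blocked = respect_punctuation and (
                ch in NO_BREAK_BEFORE or prev in NO_BREAK_AFTER
            )
            if not blocked:
                out.append("·")
        out.append(ch)
        prev = ch
    return "".join(out)


print(mark_breaks("Yeah, I", respect_punctuation=False))
print(mark_breaks("Yeah, I", respect_punctuation=True))
print(mark_breaks("I didn't say", respect_punctuation=True))
```

With respect_punctuation=True the output matches the IE9 results quoted above; with False it matches the other one. The interoperability risk is that each UA may ship different contents for the two sets.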

> (Speaking of examples, the 'word-break: break-all;' part of Example 4:
>   # 这·是·一·些·汉·字·...
> lacks a comma:

Fixed, thank you.

> As another test case which is pretty far from a normal CJK use case, the Thai example in
> the spec shows no difference in IE9 when 'word-break: break-all' is on.
> This further proves that IE is pretty close to that statement in Example 4
> as a Thai character is Class SA, not AL.

SA is defined as:

| Therefore complex context analysis, often involving dictionary lookup of some form,
| is required to determine non-emergency line breaks. If such analysis is not available,
| it is recommended to treat them as AL.

So SA is effectively AL if no dictionary is installed. But UAX#14 has many such reassignment rules, and it is complex enough that I'm a little nervous about following Example 4, which requires more complex analysis than Example 3. I think I need a little more time to analyze its impact.
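As a rough sketch of that fallback only (the helper name and the three-entry table are hypothetical; a real implementation would read the classes from the Unicode line-break data and apply the other reassignment rules too):

```python
# Toy class table for illustration; real data comes from the Unicode database.
RAW_CLASS = {"ก": "SA", "a": "AL", "中": "ID"}


def resolve_line_break_class(ch, has_dictionary):
    """Return the class used for breaking, applying only the SA fallback."""
    raw = RAW_CLASS[ch]
    if raw == "SA" and not has_dictionary:
        # No context analysis available: treat the character as alphabetic.
        return "AL"
    return raw


print(resolve_line_break_class("ก", has_dictionary=False))
print(resolve_line_break_class("ก", has_dictionary=True))
```

So whether a Thai character behaves like AL under break-all depends on the UA's environment, which is part of why I'm nervous here.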

> So the current definition seems to be in favor of the WebKit and Gecko's direction. I don't
> necessarily disagree with that but I hope we have more convergence here...

It depends on what goes into the line-break rules. How much we can rely on those rules, and how much we should define in break-all itself, needs more thought and discussion.

Received on Thursday, 10 May 2012 14:56:44 UTC
