Re: [css3-text] definition of 'word-break: break-all'

(12/05/10 18:05), Koji Ishii wrote:
> This brings me up a question, as you wrote below, maybe we should allow more break opportunities than the two options above. U+002B PLUS SIGN for instance has Sm category and PR line break, so neither option gives break opportunity after, but IE9 breaks before and after.

Well, I didn't say the statement in Example 4 perfectly matches what IE
does. I would be satisfied already if you link the word "letter" to 1.3
Terminology and then we can file bugs on WebKit and Gecko. The current
sentence as it is could be misinterpreted because "letter"
sometimes/often means "character" or "grapheme cluster".

> Allowing breaking all grapheme cluster boundaries except where prohibited by the line-break property seems to better fit for CJK use cases, and matches better to the current implementations. I'll discuss with fantasai to see if there were any drawbacks.

Which implementations do you mean here? WebKit and Gecko? Breaking at all
grapheme cluster boundaries means that

"Yeah, I" breaks as "Y·e·a·h·,·I"
"I didn't say" breaks as "I·d·i·d·n·'·t·s·a·y"

while IE9 breaks

"Yeah, I" as "Y·e·a·h,·I"
"I didn't say" as "I·d·i·d·n't·s·a·y"

I have to say I like IE9's behavior better for these rather common
cases, provided that they are embedded in a CJK context.
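
To make the difference between the two behaviors concrete, here is a
rough Python sketch. It is only an illustration under simplifying
assumptions: it walks code points rather than real grapheme clusters, it
ignores the line-break property entirely, and the "letters only" rule is
an approximation of what IE9 appears to do for these two strings, not a
description of its actual algorithm. The names mark_breaks,
every_boundary and letters_only are made up for this example.

```python
def mark_breaks(text, can_break_between):
    """Return text with '·' at every break opportunity.

    A space is always a break opportunity and is shown as '·'.
    Between two adjacent non-space characters, the decision is
    delegated to can_break_between(previous, current).
    """
    out = []
    for i, ch in enumerate(text):
        if ch == " ":
            out.append("·")
            continue
        if i > 0 and text[i - 1] != " " and can_break_between(text[i - 1], ch):
            out.append("·")
        out.append(ch)
    return "".join(out)


def every_boundary(prev, cur):
    # Assumption: one code point == one grapheme cluster (true for this ASCII input).
    return True


def letters_only(prev, cur):
    # Assumption: only letter-letter boundaries gain a break opportunity,
    # so punctuation stays attached to its neighbours.
    return prev.isalpha() and cur.isalpha()


for sample in ("Yeah, I", "I didn't say"):
    print(mark_breaks(sample, every_boundary))
    print(mark_breaks(sample, letters_only))
```

With every_boundary the comma and the apostrophe become separable, while
letters_only keeps "h," and "n't" together.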

(Speaking of examples, the 'word-break: break-all;' part of Example 4:

  # 这·是·一·些·汉·字·...

lacks a comma:

  | 这·是·一·些·汉·字,·
)

As another test case, one that is pretty far from a normal CJK use case,
the Thai example in the spec renders no differently in IE9 when
'word-break: break-all' is on. This further shows that IE is pretty close
to the statement in Example 4, as a Thai character is class SA, not AL.

So the current definition seems to favor the WebKit and Gecko direction.
I don't necessarily disagree with that, but I hope we get more
convergence here...

(As a suggestion, we should have our examples reflect a normal use case
like my examples above. Otherwise, you soon get into an area where
there's not much interoperability...)

>> One other thing, it seems that in Chromium18 and Firefox15a1,
>> 'word-break: break-all' really breaks all grapheme cluster boundaries, which means that,
>> for example, it breaks before the '。' in
>>
>>   測試。
>>
>> , where normally such a break is prohibited. I would like to double check with the editors
>> and folks on this list that this is indeed not conforming according to the spec.
> 
> You're right that is an implementation bug as per the current spec. Sites using break-all renders broken line breaking rules on those browsers today (one example of such site is here[1].) I was hoping to file a bug but haven't done yet.
> 
> [1] http://agora-web.jp/

Thanks for a real-world example!
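
For the record, the 測試。 case discussed above can be sketched in Python
as follows. This is a toy illustration, not the spec's algorithm: the
NO_BREAK_BEFORE set is a hypothetical, hand-picked subset of characters
before which a break is normally prohibited (a real implementation would
consult the full line-breaking data), and both function names are
invented for this example.

```python
# Hypothetical subset: closing punctuation that must not start a line.
NO_BREAK_BEFORE = {"。", "、"}


def break_every_boundary(text):
    # What Chromium 18 and Firefox 15a1 reportedly do: break between all characters.
    return "·".join(text)


def break_all_respecting_prohibitions(text):
    # Break between all characters except where a break is prohibited.
    out = []
    for i, ch in enumerate(text):
        if i > 0 and ch not in NO_BREAK_BEFORE:
            out.append("·")
        out.append(ch)
    return "".join(out)


print(break_every_boundary("測試。"))
print(break_all_respecting_prohibitions("測試。"))
```

The first function allows a line to start with '。'; the second keeps it
attached to the preceding character.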


Cheers,
Kenny

Received on Thursday, 10 May 2012 13:02:02 UTC