Re: [css-text-3] word-break: break-all

From: 馬場孝夫 <baba@bpsinc.jp>
Date: Wed, 21 Oct 2015 22:39:41 +0900
Message-ID: <CAAWjb-eb7U4x3muK=sxtFKg4+u9RQBLFnxtkM92WkCayC68K6w@mail.gmail.com>
To: Koji Ishii <kojiishi@gmail.com>
Cc: Florian Rivoal <florian@rivoal.net>, "www-style@w3.org" <www-style@w3.org>, CJK discussion <public-i18n-cjk@w3.org>
> A. Does the current spec[1] allows UA to break between e.g., "*" when UA does not in normal breaking?

My understanding is 'yes'.

As you wrote,
- 'word-break: break-all' adds soft wrap opportunities between two
typographic *letters*.
- A 'letter' is a 'character unit' whose general category [UAX44] is
'L:Letter' or 'N:Number'.
- U+002A ASTERISK belongs to 'P:Punctuation'.
- Therefore, 'word-break: break-all' doesn't add a soft wrap
opportunity between two asterisks.
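
As an illustration only, the category check in the steps above can be
sketched with Python's standard unicodedata module. The helper names
is_letter and break_all_adds_opportunity are hypothetical, and the
sketch treats one code point as one typographic character unit, which
is a simplification of what the spec describes.

```python
import unicodedata

def is_letter(ch: str) -> bool:
    """True if the general category is L* (Letter) or N* (Number)."""
    return unicodedata.category(ch)[0] in ("L", "N")

def break_all_adds_opportunity(left: str, right: str) -> bool:
    """Under the reading above, 'break-all' only adds a soft wrap
    opportunity when both neighbours are letters."""
    return is_letter(left) and is_letter(right)

print(unicodedata.category("*"))             # Po (a Punctuation category)
print(break_all_adds_opportunity("*", "*"))  # False
print(break_all_adds_opportunity("A", "B"))  # True
```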

However, regardless of the 'word-break' property, there are already
soft wrap opportunities around punctuation.

- In most scripts, such as Latin, punctuation marks are explicit word
  separators. Word boundaries are also soft wrap opportunities.
- In CJK, there are soft wrap opportunities between any two characters
(except some combinations).
- (I don't really understand the cases of Thai, Lao, and Khmer.)

For example, with 'word-break: normal', the original soft wrap
opportunities are at the positions marked with '-'.

    ABC,DE**F&F
         |
         V
    ABC-,-DE-*-*-F-&-F
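
The diagram above can be reproduced with a small Python sketch. It
encodes only the simplified model described in this message (an
opportunity at every boundary where at least one neighbour is not a
letter or number), not the UAX #14 line breaking algorithm, and the
function names are hypothetical.

```python
import unicodedata

def is_letter(ch: str) -> bool:
    """True if the general category is L* (Letter) or N* (Number)."""
    return unicodedata.category(ch)[0] in ("L", "N")

def mark_opportunities(text: str, mark: str = "-") -> str:
    """Insert `mark` at each boundary that touches a non-letter."""
    out = []
    for i, ch in enumerate(text):
        # A boundary between two letters gets no mark in this model.
        if i > 0 and not (is_letter(text[i - 1]) and is_letter(ch)):
            out.append(mark)
        out.append(ch)
    return "".join(out)

print(mark_opportunities("ABC,DE**F&F"))  # ABC-,-DE-*-*-F-&-F
```

Whether a UA actually breaks at each marked position is then decided
by the 'line-break' rules, which this sketch does not model.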

Of course, breaking between 'C' and ',' is prohibited in most cases,
but this is due to the 'line-break' property.
(Since 'line-break: anywhere' doesn't exist, most UAs prohibit a break
before ',' even if 'line-break' is 'loose'.)

So I think that a break between the two asterisks should be allowed
if 'line-break' is 'loose', and otherwise not. (This is UA dependent.)

---
By the way, this understanding does not match current browsers'
behavior. In addition, I don't think the behavior I described is very
useful for CJK.

> C. Loosen it, but call out a few obvious cases of what CJK authors would expect informally.
> D. Loosen it, with explicit cases where UA should not break.

So I think C or D, with some added notes(*) related to 'line-break',
would be better.

* For example, adding "UA can refer to the word-break property to
determine breaking rules" to section 5.3. I haven't considered this
enough yet.

----------------------------------------------------
ビヨンド・パースペクティブ・ソリューションズ株式会社
〒160-0023
東京都新宿区西新宿6-20-7 コンシェリア西新宿TOWER'S WEST 2F
Tel: 03-6279-4320 Fax: 03-6279-4450
http://www.bpsinc.jp
馬場 孝夫(Baba Takao)


On Mon, Oct 19, 2015 at 6:48 PM, Koji Ishii <kojiishi@gmail.com> wrote:
> So, to summary:
>
> A. Does the current spec[1] allows UA to break between e.g., "*" when UA
> does not in normal breaking?
>
> If yes, I'm solved.
>
> If not, can we loosen "letters" to "characters"?
>
> B. Just loosen it, and let UA devs to make proper judge for CJK use cases.
> C. Loosen it, but call out a few obvious cases of what CJK authors would
> expect informally.
> D. Loosen it, with explicit cases where UA should not break.
>
> My understanding is "no" to A, and I'm fine with B, C, D, or open to other
> proposals if any.
>
> [1] https://drafts.csswg.org/css-text-3/#valdef-word-break-break-all
>
> /koji
>
Received on Wednesday, 21 October 2015 13:40:42 UTC
