Re: [css-text] Shaping for break-all/word-break from Behdad Esfahbod on 2016-03-08 (www-style@w3.org from March 2016)

From: Behdad Esfahbod <behdad@behdad.org>
Date: Tue, 8 Mar 2016 12:17:25 +0900
To: Koji Ishii <kojiishi@gmail.com>
Cc: "www-style@w3.org" <www-style@w3.org>
Message-ID: <CAF63+7U7LTqf65NFqNrDnf-E3OVaiB-wSR=b6GGXfG4CP3UVww@mail.gmail.com>

On Tue, Mar 8, 2016 at 11:19 AM, Koji Ishii <kojiishi@gmail.com> wrote:

> Two people read the text differently, so a clarification would be appreciated.
>
> Both word-wrap/overflow-wrap: break-word[1] and word-break: break-all[2]
> say:
>
> > Shaping characters are still shaped as if the word were not broken.
>
> and
>
> > When shaping scripts such as Arabic are allowed to break within words
> > due to break-all, the characters must still be shaped as if the word were
> > not broken.
>
> One reads this text as: after a word is broken, each broken part is
> reshaped as if it were a word.
>

Wrong.

> The other reads it as: after a word is broken, reshaping must not occur.
>

Also wrong.

The correct intended behavior is that text needs to be reshaped after
breaking lines, period.  However, during shaping, for Arabic-like scripts,
the Unicode Arabic Joining algorithm is run to decide which form of each
Arabic character to use.  It's *this* part of the shaping that should act
"as if the word were not broken".

Most shaping engines don't support this subtle distinction, but HarfBuzz,
for example, does.  When shaping a piece of text, you can also pass
HarfBuzz the surrounding text, and it will use that context to choose the
correct forms for Arabic Joining.
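
To make the distinction concrete, here is a rough Python sketch of the
joining-form decision only. It is not HarfBuzz code and not the full Unicode
algorithm: the JOINING_TYPE table is a tiny illustrative subset, transparent
characters (marks) are ignored, and joining_forms() with its start/end
parameters is a hypothetical helper. The idea it models is loosely the same
as handing the shaper the whole text plus an item offset and length: the
forms for a line fragment are picked by looking at neighbours outside the
fragment.

```python
# Simplified joining types for a few letters. This is an illustrative
# subset (an assumption of this sketch), not the Unicode ArabicShaping data.
DUAL, RIGHT = "D", "R"
JOINING_TYPE = {
    "\u0628": DUAL,   # ARABIC LETTER BEH
    "\u062A": DUAL,   # ARABIC LETTER TEH
    "\u064A": DUAL,   # ARABIC LETTER YEH
    "\u0643": DUAL,   # ARABIC LETTER KAF
    "\u0627": RIGHT,  # ARABIC LETTER ALEF
    "\u062F": RIGHT,  # ARABIC LETTER DAL
}


def joining_forms(text, start=0, end=None):
    """Return a joining form for each character of text[start:end].

    Neighbours are looked up in the whole of `text`, so characters outside
    the [start, end) fragment still influence the result. Hypothetical
    helper: transparent characters are not handled.
    """
    if end is None:
        end = len(text)
    forms = []
    for i in range(start, end):
        this_type = JOINING_TYPE.get(text[i])
        prev_type = JOINING_TYPE.get(text[i - 1]) if i > 0 else None
        next_type = JOINING_TYPE.get(text[i + 1]) if i + 1 < len(text) else None

        # A letter connects to the preceding letter (logical order) if it
        # can join on that side and the preceding letter is dual-joining.
        joins_prev = this_type in (DUAL, RIGHT) and prev_type == DUAL
        # Only dual-joining letters can connect to the following letter.
        joins_next = this_type == DUAL and next_type in (DUAL, RIGHT)

        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms


word = "\u0628\u064A\u062A"  # BEH YEH TEH

# Break after the second letter, forms chosen with the full word as context
# ("as if the word were not broken"):
with_context = joining_forms(word, 0, 2) + joining_forms(word, 2, 3)

# Same break, but each fragment treated as if it were a word of its own:
without_context = joining_forms(word[:2]) + joining_forms(word[2:])
```

With context, the two fragments get the same forms as the unbroken word
(init, medi, fina).  Without context, the second letter becomes a final form
and the third an isolated form, which is the reading called wrong above.
Everything else in shaping (glyph selection, positioning) would still be
redone per line; only this form decision looks across the break.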

behdad

> Which is correct?
>
> [1] https://drafts.csswg.org/css-text-3/#valdef-overflow-wrap-break-word
> [2] https://drafts.csswg.org/css-text-3/#valdef-word-break-break-all
>
> /koji
>

-- 
behdad
http://behdad.org/

Received on Tuesday, 8 March 2016 03:17:55 UTC