Re: [css-text] Shaping for break-all/word-break

From: Behdad Esfahbod <behdad@behdad.org>
Date: Tue, 8 Mar 2016 18:31:18 -0800
Message-ID: <CAF63+7UT2NarXXt+SyTUzwiiQLW68DNVqwvh83fQNSkatyygsA@mail.gmail.com>
To: Koji Ishii <kojiishi@gmail.com>
Cc: "www-style@w3.org" <www-style@w3.org>

On Tue, Mar 8, 2016 at 6:11 PM, Koji Ishii <kojiishi@gmail.com> wrote:

> Thank you Behdad for the clarification, that helps a lot.
> Can I ask one question? Let's say we have a word of 10 chars and want to
> break-all at 3.
> 1. Reshape the first 3 chars using the remaining 7 chars as text-after
> 2. Use character-to-glyph mapping of the shape result of the 10 chars and
> use glyphs for the first 3 chars.
> Do they produce different results?

Not necessarily.  And, that is even if you *can* map.  For example, if you
have a 'ffi' ligature, there's no way to break in between it without
reshaping.
But even if it was possible, the results are not necessarily the same.
That holds true for all scripts and languages, not just Arabic.  It's a
property of how OpenType works.  Fonts have rules that match arbitrary
sequences.  For example, a font can have a rule such that if there are five
"x" glyphs after each other, then it will replace the middle one with an
alternate form.  This might not be a real-world example, but that's what
fonts can do, and there definitely are fonts that do similar things, in
their 'calt', Contextual Alternates, feature.  When you get to script
styles like Nastaliq, it happens ALL the time.  But then again, break-all
and calligraphy is a combination we don't have to fully support.  However, I
think pretty much any script-style Latin font will also be broken.
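
To make the five-"x" case concrete, here is a minimal sketch in Python.  It
is a toy model, not HarfBuzz or any real font: the shape() function, the
"x.alt" glyph name, and the sample word are all made up for illustration,
and it assumes the contextual rule only sees the glyphs in the run being
shaped.

```python
def shape(text):
    """Toy shaper: return one glyph name per character of `text`.

    Hypothetical contextual rule, modeled on the example above: an "x"
    with two "x" before it and two "x" after it becomes "x.alt".
    """
    glyphs = []
    for i, ch in enumerate(text):
        in_middle = i >= 2 and text[i - 2:i + 3] == "xxxxx"
        glyphs.append("x.alt" if in_middle else ch)
    return glyphs


word = "xxxxxabcde"            # 10 characters, break after the first 3

sliced = shape(word)[:3]       # option 2: shape all 10, keep the first 3 glyphs
reshaped = shape(word[:3])     # reshape only the first 3 characters

print(sliced)    # ['x', 'x', 'x.alt']
print(reshaped)  # ['x', 'x', 'x']
```

The two lists differ in the third glyph: slicing keeps an alternate form
that was chosen because of text that now sits on the next line.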


> 2 is what I meant by not reshaping.
> If they are the same, I think we understood the same way and the spec
> looks fine.
> If they are different, different wording in the spec would help me to
> understand better.
> /koji
> On Tue, Mar 8, 2016 at 12:26 PM, Behdad Esfahbod <behdad@behdad.org>
> wrote:
>> On Tue, Mar 8, 2016 at 12:17 PM, Behdad Esfahbod <behdad@behdad.org>
>> wrote:
>>> Most shaping engines don't support this subtle distinction, but for
>>> example, HarfBuzz does.  When shaping a piece of text, you can pass to
>>> HarfBuzz the surrounding text as well and it will do the right thing
>>> regarding choosing the right forms for Arabic Joining.
>> Here is an example:
>> $ hb-shape NotoNaskhArabic-Regular.ttf --text=ب
>> [uni0628=0+1581]
>> $ hb-shape NotoNaskhArabic-Regular.ttf --text=ب --text-before=ب
>> [uni0628.fina=0+1673]
>> $ hb-shape NotoNaskhArabic-Regular.ttf --text=ب --text-after=ب
>> [uni0628.init=0+564]
>> $ hb-shape NotoNaskhArabic-Regular.ttf --text=ب --text-before=ب --text-after=ب
>> [uni0628.medi=0+599]
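
The four outputs above follow the usual Arabic joining pattern for a
dual-joining letter.  As a rough sketch of that selection logic in Python
(a toy model only, not how HarfBuzz is implemented; joining_form() and
DUAL_JOINING are hypothetical names, and the table covers only BEH):

```python
# Toy model: the only letter we know about is BEH (U+0628), which joins
# on both sides.  Real joining data comes from Unicode's ArabicShaping.txt.
DUAL_JOINING = {"\u0628"}

FORMS = {
    (False, False): "isol",  # no joining neighbour on either side
    (True, False): "fina",   # joins to the preceding letter only
    (False, True): "init",   # joins to the following letter only
    (True, True): "medi",    # joins on both sides
}


def joining_form(text_before="", text_after=""):
    """Pick the form of a dual-joining letter from its surrounding text."""
    joins_prev = bool(text_before) and text_before[-1] in DUAL_JOINING
    joins_next = bool(text_after) and text_after[0] in DUAL_JOINING
    return FORMS[(joins_prev, joins_next)]


beh = "\u0628"
print(joining_form())                                  # isol
print(joining_form(text_before=beh))                   # fina
print(joining_form(text_after=beh))                    # init
print(joining_form(text_before=beh, text_after=beh))   # medi
```

The point is that the form depends on the neighbours, so the same single
character shapes differently depending on the context passed in.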

Received on Wednesday, 9 March 2016 02:31:48 UTC
