W3C home > Mailing lists > Public > www-style@w3.org > February 2009

Re: [CSS Text/G.C. for P.M.] Hyphenation & ligatures

From: thomas <thomas.bsd@gmail.com>
Date: Tue, 24 Feb 2009 09:11:18 +0100
Message-ID: <2753bafa0902240011s2f765590hd3389438e4ed8ddd@mail.gmail.com>
To: Jonathan Kew <jonathan@jfkew.plus.com>
2009/2/23 Jonathan Kew <jonathan@jfkew.plus.com>:
> On 23 Feb 2009, at 19:37, thomas wrote:
>
>> 2009/2/23 Jonathan Kew <jonathan@jfkew.plus.com>:
>>>
>>> On 23 Feb 2009, at 15:35, thomas wrote:
>>>>
>>>> OK, but the font rendering mechanism needs hints. How could it know if
>>>> I want to use fake or real small caps,
>>>
>>> IMO, "font-variant: small-caps" should always use "real" small caps if
>>> they
>>> exist in the font, and only create fake ones as a fallback when using
>>> fonts
>>> that don't have real ones.
>>
>> In most cases, yes.  But suppose the font contains only ASCII real
>> small caps, and that you also need accented small caps.  You may
>> prefer not to use real small caps at all, to avoid mixing them with
>> the fake ones.
>
> This is the kind of scenario where you're formatting very specifically for
> one particular font and document, and could equally well use "font-size:
> smaller; text-transform: uppercase;" or something like that to get the
> effect you want. I don't think we should be complicating CSS's small-caps
> property, for example (what, "font-variant: small-caps-real" vs
> "font-variant: small-caps-fake"?), for the sake of something like this.

Right.  CSS should not become bloated and overly complex.

>> OK.  Though, I doubt that hyphenating and advanced font features are
>> two completely separate questions.  Let me take a tricky example.
>> Imagine there is a property to switch ligatures on and off.  According
>> to the rules of German typography, you want no ff-ligature in
>> "Kauffahrt" (because it is composed of "Kauf" and "Fahrt"), but you
>> want an ff-ligature in "Schiff".  How do you handle the word
>> "Kauffahrteischiff"?  You could use a span inside the word to disable
>> the ligature at the first position.  But then the hyphenator would
>> consider two words, breaking the parameters "hyphenate-before" and
>> "hyphenate-after".  So you put a zero-width word-joiner between 'Kauf'
>> and 'fahrt' to prevent the ligature.  The problem is, there should be
>> a break point between 'Kauf' and 'fahrt', and the word joiner prevents
>> this.  So you also add a soft hyphen, writing
>> 'Kauf&nobreak;&shy;fahrteischiff'.  But then, what happens?  Since the
>> manual hyphenation overrides the auto-hyphenation, the word will never
>> break between 'Kauffahrtei' and 'schiff' -- and that should be possible.
>
> The appropriate way to encode this would be Kauf<zwnj>fahrteischiff, and the
> presence of <zwnj> should not affect hyphenation (or perhaps should be
> explicitly included as part of the hyphenation rules).

Yes, this could be explicitly mentioned.
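
To make the idea concrete, here is a minimal Python sketch of a
hyphenator that ignores ZWNJ: it computes break points on the text with
ZWNJ stripped, then maps them back to the original string.  The function
name and the hand-picked break offsets are hypothetical, for illustration
only; no real hyphenation library is involved.

```python
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER


def map_breaks(text, breaks_in_stripped):
    """Map break offsets found on ZWNJ-free text back to offsets in `text`.

    An offset n means "a break is allowed before the n-th character".
    """
    wanted = set(breaks_in_stripped)
    mapped = []
    stripped_pos = 0  # position in the text with ZWNJ removed
    for i, ch in enumerate(text):
        if ch == ZWNJ:
            continue  # invisible to the hyphenator
        if stripped_pos in wanted:
            mapped.append(i)
        stripped_pos += 1
    return mapped


word = "Kauf" + ZWNJ + "fahrteischiff"
stripped = word.replace(ZWNJ, "")

# Assumed break points in the stripped word: Kauf|fahrtei|schiff
assumed_breaks = [4, 11]
breaks = map_breaks(word, assumed_breaks)
pieces = [word[a:b] for a, b in zip([0] + breaks, breaks + [len(word)])]
print(pieces)
```

With this approach the ZWNJ stays attached to "Kauf", so the ligature
hint survives while both break points remain available.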

>> By the way, note that in the absence of CSS markup to handle
>> ligatures, Firefox uses a standard ligature (ff, fi, etc.) whenever
>> possible, thus wrongly displaying 'Kauffahrt' or 'Kaufinteresse'.  It
>> is then necessary to manually add the required word joiner, and this
>> is painful.
>
> Again, U+200C ZERO WIDTH NON-JOINER is the proper code to use here, not
> U+2060 WORD JOINER. According to the Unicode standard, ZWNJ should have no
> effect on other algorithms such as line-breaking, but it provides the hint
> that a potential ligature should not be formed at this point. (See The
> Unicode Standard 5.0, p537.)

Thanks for this explanation.  I missed this point.
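
For reference, a small Python sketch that distinguishes the two code
points.  It uses only the standard unicodedata module; the variable
names are mine and purely illustrative.

```python
import unicodedata

ZWNJ = "\u200c"         # what should be used to suppress a ligature
WORD_JOINER = "\u2060"  # what I had wrongly used; it forbids a line break

# Print code point, official Unicode name and general category for each.
for ch in (ZWNJ, WORD_JOINER):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# Encoding the word with the ligature hint between 'Kauf' and 'fahrt'.
word = "Kauf" + ZWNJ + "fahrteischiff"
print(len(word), word.encode("unicode_escape").decode("ascii"))
```

Both are invisible format characters, so they are easy to confuse in
source text; printing the escaped form makes it clear which one is used.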

> I'm not saying that today's software and fonts will necessarily handle this
> correctly, but that's how the standard says it should work, so that's what
> we should be working towards.
>
> (BTW, don't you have the same problem with any other software that
> implements automatic ligatures? How do you handle this in InDesign, TeX,
> etc.?)

You make a good point.  High-quality typography requires some manual
fiddling anyway.  I do not think that an automated system can ever
completely replace the typesetter.

In fact, I was concerned about hyphenation and ligatures because CSS3
offers such fine control over typesetting that it could be used to
typeset a document.  (Of course it won't replace TeX or InDesign, but
for medium-quality document typesetting, it could be OK.)

@Rob: Thanks for enlightening me.  (I lack knowledge of the CSS and
Unicode specifications.)

++
Thomas
Received on Tuesday, 24 February 2009 08:12:03 GMT
