
Re: [CSS Text/G.C. for P.M.] Hyphenation & ligatures

From: Jonathan Kew <jonathan@jfkew.plus.com>
Date: Mon, 23 Feb 2009 21:19:04 +0000
Cc: WWW Style <www-style@w3.org>
Message-Id: <D813330B-2022-475F-96A7-84006F2355CC@jfkew.plus.com>
To: thomas <thomas.bsd@gmail.com>
On 23 Feb 2009, at 19:37, thomas wrote:

> 2009/2/23 Jonathan Kew <jonathan@jfkew.plus.com>:
>> On 23 Feb 2009, at 15:35, thomas wrote:
>>> OK, but the font rendering mechanism needs hints. How could it know if
>>> I want to use fake or real small caps,
>> IMO, "font-variant: small-caps" should always use "real" small caps if they
>> exist in the font, and only create fake ones as a fallback when using fonts
>> that don't have real ones.
> In most cases, yes.  But suppose the font contains real small caps
> only for ASCII, and that you also need accented small caps.  You may
> prefer not to use the real small caps at all, to avoid mixing them
> with the fake ones.

This is the kind of scenario where you're formatting very specifically
for one particular font and document, and could equally well use
"font-size: smaller; text-transform: uppercase;" or something like that
to get the effect you want. I don't think we should be complicating CSS's
small-caps property (with what, "font-variant: small-caps-real" vs
"font-variant: small-caps-fake"?) for the sake of something like this.

> OK.  Though, I doubt that hyphenation and advanced font features are
> two completely separate questions.  Let me take a tricky example.
> Imagine there is a property to switch ligatures on and off.  According
> to the rules of German typography, you want no ff-ligature in
> "Kauffahrt" (because it is composed of "Kauf" and "Fahrt"), but you
> want an ff-ligature in "Schiff".  How do you handle the word
> "Kauffahrteischiff"?  You could use a span inside the word to disable
> the ligature at the first position.  But then the hyphenator would
> see two words, breaking the "hyphenate-before" and "hyphenate-after"
> parameters.  So you put a zero-width word joiner between 'Kauf'
> and 'fahrt' to prevent the ligature.  The problem is, there should be
> a break point between 'Kauf' and 'fahrt', and the word joiner prevents
> this.  So you also add a soft hyphen, writing
> 'Kauf&nobreak;&shy;fahrteischiff'.  But then, what happens? Since the
> manual hyphenation overrides the auto-hyphenation, the word will never
> break between 'Kauffahrtei' and 'schiff' -- and that should be possible.

The appropriate way to encode this would be Kauf<zwnj>fahrteischiff,  
and the presence of <zwnj> should not affect hyphenation (or perhaps  
should be explicitly included as part of the hyphenation rules).
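
As a rough illustration of that encoding (a sketch only: the helper names
suppress_ligature and strip_zwnj are hypothetical, and stripping ZWNJ before
hyphenation is just one assumed workaround for hyphenation code that does
not ignore it by itself), the character can be inserted at the morpheme
boundary and removed again before a dictionary lookup:

```python
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER


def suppress_ligature(word: str, boundary: int) -> str:
    """Insert ZWNJ at index `boundary` as a hint that no ligature should form there."""
    return word[:boundary] + ZWNJ + word[boundary:]


def strip_zwnj(text: str) -> str:
    """Remove ZWNJ so that ZWNJ-unaware code (e.g. a hyphenation lookup) sees the plain word."""
    return text.replace(ZWNJ, "")


# Mark the boundary between "Kauf" and "fahrteischiff".
encoded = suppress_ligature("Kauffahrteischiff", len("Kauf"))
plain = strip_zwnj(encoded)
```

Here `encoded` carries the ligature hint for the renderer, while `plain` is
the unmodified word that a hyphenator could still break at any of its usual
points, including between 'Kauffahrtei' and 'schiff'.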

Even English might sometimes use this; it has been suggested that a  
word like "shelfful" is better rendered without the "ff" ligature.

> Although such complications do not come up often, I do think that
> hyphenation and ligatures should not be considered completely
> independent questions.
> By the way, note that in the absence of CSS markup to handle
> ligatures, Firefox uses a standard ligature (ff, fi, etc.) whenever
> possible, thus wrongly displaying 'Kauffahrt' or 'Kaufinteresse'.  It
> is then necessary to add the required word joiner manually, and this
> is painful.

Again, U+200C ZERO WIDTH NON-JOINER is the proper code to use here,  
not U+2060 WORD JOINER. According to the Unicode standard, ZWNJ should  
have no effect on other algorithms such as line-breaking, but it  
provides the hint that a potential ligature should not be formed at  
this point. (See The Unicode Standard 5.0, p. 537.)
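
To make it easy to check which of the two characters actually ended up in a
document, here is a small sketch using Python's standard unicodedata module
(the describe helper is a hypothetical name, not part of any library). Note
that both characters share the general category Cf (format), so the category
alone does not tell them apart; their different line-breaking behaviour is
defined elsewhere in the Unicode standard and is not exposed by this module.

```python
import unicodedata

ZWNJ = "\u200c"  # the character recommended above for blocking a ligature
WJ = "\u2060"    # WORD JOINER, which also prohibits a line break


def describe(ch: str) -> str:
    """Return the code point, Unicode name and general category of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"


def find_joiners(text: str) -> list[tuple[int, str]]:
    """List (index, description) for every ZWNJ or WJ found in `text`."""
    return [(i, describe(ch)) for i, ch in enumerate(text) if ch in (ZWNJ, WJ)]


report = find_joiners("Kauf\u2060fahrteischiff")
```

Running find_joiners over existing text is one way to spot places where
WORD JOINER was used as a ligature blocker and could be replaced by ZWNJ.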

I'm not saying that today's software and fonts will necessarily handle  
this correctly, but that's how the standard says it should work, so  
that's what we should be working towards.

(BTW, don't you have the same problem with any other software that  
implements automatic ligatures? How do you handle this in InDesign,  
TeX, etc.?)

Received on Monday, 23 February 2009 21:19:48 UTC
