
Re: [CSS Text/G.C. for P.M.] Hyphenation & ligatures

From: Mikko Rantalainen <mikko.rantalainen@peda.net>
Date: Tue, 24 Feb 2009 13:17:53 +0200
Message-ID: <49A3D761.5030004@peda.net>
To: www-style@w3.org
thomas wrote:
> 2009/2/23 Jonathan Kew <jonathan@jfkew.plus.com>:
>> On 23 Feb 2009, at 19:37, thomas wrote:
>>
>>> 2009/2/23 Jonathan Kew <jonathan@jfkew.plus.com>:
>>>> On 23 Feb 2009, at 15:35, thomas wrote:
>>> In most case, yes.  But suppose the font contains only ASCII real
>>> small caps, and that you need also accented small caps.  You may want
>>> not to use real small caps at all to avoid mixing them with the fake
>>> ones.
>> This is the kind of scenario where you're formatting very specifically for
>> one particular font and document, and could equally well use "font-size:
>> smaller; text-transform: uppercase;" or something like that to get the
>> effect you want. I don't think we should be complicating CSS's small-caps
>> property, for example (what, "font-variant: small-caps-real" vs
>> "font-variant: small-caps-fake"?), for the sake of something like this.
> 
> Right.  CSS should not become over-bloated and complex.

I agree. Changing CSS is *really* slow: it takes years and years for a
specification change to be implemented in the UAs. Besides, the page
author cannot even know whether the font the user has is missing the
ASCII letter "A", too.

The author cannot work around missing features in the UA because the UA
is not known until the reader comes by.

If the rendering of words in a given font is important, the best the
author can do (today and in the foreseeable future) is to provide a
downloadable font via CSS. If the UA cannot download and apply the font
correctly, there's very little hope that any other trick will result in
the correct rendering either.

>>> [...]  According
>>> to the rules of German typography, you want no ff-ligature in
>>> "Kauffahrt" (because it is composed of "Kauf" and "Fahrt"), but you
>>> want a ff-ligature in Schiff.
>>> [...]
>> Again, U+200C ZERO WIDTH NON-JOINER is the proper code to use here, not
>> U+2060 WORD JOINER. According to the Unicode standard, ZWNJ should have no
>> effect on other algorithms such as line-breaking, but it provides the hint
>> that a potential ligature should not be formed at this point. (See The
>> Unicode Standard 5.0, p537.)
> 
> Thanks for this explanation.  I missed this point.
> 
>> I'm not saying that today's software and fonts will necessarily handle this
>> correctly, but that's how the standard says it should work, so that's what
>> we should be working towards.
>>
>> (BTW, don't you have the same problem with any other software that
>> implements automatic ligatures? How do you handle this in InDesign, TeX,
>> etc.?)
> 
> You make a good point.  High-quality typography requires anyway some
> manual fiddling.  I do not think that an automated system may ever
> replace completely the typesetter.

It is possible to build an automated software system that does
high-quality typography. However, if the rules it needs to follow are
not universal, then every local typography "standard" requires its own
automatic typography system.

For example, I believe that outside German typography (everywhere
else?) the ff/fi/ffi ligatures are always used regardless of word
origin (compound word or not). As such, a "high quality typography
engine" would render the text differently depending on whether the
output is going to a German user or to some other user. (Does the
ligature ban apply to German text only, or does it also apply to
foreign-language words typeset in a German context? Perhaps the output
should change depending on the input language only, which makes a bit
more sense.)
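
To make the language dependence concrete, here is a rough Python
sketch of what such an engine might do. Everything in it is my own
assumption for illustration: the function names, the short list of
ligature candidates, the idea that the compound parts are already
known, and the rule that only input tagged as German gets a ligature
break at a compound boundary.

```python
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER

# Assumed list of letter sequences that fonts often ligate.
LIGATURE_CANDIDATES = ("ffi", "ffl", "ff", "fi", "fl")


def spans_boundary(left, right):
    """Return True if a ligature candidate would straddle the point
    where `left` and `right` are joined."""
    joined = left + right
    cut = len(left)
    for lig in LIGATURE_CANDIDATES:
        # Try every start position that puts `lig` across the cut.
        for start in range(max(0, cut - len(lig) + 1), cut):
            if joined.startswith(lig, start):
                return True
    return False


def join_compound(parts, lang):
    """Join compound parts; for German input only (an assumed policy),
    insert ZWNJ where a ligature would cross a part boundary."""
    if not parts:
        return ""
    is_german = lang.lower().split("-")[0] == "de"
    text = parts[0]
    for part in parts[1:]:
        if is_german and spans_boundary(text, part):
            text += ZWNJ
        text += part
    return text
```

With this, join_compound(["Kauf", "fahrt"], "de") puts U+200C between
the two f's, while the same call with "en" leaves the word untouched,
and a non-compound such as ["Schiff"] is never changed.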

The history of printed Latin scripts seems to have been greatly
influenced by the available technology. Compare the rendering of
letters in high-quality handwritten text and in a typical textbook:
the letters in the book are not joined in the same way as they are in
the handwritten text. It therefore seems likely that in the future the
"high-quality typography" rules will converge towards universal rules
that can be typeset correctly by software (and in the process some
will regard such universal rules as incorrect).

My mother tongue is Finnish, and I feel that the hyphenation rules in
e.g. Swedish are not sensible (e.g. a double consonant is tripled when
hyphenated: "bb" is hyphenated as "bb-b" and not "bb-" or "b-b"). In the
same way, I feel that whether ligatures are used should not be affected
by the meaning of the word. Or, if the use of a ligature does affect the
meaning of the word, that word MUST NOT be encoded with the same Unicode
sequence as the word that is not affected by ligatures. As such, I
believe that Germans should use U+200C wherever they believe that a
ligature MUST NOT be used.
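
As a small illustration of what "not the same Unicode sequence" means
in practice, the following Python sketch compares the plain spelling
with one that carries U+200C at the compound boundary. The helper
names are mine and only for illustration; the point is just that the
two strings are different code point sequences, and that removing the
ZWNJ gives back the plain spelling.

```python
import unicodedata

ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER

plain = "Kauffahrt"                 # a ligature is allowed to form
marked = "Kauf" + ZWNJ + "fahrt"    # ligature suppressed at the boundary


def code_points(text):
    """List the code points of `text` in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def strip_zwnj(text):
    """Remove every ZWNJ, e.g. before a naive string comparison."""
    return text.replace(ZWNJ, "")


print(unicodedata.name(ZWNJ))
print(code_points(marked))
print(plain == marked, strip_zwnj(marked) == plain)
```

One consequence is that software doing plain string comparison (search,
for instance) sees two different words unless it ignores the ZWNJ.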

However, I understand that my opinion is biased by the culture I live in
and I do understand that Swedish or German people will feel otherwise.

If we have rules that apply universally, then we can have an automated
system that produces high-quality typography. On the other hand, if we
don't have rules that we can agree on, then any output that follows any
set of rules should be considered "high-quality".

-- 
Mikko


Received on Tuesday, 24 February 2009 11:18:38 GMT
