Re: "fi" ligatures etc.

Philippe-Andre Prindeville (philipp@res.enst.fr)
Fri, 29 Dec 95 05:11:28 +0100


Date: Fri, 29 Dec 95 05:11:28 +0100
From: Philippe-Andre Prindeville <philipp@res.enst.fr>
Message-Id: <9512290511.ZM6697@jones.res.enst.fr>
In-Reply-To: Stavros Macrakis <macrakis@osf.org>
To: Stavros Macrakis <macrakis@osf.org>, www-html@w3.org
Subject: Re: "fi" ligatures etc.
Cc: BearHeart@bearnet.com

On Dec 28, 12:01, Stavros Macrakis wrote:
> [ ... ]
> This is different from the case with "ae", "oe", etc., where the
> ligature is a matter of spelling (based on etymology).  The ligature
> is correct in "encyclopaedia", "aesthetic", "amoeba", "Oedipus", and
> "phoenix", but not in "aerosol", "aloe", "shoe", "poet", "Kafkaesque",
> "maestro", "whatsoever", or "Michael".  And in German words like
> "Schroedinger" and "Jaeger", the digraph represents a character with
> umlaut (so in fact the two-glyph sequence is actually a presentation
> form of a single character).

Errr.... Not quite.

In Dutch, I believe, OE is not a ligature.  If I remember correctly,
it reflects a time (before the spelling was reformed) when things
were written differently.  In Danish, certainly, AE is a different
character than A - E.  In French, you can write OEil, but not
mOEt, for instance.  There is no hard and fast rule for knowing
when you can use a digram and when you can't, either.  It is rote
memorization.

So once the information is gone, there's no getting it back.

Computers have done much to damage languages, too.  In French,
even high quality (and expensive, i.e. $1400+) software doesn't
always portray the OE or AE ligatures correctly.  What do the
publishers say in their own defence?  "PC and Mac character
sets don't represent all of these characters".

Bollocks.

> Conclusion: the ligatures ae and oe and the characters a-umlaut and
> o-umlaut are characters, not glyphs, and belong in the character set.
> The ligatures fi etc. are glyphs (presentation forms), and do not
> belong in the character set.

I'll go along with that.  What about IJ in Dutch?
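
As a rough illustration of the character/glyph split, here is a minimal
sketch that assumes present-day Unicode data as exposed by Python's
standard unicodedata module (nothing in this thread presupposes it).
The helper name is_presentation_form is hypothetical, and treating "has
a compatibility decomposition" as "is a presentation form" is only a
heuristic.  By that test, fi and Dutch IJ come out as presentation
forms, while ae, oe and a-umlaut come out as characters in their own
right.

```python
import unicodedata

def is_presentation_form(ch: str) -> bool:
    """Heuristic (an assumption, not a standard rule): a code point with a
    compatibility decomposition such as '<compat> 0066 0069' is treated as a
    presentation form of other characters."""
    return unicodedata.decomposition(ch).startswith("<")

samples = {
    "fi ligature": "\ufb01",
    "ae": "\u00e6",
    "oe": "\u0153",
    "a-umlaut": "\u00e4",
    "Dutch IJ": "\u0132",
}

for name, ch in samples.items():
    # NFKC folds compatibility forms back into their plain letters.
    folded = unicodedata.normalize("NFKC", ch)
    print(f"{name:12} presentation form: {is_presentation_form(ch)!s:5} "
          f"NFKC length: {len(folded)}")
```

Note that a-umlaut does have a decomposition (a plus a combining
diaeresis), but it is a canonical one, so the heuristic still counts it
as a character.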

Here is another issue:  I was discussing IPA (Phonetics) with
some linguists, and they said that if ever I should get into the
rat's nest of integrating phonetics into HTML, to be sure and include
"composed" or over-struck characters (like O'~ in Vietnamese).  Why?
Because a modern linguist often has to "invent" new notations in
the field to catalog a language before new sounds can be properly
dissected, annotated, etc.  Overstriking is the only way to
reflect a sound that is the combination or elision of two or more
sounds.
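
One way overstriking can be modelled is with combining marks.  The
sketch below again assumes Unicode and Python's unicodedata module; the
helper overstrike and the "invented" symbol are hypothetical examples,
not any linguist's actual notation.  The Vietnamese letter happens to
have a precomposed form, so normalization collapses it to one code
point.  The invented field notation has none, so it survives only as a
base letter plus its marks.

```python
import unicodedata

# Combining marks (they attach to the preceding base letter).
HORN = "\u031b"
TILDE = "\u0303"
RING_BELOW = "\u0325"

def overstrike(base: str, *marks: str) -> str:
    """Hypothetical helper: a base letter followed by combining marks,
    normalized to NFC so precomposed forms are used where they exist."""
    return unicodedata.normalize("NFC", base + "".join(marks))

# O with horn and tilde, as in Vietnamese: one precomposed code point.
vietnamese = overstrike("O", HORN, TILDE)

# An ad hoc field notation: q with tilde and ring below has no
# precomposed form, so it remains a three-code-point sequence.
invented = overstrike("q", TILDE, RING_BELOW)

print(len(vietnamese), unicodedata.name(vietnamese))
print(len(invented), [unicodedata.name(c) for c in invented])
```

The point of the second case is that a system limited to precomposed
characters could not represent it at all.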

But don't take my word for it.  Ask a linguist.  ;-)

-Philip