- From: Philippe-Andre Prindeville <philipp@res.enst.fr>
- Date: Fri, 29 Dec 95 05:11:28 +0100
- To: Stavros Macrakis <macrakis@osf.org>, www-html@w3.org
- Cc: BearHeart@bearnet.com
On Dec 28, 12:01, Stavros Macrakis wrote:
> [ ... ]
> This is different from the case with "ae", "oe", etc., where the
> ligature is a matter of spelling (based on etymology). The ligature
> is correct in "encyclopaedia", "aesthetic", "amoeba", "Oedipus", and
> "phoenix", but not in "aerosol", "aloe", "shoe", "poet", "Kafkaesque",
> "maestro", "whatsoever", or "Michael". And in German words like
> "Schroedinger" and "Jaeger", the digraph represents a character with
> umlaut (so in fact the two-glyph sequence is actually a presentation
> form of a single character).

Errr.... Not quite. In Dutch, I believe, OE is not a ligature. If I
remember correctly, it reflects a time (before the spelling was
reformed) when things were written differently. In Danish, certainly,
AE is a different character from A followed by E. In French, you can
write OEil, but not mOEt, for instance.

There is no hard and fast rule for knowing when you can use a digram
and when you can't, either. It is rote memorization. So once the
information is gone, there's no getting it back.

Computers have done much to damage languages, too. In French, even
high-quality (and expensive, i.e. $1400+) software doesn't always
render the OE or AE ligatures correctly. What do the publishers say
in their own defence? "PC and Mac character sets don't represent all
of these characters". Bollocks.

> Conclusion: the ligatures ae and oe and the characters a-umlaut and
> o-umlaut are characters, not glyphs, and belong in the character set.
> The ligatures fi etc. are glyphs (presentation forms), and do not
> belong in the character set.

I'll go along with that. What about IJ in Dutch?

Here is another issue: I was discussing IPA (Phonetics) with some
linguists, and they said that if I should ever get into the rat's
nest of integrating phonetics into HTML, I should be sure to include
"composed" or over-struck characters (like O'~ in Vietnamese).

Why? Because a modern linguist often has to "invent" new notations in
the field to catalog a language before new sounds can be properly
dissected, annotated, etc. Overstriking is the only way to reflect a
sound that is the combination or elision of two or more sounds.

But don't take my word for it. Ask a linguist. ;-)

-Philip
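
For readers who want to poke at the character-versus-glyph distinction discussed above, here is a minimal sketch. It assumes present-day Unicode and Python's standard `unicodedata` module, neither of which this thread refers to; the helper names `is_presentation_form` and `compose` are made up for illustration. It treats "NFKC normalization changes the character" as a rough stand-in for "presentation form", and shows an over-struck O built from combining marks.

```python
import unicodedata


def is_presentation_form(ch: str) -> bool:
    """Hypothetical helper: True if NFKC normalization rewrites `ch`.

    Assumes `ch` is a single precomposed code point. A True result means
    Unicode treats it as a compatibility variant of other characters.
    """
    return unicodedata.normalize("NFKC", ch) != ch


def compose(sequence: str) -> str:
    """Hypothetical helper: fold base letter + combining marks via NFC."""
    return unicodedata.normalize("NFC", sequence)


# fi ligature, ae, oe, a-umlaut: print code point, name, and the check result.
for ch in ("\ufb01", "\u00e6", "\u0153", "\u00e4"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), is_presentation_form(ch))

# "Over-struck" O: base letter + combining horn (U+031B) + combining tilde (U+0303).
overstruck = "O\u031b\u0303"
print(len(overstruck), len(compose(overstruck)))
```

Under this check the fi ligature counts as a presentation form while ae, oe, and a-umlaut do not, which matches the conclusion quoted above. The Dutch IJ (U+0132) also folds to the two letters "IJ" under NFKC, so by this yardstick it lands on the glyph side. Combining marks that have no precomposed form simply stay as a sequence, which is the open-ended overstriking the linguists were asking for.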
Received on Thursday, 28 December 1995 23:12:04 UTC