
[whatwg] Entity parsing [trema/diæresis vs umlaut]

From: Øistein E. Andersen <html5@xn--istein-9xa.com>
Date: Sat, 23 Jun 2007 23:27:44 +0200
Message-ID: <E1I2D8y-000AII-C4@node1-3.ouvaton.local>
Sander wrote:

> Are there any char-sets that have both umlaut and trema variations of characters?

Unicode does not make the distinction, so this is somewhat unlikely.
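
(A quick way to check this, as a minimal sketch using Python's standard
unicodedata module: the precomposed letter used for both the German umlaut
and the Dutch/French trema is a single code point, and it decomposes to the
base letter plus one combining mark. The variable names are only
illustrative.)

```python
import unicodedata

# One code point serves as both German u-umlaut and Dutch/French u-with-trema.
u_umlaut = "\u00fc"
print(unicodedata.name(u_umlaut))

# Canonical decomposition (NFD) splits it into base letter + combining mark.
decomposed = unicodedata.normalize("NFD", u_umlaut)
names = [unicodedata.name(ch) for ch in decomposed]
print(names)
```

Both the character name and the combining mark are called DIAERESIS
regardless of the linguistic function of the dots.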

(Personally, I tend to think that the apparent preference for umlaut dots closer
to the letter than trema dots can be linked to extrinsic phenomena like the
preference for steep accents in French typography.)

Kristof Zelechovski wrote:

> Only the vowel U can have either

This is not quite right. All Latin vowels (a, e, i, o, u, y) can take the trema/diæresis
(ä, ë, ï, ö, ü in Dutch; ë, ï, ü*, ÿ** in French), and a, o, u can all be umlauted (ä, ö, ü
in German).

Moreover, the double-dot accent also has other uses (e.g., ? and ? both designate
a stressed schwa in Luxembourgeois), so it is probably not advisable
to attempt a complete classification in HTML.

Øistein E. Andersen

*) possibly only in the word capharnaüm (disregarding the highly unpopular
rectifications orthographiques of 1990) and in proper names
**) only in proper names
Received on Saturday, 23 June 2007 14:27:44 UTC
