[whatwg] Entity parsing [trema/diaeresis vs umlaut]

I had a look at the reference page you have directed me to: it actually
states that the ISO-8859-1 character set can be used for English.  Although
my hypothesis that the word ?ovre is not English remains valid (see also the
citations in the appendix), I admit that the fact that the ligature œ is not
included in the character set (and, consequently, that the character set
ISO-8859-1 cannot be used for encoding French text, which I find kind of
stunning because of the popularity of the French language) provides a much
simpler explanation of the observed phenomenon.  My fault, I should have
checked that first.
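
For anyone who wants to verify the gap, here is a minimal sketch in Python.
It assumes the standard "iso-8859-1" codec name; the helper name `encodable`
is made up for illustration.  The ligature is written as the escape \u0153
so that the snippet itself survives a Latin-1 mail path.

```python
def encodable(text, encoding="iso-8859-1"):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        # At least one character has no code point in this character set.
        return False


# \u0153 is the oe ligature (U+0153); \u00ef is i with diaeresis (U+00EF).
for word in ["hors d'oeuvre", "hors d'\u0153uvre", "na\u00efve", "c\u0153ur"]:
    print(ascii(word), encodable(word))
```

Only the two spellings containing the ligature should be reported as not
encodable.  Calling encodable(word, "iso-8859-15") accepts them, since that
later character set does include the ligature.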
Best regards
Chris

APPENDIX

Other Wikipedia entries also disagree, e.g.
<http://en.wikipedia.org/wiki/%C5%92>
Borrowings into English from Latin words featuring œ are often spelled with
the letter e, especially in American English. For example, fœderal became
federal in English, while fœtus became fetus only in American English. Other
œs in English spell out as 2 separate letters oe.
<http://en.wikipedia.org/wiki/List_of_words_that_may_be_spelled_with_a_ligature>
The use of the æ and œ is obsolescent in modern English, and has been used
predominantly in British English. It is usually used to evoke archaism, or
in literal quotations of historic sources.
<http://en.wikipedia.org/wiki/American_and_British_English_spelling_differences#Simplification_of_ae_.28.C3.A6.29_and_oe_.28.C5.93.29>
In English, which has imported words from all three languages, it is now
usual to replace Æ/æ with Ae/ae and Œ/œ with Oe/oe.

Microsoft Word does not accept hors d'œuvre but it has no problem with hors
d'oeuvre.  The American English International keyboard does not provide a
way to type the ligature œ.  The Microsoft Encarta dictionary does not
recognize such a spelling, nor does Reference.com.
The word coeur does not appear in any English dictionary I know of.

-----Original Message-----
From: Oistein E. Andersen [mailto:html5@xn--istein-9xa.com] 
Sent: Wednesday, June 27, 2007 11:44 PM
To: giecrilj at stegny.2a.pl; whatwg at whatwg.org
Subject: Re: [whatwg] Entity parsing [trema/diaeresis vs umlaut]

You might want to have a look at
http://pl.wikipedia.org/wiki/ISO_8859-1 .

Afterwards, consider the following:
1) Latin-1 does not contain all the characters that are required
for typesetting of English.

Received on Thursday, 28 June 2007 04:51:13 UTC