Re: HTML - i18n / NCR & charsets

Francois Yergeau (yergeau@alis.com)
Tue, 26 Nov 1996 17:05:13 -0500


Message-Id: <2.2.32.19961126220513.007123a8@genstar.alis.ca>
Date: Tue, 26 Nov 1996 17:05:13 -0500
To: Misha Wolf <MISHA.WOLF@reuters.com>
From: Francois Yergeau <yergeau@alis.com>
Subject: Re: HTML - i18n / NCR & charsets
Cc: www-html <www-html@w3.org>, www-international <www-international@w3.org>,

At 21:35 26-11-96 -0500, Misha Wolf wrote:
>If we are considering Web pages using Windows Code Pages, in which
>illegal numeric character references have been used for characters
>in the range 80-9F (decimal 128-159) then there will be no clash
>with anything in Unicode as these values do not represent characters
>in Unicode or, for that matter, in ISO 8859-X.  A permissive browser
>will simply map these to the expected characters.

Agreed, but this should not be construed as allowing such misuse of NCRs,
but only as being "liberal in what you accept", in good old Internet
tradition.
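
To make the "liberal in what you accept" behaviour concrete, here is a
minimal sketch in Python of how a permissive reader might resolve decimal
NCRs. The function name decode_ncrs_permissively is hypothetical, and the
choice of Windows-1252 as the code page the author had in mind is an
assumption; this is not taken from any particular browser.

```python
import re

# Matches decimal numeric character references such as &#146;
_NCR = re.compile(r"&#([0-9]+);")

REPLACEMENT = "\ufffd"  # U+FFFD, used when no sensible mapping exists


def decode_ncrs_permissively(text):
    """Hypothetical helper: resolve decimal NCRs, tolerating 128-159."""

    def resolve(match):
        code = int(match.group(1))
        if 0x80 <= code <= 0x9F:
            # Not a legal character reference; assume the author meant
            # the Windows-1252 byte with this value.
            try:
                return bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                # A few byte values (e.g. 0x81) are unassigned in cp1252.
                return REPLACEMENT
        if code > 0x10FFFF:
            # Beyond the Unicode code space.
            return REPLACEMENT
        return chr(code)

    return _NCR.sub(resolve, text)


if __name__ == "__main__":
    # &#146; is 0x92, the Windows-1252 right single quotation mark.
    print(decode_ncrs_permissively("It&#146;s here"))
```

Legal references such as &#233; pass through unchanged in meaning; only
the 128-159 range gets the code-page reinterpretation, so accepting the
broken pages does not alter how correct pages are read.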

Somebody else wrote:
>But *AGAIN* I acknowledge that there _should_ be no problems, people
>should not have relied on NCRs in the low top bit range; but they have
>done so. And if you have easy ways of marking your pages such that you do
>not break existing practice, you should do so.
>
>Dw.

If being liberal is not enough, and special marking is required, then the
broken pages with illegal NCRs should be so marked.  Doing otherwise (à la
text/html.i18n) would seem to indicate that the correct, standard way is
special, whereas the incorrect, non-standard way is normal. Just the
opposite of what a standard means.

--
François Yergeau <yergeau@alis.com>
Alis Technologies Inc., Montréal
Tél : +1 (514) 747-2547
Fax : +1 (514) 747-2561