Re: HTML - i18n / NCR & charsets

Walter Ian Kaye
Tue, 26 Nov 1996 19:42:16 -0800

Message-Id: <v0300780aaec1659ae557@[]>
In-Reply-To: <>

At 9:16p +0100 11/26/96, wrote:
>but I do insist on current practice being the problem.
>Doing a quick scan over all reachable pages linked in
>from the webdirectory ( last night; I do find a
>substantial number of pages which would be broken. About 7%/4K pages.
>Of these, about a fifth date from before RFC1866.
>But *AGAIN* I acknowledge that there _should_ be no problems, people
>should not have relied on NCRs in the low top bit range; but they have
>done so. And if you have easy ways of marking your pages such that you do
>not break existing practice, you should do so.

Not sure what "existing practice" is, or is expected to be, but if a person
is using a certain charset specified via HTTP (or <meta http-equiv=...>),
then why would numeric charrefs be needed in the first place? The only
possible reason would be to include characters in Latin-1 or Unicode, since
the page would already have all characters in the specified charset available.
Thus, using NCRs to address code points in a non-Unicode/Latin-1 charset makes
no sense, and any such pages deserve to break. :-)
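
To make the distinction concrete, here is a minimal sketch in Python (not something from this thread; the helper name `resolve` and the sample bytes are made up for illustration). It assumes the usual reading that raw bytes are interpreted through the declared charset while a numeric charref always names a Unicode/Latin-1 code point. The last line shows the "current practice" case from the quoted message: Python's `html.unescape` follows the later HTML5 rules and treats references in the 128-159 range as Windows-1252 characters.

```python
import html

def resolve(raw: bytes, charset: str) -> str:
    """Decode raw bytes with the declared charset, then expand charrefs.

    Hypothetical helper for illustration only.
    """
    return html.unescape(raw.decode(charset))

# One raw byte (0xB3) followed by a numeric charref for U+00E9.
sample = b"\xb3 &#233;"

# The raw byte changes meaning with the charset; the charref does not.
as_latin2 = resolve(sample, "iso-8859-2")   # byte 0xB3 is U+0142 in Latin-2
as_latin1 = resolve(sample, "iso-8859-1")   # byte 0xB3 is U+00B3 in Latin-1

# A reference in the 128-159 range, read the Windows-1252 way (HTML5 rule).
legacy = html.unescape("&#150;")
```

In both decodings the `&#233;` comes out as the same character, which is the point made above: the charref is only useful for characters the declared charset cannot express directly.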

What we really DO need are standard charset names! Netscape uses "x-mac-roman"
for the U.S. Macintosh character set, but that's obviously "experimental". We
need an official list of recommended charset names -- the file at IANA which
contains zillions of obscure aliases just doesn't fit the bill.
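
As a rough sketch of what a short recommended list would buy us, the Python below maps a few labels onto one preferred name each. The table and the function name `preferred_charset` are hypothetical and chosen only for illustration; the one outside fact assumed is that the IANA registry lists the Macintosh set under the name "macintosh".

```python
# Hypothetical alias table: each label seen in the wild maps to one
# preferred name. The entries are illustrative, not an official list.
PREFERRED = {
    "x-mac-roman": "macintosh",
    "macroman": "macintosh",
    "mac": "macintosh",
    "latin1": "ISO-8859-1",
    "iso_8859-1": "ISO-8859-1",
    "iso-8859-1": "ISO-8859-1",
}

def preferred_charset(label: str) -> str:
    """Return the preferred name for a charset label, or the label unchanged."""
    key = label.strip().lower()
    return PREFERRED.get(key, label)
```

With a table like this, `preferred_charset("X-Mac-Roman")` and `preferred_charset("mac")` both come back as "macintosh", while an unknown label is passed through untouched.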

    Walter Ian Kaye <>
    Mountain View, CA
    Programmer - Excel, AppleScript, ProTERM, FoxPro, HTML
    Musician - Guitarist, Songwriter