Re: HTML - i18n / NCR & charsets

At 9:16p +0100 11/26/96, Dirk.vanGulik@jrc.it wrote:
>but I do insist on current practice being the problem.
>Doing a quick scan over all reachable pages linked in
>from the webdirectory (www.webdirectory.com) last night; I do find a
>substantial number of pages which would be broken. About 7%/4K pages.
>Of these, about a fifth date from before RFC1866.
>
>But *AGAIN* I acknowledge that there _should_ be no problems, people
>should not have relied on NCRs in the low top bit range; but they have
>done so. And if you have easy ways of marking your pages such that you do
>not break existing practice, you should do so.

I'm not sure what "existing practice" is, or is expected to be. But if a page
declares a certain charset via HTTP (or <meta http-equiv=...>), then why would
numeric charrefs be needed in the first place? The only possible reason would
be to include Latin-1 or Unicode characters from outside that charset, since
every character in the declared charset is already available directly.
Thus, using NCRs as code positions in a non-Unicode/Latin-1 charset makes no
sense, and any such pages deserve to break. :-)
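
To make the distinction concrete, here is a minimal Python sketch. The page
body and the helper ncr_as_charset_byte are hypothetical, made up for
illustration; "mac_roman" is Python's codec name for the Macintosh set. It
assumes NCRs are read as Unicode/Latin-1 code points, and contrasts that with
the mistaken reading where the number is taken as a byte in the page's charset.

```python
import html

# Hypothetical page body, served with charset=x-mac-roman (Python calls the
# codec "mac_roman"). Byte 0x8E is e-acute in that charset.
raw = b"caf\x8e / caf&#233;"

# Decode the bytes with the declared charset, then resolve the NCR.
# Both spellings yield the same character: the raw byte via the charset,
# the NCR via its Unicode (here also Latin-1) code point 233.
text = html.unescape(raw.decode("mac_roman"))

def ncr_as_charset_byte(number, charset):
    """The mistaken reading: treat the NCR number as a byte in the charset."""
    return bytes([number]).decode(charset)

# The same number, 142, under the two readings.
unicode_reading = chr(142)                               # U+008E, a C1 control
charset_reading = ncr_as_charset_byte(142, "mac_roman")  # e-acute
```

A page that wrote &#142; expecting e-acute only works under the second
reading, which is the kind of page that would break.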

What we really DO need are standard charset names! Netscape uses "x-mac-roman"
for the U.S. Macintosh character set, but that's obviously "experimental". We
need an official list of recommended charset names -- the file at IANA which
contains zillions of obscure aliases just doesn't fit the bill.
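
As a rough sketch of what a short recommended list could look like in
practice, here is a hypothetical lookup table in Python. The labels on the
left and the preferred names on the right are illustrative assumptions only,
not an official recommendation.

```python
# Hypothetical mapping from labels seen in the wild to one preferred name
# each. The right-hand names are illustrative choices, not a standard.
PREFERRED_NAMES = {
    "x-mac-roman": "macintosh",
    "macroman": "macintosh",
    "latin1": "ISO-8859-1",
    "iso_8859-1": "ISO-8859-1",
    "iso-8859-1": "ISO-8859-1",
}

def preferred_charset(label):
    """Return the preferred name for a charset label, or None if unknown."""
    return PREFERRED_NAMES.get(label.strip().lower())
```

With a table like this, preferred_charset("X-Mac-Roman") gives "macintosh",
and an unlisted label gives None instead of a guess.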

__________________________________________________________________________
    Walter Ian Kaye <boo@best.com>     Programmer - Excel, AppleScript,
          Mountain View, CA                         ProTERM, FoxPro, HTML
 http://www.natural-innovations.com/     Musician - Guitarist, Songwriter

Received on Tuesday, 26 November 1996 22:59:09 UTC