RE: [Moderator Action] Using unicode in CGI programs from Rolfe, Russell D, ALSVC on 2000-02-10 (www-international@w3.org from January to March 2000)

From: Rolfe, Russell D, ALSVC <rrolfe@att.com>
Date: Thu, 10 Feb 2000 16:52:28 -0500
To: "'erik@netscape.com'" <erik@netscape.com>, lomen@hanimail.com
Cc: www-international@w3.org
Message-ID: <E5B80B001D76D211879C00E0291077610298C740@njc240po05.ho.att.com>

Erik,

Thanks for the references.

Also the webtool is cool!  8^)>

Regards, Russ

-----Original Message-----
From: erik@netscape.com [mailto:erik@netscape.com]
Sent: Tuesday, February 08, 2000 8:10 PM
To: lomen@hanimail.com
Cc: www-international@w3.org
Subject: Re: [Moderator Action] Using unicode in CGI programs

lomen@hanimail.com wrote:
>
> I am writing CGI programs that output Unicode HTML text.
> 
> Are there any problems with using Unicode?

Many users are still using Netscape 4.x, which has problems with Unicode
when the language is one that cannot be displayed using the Times and
Courier fonts. See the attached message.

> For example, "0xfeff" character or "Content-type:text/html\n\n"?

Putting the byte order mark ("BOM", 0xFEFF) at the very beginning of the
HTML document (immediately after the HTTP response headers) is a good
idea. Browsers use the BOM to auto-detect Unicode.
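
To make the auto-detection idea concrete, here is a minimal Python sketch of
how a receiver might look for a UTF-16 BOM at the start of a document body.
The function name sniff_utf16_bom is hypothetical, and this is only an
illustration of the principle, not a description of what any particular
browser actually does.

```python
import codecs


def sniff_utf16_bom(body: bytes):
    """Return (encoding, bom_length) if body starts with a UTF-16 BOM.

    Returns (None, 0) when no UTF-16 BOM is present.
    """
    # 0xFEFF serialized big-endian is the byte pair FE FF.
    if body.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be", len(codecs.BOM_UTF16_BE)
    # The same code point serialized little-endian is FF FE.
    if body.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le", len(codecs.BOM_UTF16_LE)
    return None, 0
```

The returned length lets the caller skip the BOM before decoding the rest of
the body with the detected encoding.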

It is also a good idea to add the charset parameter to your Content-Type
header:

  Content-Type: text/html; charset=ISO-10646-UCS-2

(By the way, does anyone know the status of the UTF-16 registration?)

Keep in mind that the "Content-Type" header and all of the other HTTP
response headers must be in ASCII (i.e. single byte, not double byte),
even if the HTML document itself is in Unicode. You can see an example
of a Unicode page here:

  http://www.fxis.co.jp/DMS/sgml/xml/charset/utf-16/utf16-be-dos.html

Try copying and pasting the above URL into my HTTP/HTML source viewer:

  http://webtools.mozilla.org/web-sniffer/
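
Putting the points above together, here is a minimal Python sketch of a CGI
response whose headers are plain ASCII and whose body is Unicode with a
leading BOM. It assumes big-endian byte order (as in the example page above)
and text limited to characters that fit in a single 16-bit unit; the helper
name build_cgi_response is hypothetical.

```python
import codecs


def build_cgi_response(html: str) -> bytes:
    """Return the raw bytes of a CGI response for a Unicode HTML page."""
    # Headers are single-byte ASCII; the blank line ends the header section.
    header = "Content-Type: text/html; charset=ISO-10646-UCS-2\n\n"
    header_bytes = header.encode("ascii")

    # The body starts with the BOM (FE FF), followed by the document
    # encoded as big-endian 16-bit units. The "utf-16-be" codec does not
    # add a BOM on its own, so it is prepended explicitly.
    body_bytes = codecs.BOM_UTF16_BE + html.encode("utf-16-be")

    return header_bytes + body_bytes
```

A CGI script would write the returned bytes to standard output in binary
mode (for example sys.stdout.buffer.write(...)), so that no text-mode
translation alters the double-byte body.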

Erik

Received on Thursday, 10 February 2000 16:53:25 UTC