Re: LANG + chars

From: Gavin Nicol <gtn@ebt.com>
Date: Thu, 25 Jul 1996 14:12:46 GMT
Message-Id: <199607251412.OAA11601@wiley.EBT.COM>
To: carrasco@innet.lu
CC: www-international@w3.org
>- Only one charset is allowed per document.

Correct.
 
>- What SHOULD be the default "document character set" for HTML ?
>  Latin1, Unicode ... ?

After the HTML I18N draft becomes a standard, the single required
document character set will be ISO 10646.
 
>- How should this be viewed:
>  + Many "document character sets" are allowed; e.g., ISO-8859-1, ISO-8859-7.
>  + Only (full 32 bits) 10646 is allowed.  The others are subsets.

With regard to Internet usage, especially taking into consideration
email and other protocols where the encoding can be transformed blindly, a
single document character set is the only thing that makes
sense. Everything else can be regarded as an encoding thereof.
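
A minimal sketch of this view, in Python. The sample Greek string and
the variable names are assumptions chosen purely for illustration; the
point is only that ISO-8859-7 and UTF-8 yield different bytes for the
same ISO 10646 code points, and that a numeric character reference
resolves against the document character set regardless of the encoding
used on the wire.

```python
import html

# Illustrative sample text (an assumption, not taken from the thread).
text = "Ελληνικά"

# Two different encodings of the same abstract characters.
as_8859_7 = text.encode("iso-8859-7")
as_utf8 = text.encode("utf-8")

# The byte sequences differ ...
bytes_differ = as_8859_7 != as_utf8

# ... but both decode to the same sequence of ISO 10646 code points.
code_points_a = [ord(c) for c in as_8859_7.decode("iso-8859-7")]
code_points_b = [ord(c) for c in as_utf8.decode("utf-8")]

# A numeric character reference names a code point in the document
# character set, so it means the same thing under either encoding.
alpha = html.unescape("&#945;")
```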
 
>- The charset for transmission SHOULD be whatever is appropriate for
>the data. 

Correct.
 
>- What is appropriate for the data ?
>  The client does not express any desire/restriction and the document is in
>  the server in ISO-8859-7.  Should the server send it in ISO-8859-7 or
>  in Unicode ?

This is a server implementation issue, and should not be within
standards. My personal feeling is to lean toward UTF-8 as soon as more
browsers support it.
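
One way a server might act on that preference is to transcode on the
way out. The sketch below is a hypothetical helper, not a description
of any particular server: the function name `transcode` and its
parameters are assumptions, and it presumes the stored charset is
already known.

```python
def transcode(stored_bytes, stored_charset, wire_charset="utf-8"):
    """Re-encode a stored document for transmission.

    Decoding maps the stored bytes to ISO 10646 code points; encoding
    then produces the bytes for the charset used on the wire.
    """
    text = stored_bytes.decode(stored_charset)
    return text.encode(wire_charset)


# A document kept on disk in ISO-8859-7, sent as UTF-8.
stored = "αβγ".encode("iso-8859-7")
sent = transcode(stored, "iso-8859-7")
```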
 
>- The server: "SHOULD or MUST ?" inform the client of the character
>set. 

Must. The client should likewise correctly label data sent to the
server. 
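
A rough sketch of what labeling looks like in practice, using the
charset parameter of the Content-Type header. The helper names `label`
and `decode_labeled` are hypothetical; the header parsing relies on
Python's standard `email.message.Message`. Refusing unlabeled data, as
done here, is one possible policy and an assumption of this sketch.

```python
from email.message import Message


def label(body_text, charset):
    """Encode the body and return (Content-Type value, payload bytes)."""
    payload = body_text.encode(charset)
    return f"text/html; charset={charset}", payload


def decode_labeled(content_type, payload):
    """Decode a payload using the charset named in its Content-Type."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # None if no charset parameter
    if charset is None:
        raise ValueError("unlabeled data: charset parameter missing")
    return payload.decode(charset)


header, payload = label("αβγ", "ISO-8859-7")
round_trip = decode_labeled(header, payload)
```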
 
>- The server SHOULD inform the client with Content-Language.

Yes.
 
Received on Thursday, 25 July 1996 10:14:27 GMT
