Re: LANG + chars
>- Only one charset is allowed per document.
Correct.
>- What SHOULD be the default "document character set" for HTML ?
> Latin1, Unicode ... ?
After the HTML I18N draft becomes a standard, the single required
document character set will be ISO 10646.
>- How should we view it:
> + Many "document character sets" are allowed; e.g., ISO-8859-1, ISO-8859-7.
> + Only (full 32 bits) 10646 is allowed. The others are subsets.
With regard to Internet usage, especially taking into consideration
email and other protocols where the encoding can be transformed blindly, a
single document character set is the only thing that makes
sense. Everything else can be regarded as an encoding thereof.
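
To illustrate the point, here is a minimal sketch, assuming Python and its
standard codecs (neither is part of the I18N draft; the sample text is my own
choice). The same three Greek characters are written out as ISO-8859-7 and as
UTF-8. The bytes differ, but both decode to the same ISO 10646 code points:

```python
# Three abstract characters from the document character set:
# GREEK CAPITAL LETTER ALPHA, BETA, GAMMA (U+0391..U+0393).
text = "\u0391\u0392\u0393"

# Two different encodings of the same characters.
as_8859_7 = text.encode("iso-8859-7")  # one byte per character
as_utf8 = text.encode("utf-8")         # two bytes per character here

# The byte sequences on the wire are not the same...
bytes_differ = as_8859_7 != as_utf8

# ...but decoding either one with its own label gives back identical
# code points, so the document itself is unchanged.
same_document = (
    as_8859_7.decode("iso-8859-7") == as_utf8.decode("utf-8") == text
)

code_points = [hex(ord(ch)) for ch in text]
```

A gateway that transcodes between the two encodings therefore changes only
the bytes, never the document, provided the label is updated to match.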
>- The charset for transmission SHOULD be whatever is appropriate for
>the data.
Correct.
>- What is appropriate for the data ?
> The client does not express any preference or restriction, and the document is
> stored on the server in ISO-8859-7. Should the server send it in ISO-8859-7 or
> in Unicode ?
This is a server implementation issue, and should not be within
standards. My personal feeling is to lean toward UTF-8 as soon as more
browsers support it.
>- The server: "SHOULD or MUST ?" inform the client of the character
>set.
Must. The client should likewise correctly label data sent to the
server.
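
As a rough sketch of why the label matters, assuming Python's standard
`email.message` module and a charset parameter carried on the Content-Type
header (the helper names `charset_from_content_type` and `decode_labelled`
are hypothetical, chosen for illustration only), a receiver can decode the
bytes reliably only when the sender has said what they are:

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the lowercased charset parameter of a Content-Type value,
    or `default` when the sender did not label the data."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset(default)


def decode_labelled(body, header_value):
    """Decode `body` strictly according to its label.

    Refusing to guess is a choice made for this sketch; real clients
    usually fall back to some default instead of failing.
    """
    charset = charset_from_content_type(header_value)
    if charset is None:
        raise ValueError("unlabelled data: charset parameter is missing")
    return body.decode(charset)


# Correctly labelled ISO-8859-7 bytes decode to the intended characters.
greek = decode_labelled(b"\xc1\xc2\xc3", "text/html; charset=ISO-8859-7")
```

Without the label, the same three bytes would be read as accented Latin
letters by a client assuming ISO-8859-1.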
>- The server SHOULD inform the client with Content-Language.
Yes.
References:
- LANG + chars
- From: "M.T. Carrasco Benitez" <carrasco@innet.lu>