Re: LANG= for character-mapping

From: Martin J Duerst <mduerst@ifi.unizh.ch>
Date: Wed, 24 Jul 1996 14:40:18 +0200 (MET DST)
To: gra@zeppo.East.Sun.COM (Gary Adams - Sun Microsystems Labs BOS)
Cc: MOURIK@rullet.LeidenUniv.nl, carrasco@innet.lu, www-international@w3.org
Message-ID: <"josef.ifi..075:24.06.96.12.40.21"@ifi.unizh.ch>
Gary Adams wrote:

>> Date: Wed, 24 Jul 1996 11:00:31 +0200 (MET DST)
>> From: "M.T. Carrasco Benitez" <carrasco@innet.lu>
>> Subject: Re: LANG= for character-mapping
>> 
>> 1) This is what I assume from the current proposals:
>> 
>> - Only one charset is allowed per document.
>
>Specifically, the HTML portion of a document (which is an SGML 
>application) is restricted to a single document character set.
>Documents have a variety of components embedded within them.
>Images, sound, executable content, etc. may have other 
>internationalization considerations as well as those needed for
>HTML rendering.

Please don't mix up the MIME "charset" parameter, which denotes
a character encoding (a mapping from an octet stream to a character
stream), and the document character set in the SGML sense, which
is a set of characters each associated with a positive integer.
A single HTML document (without the embedded components)
has a single character encoding and therefore a single MIME "charset"
parameter value when transmitted by HTTP or email. All HTML
documents have the same SGML document character set, namely
ISO 10646. (To be exact, the document character set may also
be ISO 8859-1, which is, however, just a subset of ISO 10646.)
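
To make the distinction concrete, here is a minimal sketch in Python
(my choice of language for illustration, not something taken from the
proposals). It assumes three example encodings and the character
U+00E9; the names `encodings`, `octets`, `code_points` and `ncr` are
hypothetical. The octet streams differ with the MIME "charset", but the
decoded character has the same number in the document character set,
and a numeric character reference refers to that number regardless of
the encoding.

```python
import html

# Three example MIME "charset" values, i.e. character encodings
# (chosen only for illustration).
encodings = ["iso-8859-1", "utf-8", "utf-16-be"]

# One character: LATIN SMALL LETTER E WITH ACUTE, number 233 in ISO 10646.
char = "\u00e9"

# The encoding decides which octets go over the wire.
octets = {name: char.encode(name) for name in encodings}

# Decoding each octet stream with its own encoding gives back the same
# character, and therefore the same number in the document character set.
code_points = {name: ord(data.decode(name)) for name, data in octets.items()}

# A numeric character reference is resolved against the document
# character set, independent of how the document was encoded.
ncr = html.unescape("&#233;")

for name in encodings:
    print(f"{name:12} octets={octets[name].hex()} code point={code_points[name]}")
print("&#233; resolves to the same character:", ncr == char)
```

The three octet sequences are all different, yet every one of them
decodes to character number 233, which is the point of keeping the two
notions apart.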

Regards,	Martin.
Received on Wednesday, 24 July 1996 08:40:47 GMT
