Re: LANG attribute

Martin J. Dürst (mduerst@ifi.unizh.ch)
Sat, 25 Oct 1997 16:46:34 +0100 (MET)


Date: Sat, 25 Oct 1997 16:46:34 +0100 (MET)
From: Martin J. Dürst <mduerst@ifi.unizh.ch>
To: Tim Bagot <timothy.bagot@keble.oxford.ac.uk>
cc: HTML mailing list <www-html@w3.org>
In-Reply-To: <Pine.OSF.3.96.971025151220.22686B-100000@sable.ox.ac.uk>
Message-ID: <Pine.SUN.3.96.971025164122.245A-100000@enoshima.ifi.unizh.ch>
Subject: Re: LANG attribute

On Sat, 25 Oct 1997, Tim Bagot wrote:

> Looking at the HTML 4.0 draft specification, I noticed that while it is
> possible to specify another language for part of a document, there does
> not seem to be a way to specify a character set encoding at the same time. 
> This means that if the language in question is incompatible with the
> document's encoding then (presumably) character entity references must be
> employed instead. An additional attribute, or perhaps an extension to the
> existing one, would make life a bit easier.

Hello Tim,

This was considered when we worked all these things out. The problem
is that there are virtually no editors that can handle raw text files
that change their encoding midway, and that it would complicate many
things far more than it would gain. Also, it would go against the
basic Internet and MIME models and modes of operation.

You can use character entity references, for those characters where
they are defined, and you can use numeric character references
(decimal and, new in HTML 4.0, hexadecimal) for all characters
in ISO 10646/Unicode. Also, you can use ISO 10646/Unicode directly,
and then you no longer need to switch character encodings at all.
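
To illustrate the numeric character reference route, here is a minimal
Python sketch, assuming a document stored in ISO 8859-1. The function
names and the sample string are made up for this example; only the
standard library is used.

```python
import html


def to_decimal_refs(text: str, encoding: str = "iso-8859-1") -> str:
    """Replace characters the document encoding cannot represent
    with decimal numeric character references (&#NNN;)."""
    # The "xmlcharrefreplace" error handler emits decimal references.
    return text.encode(encoding, errors="xmlcharrefreplace").decode(encoding)


def to_hex_refs(text: str, encoding: str = "iso-8859-1") -> str:
    """Same idea, but with hexadecimal references (&#xHHH;)."""
    pieces = []
    for ch in text:
        try:
            ch.encode(encoding)          # representable: keep as-is
            pieces.append(ch)
        except UnicodeEncodeError:       # not representable: use a reference
            pieces.append(f"&#x{ord(ch):X};")
    return "".join(pieces)


# Hypothetical sample: U+00FC is in ISO 8859-1, U+03A9 is not.
sample = "Z\u00fcrich \u03a9"

decimal_form = to_decimal_refs(sample)
hex_form = to_hex_refs(sample)

# Both forms decode back to the same Unicode text.
assert html.unescape(decimal_form) == sample
assert html.unescape(hex_form) == sample

# With a Unicode encoding such as UTF-8, no references are needed at all.
assert sample.encode("utf-8").decode("utf-8") == sample
```

The document keeps a single encoding throughout; only the characters
outside that encoding are written as references.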

Regards,	Martin.