Re: charset issues from Martin J. Duerst on 1996-12-11 (www-international@w3.org from October to December 1996)

From: Martin J. Duerst <mduerst@ifi.unizh.ch>
Date: Wed, 11 Dec 1996 16:52:28 +0100 (MET)
To: "Peter O.B. Mikes" <pom@llnl.gov>
cc: "J.Larmouth" <J.Larmouth@iti.salford.ac.uk>, www-international@w3.org
Message-ID: <Pine.SUN.3.95.961211164809.245H-100000@enoshima>

On Fri, 6 Dec 1996, Peter O.B. Mikes wrote:

>  Say, I want to have a text which will contain  the following fragment
> on one line:
> ----------------------------------------------------------
>       .. The German word ko"se becomes froma`ge  in French,  but  sy'r
> in Czech yet ... in  ....
> ---------------------------------------------------------------------

Just for the record: the German word for cheese is Ka"se, not ko"se.

>     With current system of static character sets I would need a charset
> which combines all
>  Latin-1 and Latin-2 and ...
>  
>   But, if you reserve ONE special character or tag or even just an
> attribute for this, I can write this one line  like this:
>  
> --------------------------------------
>        .. The German word <font charset=German > ko"se </font> becomes
> <font charset=French> froma`ge </font>
>          but <font charset=CZ > sy'r </font> ...
> -----------------------------------------------------

This is definitely the wrong way to do it.
There already is a "reserved character", and it is "&". It starts
a numeric character reference (for example, &#228; for a-umlaut),
which allows you to include any Unicode character in your file.
Even better, just use Unicode for the whole file. In that case,
you will even find an editor that is able to show you the file
in a reasonable way when you edit it.
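
To make the numeric character reference approach concrete, here is a
minimal sketch in Python. The helper name to_ncr and the sample line
are made up for illustration; the sketch relies only on the standard
"xmlcharrefreplace" error handler and on html.unescape. It turns every
non-ASCII character into a decimal reference, so the file itself stays
plain ASCII whatever mix of languages it contains.

```python
import html


def to_ncr(text: str) -> str:
    """Replace each non-ASCII character with a decimal numeric character reference."""
    # The "xmlcharrefreplace" handler emits &#NNN; for anything ASCII cannot encode.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Hypothetical sample line; the \u escapes keep this source file pure ASCII.
line = "German K\u00e4se, Czech s\u00fdr"

encoded = to_ncr(line)
print(encoded)  # German K&#228;se, Czech s&#253;r

# A reader of the file resolves the references back to the original characters.
decoded = html.unescape(encoded)
assert decoded == line

# The alternative: store the whole file in a Unicode encoding such as UTF-8.
utf8_bytes = line.encode("utf-8")
assert utf8_bytes.decode("utf-8") == line
```

No per-word charset switching is needed in either variant: the
references (or the UTF-8 bytes) identify each character by its Unicode
code point, independent of the language of the surrounding word.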

Regards,	Martin.

Received on Wednesday, 11 December 1996 10:54:32 UTC