Re: character problems

Marcus E. Hennecke (marcush@crc.ricoh.com)
Wed, 3 Jul 1996 12:26:33 -0700 (PDT)


Date: Wed, 3 Jul 1996 12:26:33 -0700 (PDT)
From: "Marcus E. Hennecke" <marcush@crc.ricoh.com>
Message-Id: <199607031926.MAA08065@cougar.crc.ricoh.com>
To: www-html@w3.org, martinmueller@nwu.edu
Subject: Re: character problems

On Wed, 03 Jul 1996 14:02:50 -0500, Martin Mueller <martinmueller@nwu.edu> wrote:
>  Such as finding a simpler way of writing sentences like "The
> German word for 'girl' is 'M&auml;dchen'," or "The French word for 'summer'
> is '&eacute;t&eacute;'."

But there is. As long as you ensure that the character set is ISO 8859-1
you can use all the characters directly without having to resort to the
entities: Mädchen, été.
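
To illustrate what "directly" means here, a minimal Python sketch (not part
of the original exchange, and the variable names are only for illustration):
in ISO 8859-1 each of these accented characters is a single byte.

```python
# Encode the two example words as ISO 8859-1 (Latin-1).
# Every character, accented or not, becomes exactly one byte.
words = ["Mädchen", "été"]
encoded = {w: w.encode("iso-8859-1") for w in words}

for word, raw in encoded.items():
    # Show the raw bytes as space-separated hex pairs.
    print(word, raw.hex(" "))
```

The page must of course actually be served and interpreted as ISO 8859-1
for those bytes to display as intended.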

> I understand there are some solutions for this. But what if you have an
> entry form for a search?

Hmm, not sure if Netscape and MSIE correctly convert the user input to
ISO 8859-1 (probably). However, the real problem is usually that the
search engine doesn't handle the accents well. For example, if a document
contains M&auml;dchen and you type in Mädchen as a keyword, then many search
engines are not able to see the equivalence.
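
One way a search engine could bridge that gap, sketched in Python under the
assumption that both the indexed text and the query are normalized before
comparison. The helper name `normalize` is hypothetical and this is not how
any particular engine works:

```python
import html


def normalize(text: str) -> str:
    """Decode HTML character entities and fold case, so the entity
    spelling and the direct spelling of a word compare equal."""
    return html.unescape(text).casefold()


document_word = "M&auml;dchen"   # as written in the HTML source
query = "Mädchen"                # as typed into the search form

# After normalization both sides are the same string.
match = normalize(document_word) == normalize(query)
print(match)
```

Matching the accent-free spelling "Madchen" as well would need an extra
accent-folding step, which this sketch deliberately leaves out.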

> Will there soon be a time when what are really quite trivial problems of
> character representation will have a solution that won't turn off any but
> the most determined user?

It is far more complicated when character sets other than ISO 8859-1 are
to be allowed in forms. However, yes, people are thinking about it.

Marcus E. Hennecke
marcush@crc.ricoh.com        http://www.crc.ricoh.com/~marcush/