Re: Text on character encoding: single platform vs web from Jeremy Carroll on 2005-10-11 (public-i18n-geo@w3.org from October 2005)

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Tue, 11 Oct 2005 14:12:16 +0100
To: Martin Duerst <duerst@it.aoyama.ac.jp>
CC: public-i18n-geo@w3.org
Message-ID: <434BBA30.60106@hplb.hpl.hp.com>

Martin Duerst wrote:
> Hello Jeremy,
> 
> I think this text could be very helpful with a bit of work.
> 
> However, I think the problem is that it assumes that each platform
> has exactly one platform encoding, and only that one should ever
> be used. 

The problem that the text was initially addressing was that we had been 
shipping code for a couple of years that encouraged people to read 
and write their RDF/XML files in the encoding that their Java session 
started up in, and the fix involved persuading our users to change their 
coding idioms. Our tutorial information was incorrect and had to be 
fixed, and we had a user base who had already read the incorrect tutorial.
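
To illustrate the kind of idiom at issue (a minimal sketch, not the code from 
the tutorial in question; the class and method names are hypothetical): a 
Writer created without a charset uses the JVM's default encoding, which on 
JVMs before Java 18 follows the platform settings, whereas naming UTF-8 
explicitly gives the same bytes on every machine. StandardCharsets requires 
Java 7 or later; older code would pass the string "UTF-8" instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of any RDF toolkit.
final class EncodingIdiom {

    // Fragile idiom: no charset given, so the JVM default encoding is used.
    // The bytes produced can differ from one machine or session to another.
    static byte[] writeWithDefault(String xml) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out)) {
            w.write(xml);
        }
        return out.toByteArray();
    }

    // Portable idiom: the encoding is stated explicitly as UTF-8.
    static byte[] writeUtf8(String xml) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(xml);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String xml = "<a>caf\u00e9</a>";
        // The UTF-8 length is fixed; the default-encoding length depends on the session.
        System.out.println("UTF-8 bytes:   " + writeUtf8(xml).length);
        System.out.println("default bytes: " + writeWithDefault(xml).length);
    }
}
```

The same reasoning applies to FileWriter and FileReader when they are 
constructed without a charset.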

I think Martin's points are valid, but they perhaps lead away from the key 
point of my text, which is that:
- for output to the Web,
  UTF-8 is the best encoding;
- for input from the Web,
  you cannot assume any encoding, but should do your best with whatever 
  you find.
This may differ from the way you make encoding choices for your local 
system.
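
For the input side, one way to "do your best" with XML is to hand the parser 
the raw bytes rather than a Reader, so that it can act on the encoding 
declaration in the document itself. The following is a sketch under that 
assumption, using the standard JAXP DocumentBuilder; the class and method 
names are hypothetical, and HTTP headers (which may also carry a charset) 
are not considered here.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical illustration class; not part of any RDF toolkit.
final class WebInput {

    // Parse from bytes so the XML parser detects the encoding itself,
    // instead of decoding with the JVM default before parsing.
    static String rootText(byte[] bytes) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            Document doc = builder.parse(in);
            return doc.getDocumentElement().getTextContent();
        }
    }

    public static void main(String[] args) throws Exception {
        String latin1 = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a>caf\u00e9</a>";
        String utf8 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>caf\u00e9</a>";

        // Same text, two different byte encodings, both declared in the document.
        String fromLatin1 = rootText(latin1.getBytes(StandardCharsets.ISO_8859_1));
        String fromUtf8 = rootText(utf8.getBytes(StandardCharsets.UTF_8));

        System.out.println(fromLatin1.equals(fromUtf8));
    }
}
```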

Jeremy

Received on Tuesday, 11 October 2005 13:12:41 UTC