
Re: charset

From: Václav Jůza <vaclavjuza@seznam.cz>
Date: Wed, 29 Jan 2003 14:35:36 -0500 (EST)
Message-ID: <006d01c2c7cd$94f06200$4c6614d4@jza>
To: "Irene Vatton" <irene.vatton@inrialpes.fr>, <www-amaya@w3.org>

Example of the bug:
 character ' ř ':
  Unicode value: 0x159=345, in UTF-8 hex C5 99
  iso-8859-1: doesn't contain it
  Win1250 (cp1250, the Windows system charset for Central European languages): 0xF8=248
  In Amaya it is saved (doctype XHTML 1.1, but same behavior for HTML 4.01)
   as &#xF8; in "us-ascii" (!!!)
   as asc(0x3E) in iso-8859-1 (???)
   as hex C3 B8 in UTF-8, which corresponds to 0xF8
  but cp1250 is not mentioned anywhere in the document, and in iso-8859-1 and
Unicode this value corresponds to 'ø'.
 So it MUST be a bug in Amaya.
 The us-ascii charset is, as far as I know, 7-bit.
  In Mozilla Composer (for example) it is saved
   as &#345; in the x-user-defined charset and also in iso-8859-1
   as hex C5 99 in UTF-8
The problem occurs whenever the character is in the system charset but absent
from iso-8859-1 (i.e. the first 256 code points of Unicode).
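
The byte values above can be checked with a short Python sketch. It is only an
illustration and says nothing about how Amaya is implemented; the helper names
(system_byte, misread_as_code_point, char_ref) are made up for this example,
and the "misread" step is just one guess at what could produce the observed
output: the cp1250 byte being used directly as a Unicode code point.

```python
CHAR = "\u0159"  # the character 'ř' from the example (U+0159 = 345)


def system_byte(ch, charset="cp1250"):
    """Return the single byte that encodes ch in the given 8-bit charset."""
    return ch.encode(charset)[0]


def misread_as_code_point(ch, charset="cp1250"):
    """Hypothetical faulty step: use the system-charset byte as a code point."""
    return chr(system_byte(ch, charset))


def char_ref(ch, hexadecimal=False):
    """Build a numeric character reference for ch."""
    return f"&#x{ord(ch):X};" if hexadecimal else f"&#{ord(ch)};"


def fits_latin1(ch):
    """True if ch can be encoded in iso-8859-1."""
    try:
        ch.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


wrong = misread_as_code_point(CHAR)

print("UTF-8 of the real character: ", CHAR.encode("utf-8").hex(" "))
print("cp1250 byte:                 ", hex(system_byte(CHAR)))
print("in iso-8859-1?               ", fits_latin1(CHAR))
print("expected reference:          ", char_ref(CHAR))
print("misread character:           ", wrong)
print("misread reference:           ", char_ref(wrong, hexadecimal=True))
print("misread UTF-8:               ", wrong.encode("utf-8").hex(" "))
```

Under that assumption, the misread character is 'ø' (U+00F8), whose reference
is &#xF8; and whose UTF-8 form is C3 B8, matching two of the three values
saved by Amaya. The correct values, &#345; and C5 99, match what Mozilla
Composer writes.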
Received on Thursday, 30 January 2003 04:25:34 UTC
