
[whatwg] ISO-8859-* and the C1 control range

From: Øistein E. Andersen <html5@xn--istein-9xa.com>
Date: Tue, 05 Jun 2007 19:40:34 +0200
Message-ID: <E1Hvd1G-0004eb-Kl@node1-2.ouvaton.local>
Neither "ISO-8859-11" nor "Windows-874" appears in the list
of IANA-approved character sets:
    http://www.iana.org/assignments/character-sets
On the other hand, "TIS-620" (identical to ISO-8859-11
except that 0xA0 is left undefined) has been sanctioned by IANA.
Perhaps Henri Sivonen could add a test for TIS-620?
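
As a rough illustration of the TIS-620 / ISO-8859-11 comparison, here is a minimal sketch that diffs two single-byte encodings byte by byte. It assumes Python's bundled codecs `tis_620` and `iso8859_11` as stand-ins for the two encodings; those tables are Python's own and may not match the IANA registrations or the standards exactly. The helper names `decode_byte` and `diff_codecs` are made up for this example.

```python
def decode_byte(codec, value):
    """Return the character a single byte decodes to, or None if the
    codec leaves that byte undefined."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


def diff_codecs(codec_a, codec_b):
    """Map each byte value to (char_a, char_b) wherever the two
    single-byte codecs disagree, including defined vs undefined."""
    diffs = {}
    for value in range(256):
        a = decode_byte(codec_a, value)
        b = decode_byte(codec_b, value)
        if a != b:
            diffs[value] = (a, b)
    return diffs


if __name__ == "__main__":
    # Codec names are Python's; treat the result as a property of
    # Python's mapping tables, not of the standards themselves.
    for value, (a, b) in sorted(diff_codecs("tis_620", "iso8859_11").items()):
        print(f"0x{value:02X}: tis_620={a!r} iso8859_11={b!r}")
```

The same function can be pointed at any other pair of single-byte codecs, for instance `diff_codecs("latin-1", "cp1252")`.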

(To do this properly, what we really ought to do is look for
C1 and undefined characters in all IANA charsets and semi-official
mappings to Unicode and check 1) whether the gaps can be filled
by borrowing from other encodings, and 2) whether browsers
actually do so. It would probably be acceptable to require
specific treatment for ISO-8859-1 bytes, given the encoding's
special status and the fact that NCRs need this treatment anyway,
but it seems difficult to defend exceptions for one Thai encoding
without actually investigating whether similar measures might
be appropriate for other encodings as well.)
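
Step 1 of the survey suggested above could be sketched roughly as follows. This is only an approximation: it uses Python's codec tables rather than the IANA registry or the semi-official mapping files, it only makes sense for single-byte encodings, and the choice of `cp1252` as the default donor encoding is an assumption. The names `classify_c1` and `fillable_gaps` are hypothetical. Step 2 (what browsers actually do) cannot be answered this way and would need real tests.

```python
def classify_c1(codec):
    """Classify each byte 0x80-0x9F of a single-byte codec as
    'c1' (maps to a C1 control), 'undefined', or 'assigned'."""
    result = {}
    for value in range(0x80, 0xA0):
        try:
            char = bytes([value]).decode(codec)
        except UnicodeDecodeError:
            result[value] = "undefined"
            continue
        # U+0080..U+009F are the C1 control characters.
        result[value] = "c1" if 0x80 <= ord(char) <= 0x9F else "assigned"
    return result


def fillable_gaps(codec, donor="cp1252"):
    """Return the bytes that are C1 or undefined in `codec` but map to
    an ordinary character in `donor` (the donor is an assumption)."""
    own = classify_c1(codec)
    other = classify_c1(donor)
    return sorted(
        value for value in own
        if own[value] != "assigned" and other[value] == "assigned"
    )


if __name__ == "__main__":
    # An arbitrary sample of Python codec names, not the IANA list.
    for name in ("latin-1", "iso8859_11", "tis_620", "cp874"):
        gaps = fillable_gaps(name)
        print(name, [f"0x{v:02X}" for v in gaps])
```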

-- 
Øistein E. Andersen
Received on Tuesday, 5 June 2007 10:40:34 UTC
