[whatwg/encoding] Confusion between KOI8-U and KOI8-RU encodings (#74) from Bruno Haible on 2016-10-03 (public-webapps-github@w3.org from October 2016)

From: Bruno Haible <notifications@github.com>
Date: Mon, 03 Oct 2016 14:14:57 -0700
To: whatwg/encoding <encoding@noreply.github.com>
Message-ID: <whatwg/encoding/issues/74@github.com>

The current draft maps the labels "koi8-u" and "koi8-ru" to a single encoding, and the mapping table that it uses (index-koi8-u.txt) does not match either of the widely used mapping tables for KOI8-U and KOI8-RU.

In detail:

Glibc and other software consider KOI8-U and KOI8-RU to be different.
KOI8-U was defined through an RFC (RFC 2319), see https://en.wikipedia.org/wiki/KOI8-U.
KOI8-RU was defined through a draft RFC that was never finalized, see https://www.terena.org/activities/multiling/koi8-ru/

The mapping in index-koi8-u.txt differs from the common KOI8-U mapping at positions 0xAE and 0xBE.

The mapping in index-koi8-u.txt differs from the common KOI8-RU mapping at the positions 0x93, 0x96..0x99, 0x9B..0x9D, 0x9F.
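
A comparison like the two above can be reproduced with a short script. The following is only a minimal sketch: it assumes the index file uses the usual single-byte index layout (comment lines starting with `#`, then `pointer<TAB>0xCODEPOINT<TAB>description`, with pointer 0 standing for byte 0x80), and the helper names `parse_whatwg_index`, `codec_table` and `diff_positions` are made up for illustration. Whether a given library's "koi8_u" table matches the RFC table is itself an assumption that should be checked.

```python
def parse_whatwg_index(text):
    """Parse a single-byte index file into {byte: code point}.

    Assumes pointer 0 corresponds to byte 0x80 (hypothetical helper).
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pointer, code_point = line.split()[:2]
        table[0x80 + int(pointer)] = int(code_point, 16)
    return table


def codec_table(name):
    """Build {byte: code point} for bytes 0x80..0xFF from a Python codec."""
    table = {}
    for byte in range(0x80, 0x100):
        try:
            table[byte] = ord(bytes([byte]).decode(name))
        except UnicodeDecodeError:
            pass  # byte is unmapped in this codec
    return table


def diff_positions(a, b):
    """Return the sorted byte positions where two tables disagree."""
    return sorted(byte for byte in set(a) | set(b) if a.get(byte) != b.get(byte))


if __name__ == "__main__":
    # Assumes index-koi8-u.txt has been downloaded next to this script.
    with open("index-koi8-u.txt", encoding="utf-8") as f:
        index = parse_whatwg_index(f.read())
    for byte in diff_positions(index, codec_table("koi8_u")):
        print(hex(byte))
```

The same `diff_positions` call works for any second table, for example one parsed from the conversion tables linked below.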

For details about the mapping tables, see
http://haible.de/bruno/charsets/conversion-tables/index.html
http://haible.de/bruno/charsets/conversion-tables/KOI8-U.html
http://haible.de/bruno/charsets/conversion-tables/KOI8-RU.html

Do you have data about who actually uses KOI8-U and KOI8-RU and how?
Is it necessary to deal with these two encodings at all?


-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/74

Received on Monday, 3 October 2016 21:15:32 UTC