[Bug 27868] New: EUC-KR and decoding-only mapping

From: <bugzilla@jessica.w3.org>
Date: Tue, 20 Jan 2015 18:54:26 +0000
To: www-international@w3.org
Message-ID: <bug-27868-4285@http.www.w3.org/Bugs/Public/>

            Bug ID: 27868
           Summary: EUC-KR and decoding-only mapping
           Product: WHATWG
           Version: unspecified
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Encoding
          Assignee: annevk@annevk.nl
          Reporter: jshin@chromium.org
        QA Contact: sideshowbarker+encodingspec@gmail.com
                CC: mike@w3.org, www-international@w3.org

When I compared the mapping of EUC-KR in the encoding spec with ICU's
Windows-949 [1] (which was obtained by scraping *one of Windows' converters*), I
found the following differences:

1. ICU's Windows-949 mapping has 395 'decoding only' (from Unicode to
windows-949) entries for characters like the cent and pound signs (U+00A2,
U+00A3), regular Latin/Greek/Cyrillic letters, Hangul conjoining jamo (U+11xx),
half-width Hangul jamo (U+FFxx), enclosed CJK characters (e.g. U+32xx), etc.

2. ICU's Windows-949 has 190 additional round-trip mapping entries. Most of
them (188) are for the two user-defined blocks in KS X 1001 (in EUC-KR,
"C9 [A1-FE]" and "FE [A1-FE]"), which are mapped to PUA code points
(U+E000 - U+E0BB). The remaining two are U+0080 and U+F8F7 mapped to 0x80 and

I don't think that we want to support the two user-defined blocks in KS X 1001.
I'm not sure about U+0080 and U+F8F7. 
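
For reference, the arithmetic behind the 188 entries can be checked with a
short sketch. It assumes the PUA code points are assigned sequentially from
U+E000, row C9 first and then row FE; that ordering is an assumption made for
illustration, and user_defined_pua_map is a hypothetical helper, not part of
ICU or the encoding spec.

```python
# Hypothetical helper: enumerate the two KS X 1001 user-defined rows
# (lead bytes C9 and FE, trail bytes A1-FE) and pair each byte sequence
# with a PUA code point.
# Assumption: code points are assigned sequentially starting at U+E000.
def user_defined_pua_map(first_pua=0xE000):
    mapping = {}
    code_point = first_pua
    for lead in (0xC9, 0xFE):
        for trail in range(0xA1, 0xFE + 1):
            mapping[bytes([lead, trail])] = code_point
            code_point += 1
    return mapping


pua_map = user_defined_pua_map()
print(len(pua_map))                # 188
print(hex(max(pua_map.values())))  # 0xe0bb
```

Two rows of 94 trail bytes each give 188 entries, which fills U+E000 through
U+E0BB exactly.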

However, I believe that many (though not all) of the 'decoding only' entries
should be supported.
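
One way to tally the entries by category is to read the precision flag at the
end of each line of an ICU .ucm file (|0 is round-trip, |1 is a fallback from
Unicode to the code page only). The following is a minimal sketch of that;
parse_ucm is a hypothetical helper and the SAMPLE lines are illustrative
entries written in .ucm syntax, not copied from ICU's windows-949 file.

```python
import re
from collections import Counter

# A .ucm mapping line looks like: <UXXXX> \xHH[\xHH...] |flag
UCM_LINE = re.compile(
    r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)"
)


def parse_ucm(lines):
    """Yield (code_point, byte_sequence, flag) for each mapping line."""
    for line in lines:
        m = UCM_LINE.match(line.strip())
        if not m:
            continue  # skip headers, comments, and blank lines
        code_point = int(m.group(1), 16)
        hex_pairs = re.findall(r"\\x([0-9A-Fa-f]{2})", m.group(2))
        raw = bytes(int(h, 16) for h in hex_pairs)
        yield code_point, raw, int(m.group(3))


# Illustrative sample only (assumed entries, not taken from ICU's data).
SAMPLE = [
    r"<U00A2> \xA1\xCB |1",
    r"<UFFE0> \xA1\xCB |0",
    r"<UE000> \xC9\xA1 |0",
]

counts = Counter(flag for _, _, flag in parse_ucm(SAMPLE))
print(counts[0], counts[1])  # 2 1
```

Running the same tally over the real mapping file and over the encoding spec's
index would give the per-category counts to compare.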

Received on Tuesday, 20 January 2015 18:54:29 UTC