- From: <bugzilla@jessica.w3.org>
- Date: Wed, 21 Jan 2015 23:44:59 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27878

--- Comment #8 from Jungshik Shin <jshin@chromium.org> ---
(In reply to Philip Jägenstedt from comment #7)
> Have you tested all the index entries which have duplicate Unicode points. I
> currently count (grep -F '(' | awk '{print $2}' | sort | uniq -c | grep -vw
> 1 | wc -l) 100 such cases in https://encoding.spec.whatwg.org/index-big5.txt
>
> If there are only a handful of cases where the order needs to be reversed,
> perhaps special-casing those in the encoder would be the simplest.

I skimmed over all of them and found no other pairs.

I also looked for all the decode-only entries in windows-950-2000.ucm (ICU).
There are only 10 of them, including U+5341 and U+5345. The following
additional characters are incompatible with the encoding spec's big5.
(Firefox 35 does the same).

U+2550
U+255E
U+2561
U+256A

They're all box-drawing characters and are placed in row 0xF9 (for round-trip),
while their 0xA2 positions are for decoding only. Other box-drawing characters
are placed in row 0xA2 in Big5 for round-trip, while their 0xF9 positions are
for decoding only. I don't know if there's any logic behind this difference
between the two groups.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
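
For anyone who wants to repeat the duplicate count from the quoted shell pipeline without grep/awk, here is a minimal Python sketch of the same idea. It assumes the index text uses whitespace-separated lines of the form "pointer, code point, character (name)", which is what the pipeline relies on. The function name `duplicated_code_points` and the sample rows are hypothetical, made up for illustration; they are not real entries from index-big5.txt. Unlike the pipeline, the sketch also skips lines starting with `#`.

```python
from collections import Counter


def duplicated_code_points(index_text):
    """Return the sorted code points that appear on more than one index line."""
    counts = Counter()
    for line in index_text.splitlines():
        stripped = line.strip()
        # Assumption: comment lines start with '#'; the shell pipeline
        # does not filter these, this sketch does.
        if not stripped or stripped.startswith("#"):
            continue
        if "(" not in stripped:          # mirrors: grep -F '('
            continue
        fields = stripped.split()
        if len(fields) < 2:
            continue
        counts[fields[1]] += 1           # mirrors: awk '{print $2}'
    # mirrors: sort | uniq -c | grep -vw 1
    return sorted(cp for cp, n in counts.items() if n > 1)


# Made-up sample rows (hypothetical pointers and names), not real index data.
SAMPLE = "\n".join([
    "# illustrative data only (not from the spec)",
    "1\t0x5341\tA (SAMPLE ONE)",
    "2\t0x5345\tB (SAMPLE TWO)",
    "3\t0x5341\tA (SAMPLE ONE)",
])

if __name__ == "__main__":
    dups = duplicated_code_points(SAMPLE)
    print(len(dups), dups)               # mirrors: wc -l, plus the list itself
```

To run it against the real file, read a locally saved copy of index-big5.txt and pass its contents as `index_text`; the length of the returned list corresponds to the number the pipeline prints.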
Received on Wednesday, 21 January 2015 23:45:00 UTC