[Bug 27878] Big5 : handling of U+5341(and potentially other dupe points) is incompatible with Firefox, Chrome and IE 11

From: <bugzilla@jessica.w3.org>
Date: Wed, 21 Jan 2015 23:44:59 +0000
To: www-international@w3.org
Message-ID: <bug-27878-4285-niocihDaPm@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27878

--- Comment #8 from Jungshik Shin <jshin@chromium.org> ---
(In reply to Philip Jägenstedt from comment #7)
> Have you tested all the index entries which have duplicate Unicode points. I
> currently count (grep -F '(' | awk '{print $2}' | sort | uniq -c | grep -vw
> 1 | wc -l) 100 such cases in https://encoding.spec.whatwg.org/index-big5.txt
> 
> If there are only a handful of cases where the order needs to be reversed,
> perhaps special-casing those in the encoder would be the simplest.

I skimmed over all of them and I found no other pairs.
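
For anyone who wants to repeat that count without the shell pipeline, here is a
rough Python equivalent. It is only a sketch: it assumes the index file has
whitespace-separated "pointer, code point, character (name)" lines with "#"
comments, and the function name and SAMPLE data are made up for illustration.

```python
from collections import Counter


def duplicated_code_points(index_text):
    """Return {code point field: count} for code points listed more than once."""
    counts = Counter()
    for line in index_text.splitlines():
        line = line.strip()
        # The shell pipeline keeps only lines containing "("; here we skip
        # blank lines and "#" comments instead, which is assumed equivalent.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        counts[fields[1]] += 1  # second field is the code point, e.g. 0x5341
    return {cp: n for cp, n in counts.items() if n > 1}


# Toy input in the assumed format (not a copy of the real index file).
SAMPLE = (
    "# toy index\n"
    "5287\t0x5341\t十 (<CJK Ideograph>)\n"
    "5512\t0x5341\t十 (<CJK Ideograph>)\n"
    "5513\t0x535C\t卜 (<CJK Ideograph>)\n"
)

if __name__ == "__main__":
    dupes = duplicated_code_points(SAMPLE)
    print(len(dupes), sorted(dupes))
```

Running it over the full text of index-big5.txt should give the number of
duplicated code points, comparable to the figure quoted above.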

I also looked for all the decode-only entries in windows-950-2000.ucm (ICU).
There are only 10 of them, including U+5341 and U+5345.

The following additional characters are handled incompatibly with the encoding
spec's big5 (Firefox 35 does the same):

U+2550
U+255E
U+2561
U+256A

They're all box-drawing characters whose round-trip positions are in row 0xF9,
while their 0xA2 positions are for decoding only.

Other box-drawing characters have their round-trip positions in row 0xA2 in
Big5, while their 0xF9 positions are for decoding only.

I don't know if there's any logic behind this difference between the two groups.
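
To make the special-casing suggested in comment #7 concrete, here is a minimal
Python sketch, not a proposal for spec text. It assumes an encoder that
normally picks the first (lowest) pointer for a code point and switches to the
last one for the six code points named in this thread; treating U+5345 the same
way as U+5341 is an assumption. The names PREFER_LAST, TOY_INDEX and
encode_pointer are hypothetical, the two U+5341 pointers are derived from the
byte positions 0xA2CC and 0xA451 and should be checked against index-big5.txt,
and the U+E000 entries are invented placeholders.

```python
# Code points assumed to need the later (round-trip) position.
PREFER_LAST = {0x5341, 0x5345, 0x2550, 0x255E, 0x2561, 0x256A}

# Toy pointer -> code point table; NOT the real index.
TOY_INDEX = {
    5287: 0x5341,  # corresponds to bytes 0xA2 0xCC (decode-only side)
    5512: 0x5341,  # corresponds to bytes 0xA4 0x51 (round-trip side)
    100: 0xE000,   # hypothetical duplicate that keeps the first pointer
    200: 0xE000,
}


def pointer_to_bytes(pointer):
    """Convert a Big5 index pointer to its lead/trail byte pair."""
    lead = pointer // 157 + 0x81
    trail = pointer % 157
    offset = 0x40 if trail < 0x3F else 0x62
    return bytes([lead, trail + offset])


def encode_pointer(index, code_point, prefer_last=PREFER_LAST):
    """Pick the pointer to encode with, or None if the code point is absent."""
    pointers = sorted(p for p, cp in index.items() if cp == code_point)
    if not pointers:
        return None
    # Special-cased code points use the last pointer, all others the first.
    return pointers[-1] if code_point in prefer_last else pointers[0]


if __name__ == "__main__":
    p = encode_pointer(TOY_INDEX, 0x5341)
    print(p, pointer_to_bytes(p).hex())
```

With the toy table, U+5341 is encoded from the later pointer, while the
placeholder duplicate keeps the first one.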

Received on Wednesday, 21 January 2015 23:45:00 UTC