
Big5-UAO and Big5-2003

From: Philip Jägenstedt <philipj@opera.com>
Date: Sun, 22 Apr 2012 22:00:35 +0200
To: "public-html-ig-zh@w3.org" <public-html-ig-zh@w3.org>
Message-ID: <op.wc6m69c1sr6mfa@localhost.localdomain>
To better understand the origins of Big5-UAO, I've taken a closer look at  
the last release of Unicode 補完計畫, 2.50 from 2006.[1] The link on  
cpatch.org doesn't work, but one can still find it elsewhere.[2] Installing  
it on Windows 7 doesn't work, so I followed a blog post [3] explaining how  
to manually extract the cp950. This does make the UAO test page [4] work  
in IE. (Note that the test page uses Japanese, which seems to be the main  
use for UAO.)

For ease of reference, I've added both the binary and all extracted files  
to big5.git. Among other interesting things, it contains a B52UC.TXT  
mapping [5] that I assume exactly matches the implementation. Comparing  
that mapping to Firefox's Big5 revealed that they are the same except for 10  
purely cosmetic differences.[6] This means that the problem with bogus  
Chinese characters [7] affects the original UAO as well, and I checked a  
few sites in IE to verify that this is the case.
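
A comparison like the one above can be sketched roughly as follows. This is not the script that was actually used. It assumes each mapping file has one entry per line with two whitespace-separated hex fields (Big5 code, then Unicode code point) and optional `#` comments, which may not match the real layout of B52UC.TXT. The names parse_mapping and diff_mappings are hypothetical.

```python
def parse_mapping(text):
    """Parse lines such as '0xA140 0x3000' into {big5_code: code_point}.

    Assumed format: two hex fields per line, '#' starts a comment.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # skip entries with no Unicode value
        mapping[int(fields[0], 16)] = int(fields[1], 16)
    return mapping


def diff_mappings(a, b):
    """Return (code, a_value, b_value) for every Big5 code where a and b differ.

    A value of None means the code is missing from that mapping.
    """
    diffs = []
    for code in sorted(set(a) | set(b)):
        if a.get(code) != b.get(code):
            diffs.append((code, a.get(code), b.get(code)))
    return diffs
```

The length of the list returned by diff_mappings is the "number of differences" between two mappings. Whether a given difference is cosmetic still has to be judged by hand.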

I have no idea how widespread Unicode 補完計畫 once was, but given the  
crazy steps to make it work in Windows 7 I'd be very surprised if it's  
still relevant. I'm skeptical that the UAO content can be automatically  
detected and am not a fan of locale-specific defaults, so I won't be  
suggesting any spec changes to support UAO. Perhaps someone else will...

As for Big5-2003, I've compared it with some other mappings: [8]

2003 vs Opera: 5826 differences
2003 vs Chrome: 455 differences
2003 vs Firefox: 5803 differences
2003 vs IE: 484 differences
2003 vs Spec: 5977 differences

In other words, it's not very close to what any browser supports. To make  
matters worse, 5809 of the mappings use PUA, so it would be lots of work  
to figure out a proper mapping. All in all, it doesn't seem worth even  
trying with Big5-2003, since it has never worked on the Web.
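
For illustration, counting the PUA mappings could look something like the sketch below. It assumes the mapping has already been loaded into a dict from Big5 code to Unicode code point. How the Big5-2003 table is actually stored is not covered here, and the names is_pua and count_pua are hypothetical.

```python
# Unicode Private Use Areas: one in the BMP plus planes 15 and 16.
PUA_RANGES = (
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
)


def is_pua(code_point):
    """Return True if code_point lies in a Unicode Private Use Area."""
    return any(lo <= code_point <= hi for lo, hi in PUA_RANGES)


def count_pua(mapping):
    """Count entries of {big5_code: code_point} whose target is in the PUA."""
    return sum(1 for cp in mapping.values() if is_pua(cp))
```

Each entry counted this way needs a manual decision about which real Unicode character, if any, it should map to instead.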

[1] http://www.cpatch.org/thread-6377-1-1.html
[3] http://heartfullmoon.blogspot.se/2009/10/windows-7-unicode-250.html
[7] http://lists.w3.org/Archives/Public/public-html-ig-zh/2012Apr/0046.html

Philip Jägenstedt
Core Developer
Opera Software
Received on Sunday, 22 April 2012 20:01:09 UTC
