Re: Taiwan and Hong Kong: Big5 HKSCS vs UAO analysis and conclusions

From: Ambrose LI <ambrose.li@gmail.com>
Date: Sat, 21 Apr 2012 16:26:13 -0400
Message-ID: <CADJvFOWwG2e3Pf6zGkpzsinrnEeG_JOOjHmin=KbT0yMt3+ELw@mail.gmail.com>
To: Philip Jägenstedt <philipj@opera.com>
Cc: Yuan Chao <yuanchao@gmail.com>, "Kang-Hao (Kenny) Lu" <kennyluck@csail.mit.edu>, Chinese HTML Interest Group <public-html-ig-zh@w3.org>
One comment first: 亂碼 (garbled text, or mojibake) is not “random
characters”; it is most often the symptom of an encoding or decoding
failure. So while I have not tried to verify Kenny’s results, I
completely agree that the way he attacked the problem is the correct
one. (I have had to do this on several occasions, and the way I did it
was no different from how Kenny did it.)
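
To make the point concrete, here is a minimal sketch (not taken from
Kenny's analysis) of how garbled text arises from a decoding mismatch.
It assumes Python's built-in "big5" and "latin-1" codecs; the sample
string and the choice of the wrong decoder are illustrative only.

```python
# Illustrative assumption: a Big5 page is misread as Latin-1.
original = "亂碼"

raw = original.encode("big5")    # the bytes a Big5 page would actually contain
garbled = raw.decode("latin-1")  # wrong decoder: each byte becomes one character

# The result is deterministic, not random, so the mistake can be undone
# exactly by reversing the wrong step and applying the right decoder.
recovered = garbled.encode("latin-1").decode("big5")

print(repr(garbled))
print(recovered == original)
```

Because the wrong output is a fixed function of the bytes and the
decoder, comparing garbled output against candidate decoders is a
reasonable way to work out which mapping a page really used.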

2012/4/21 Philip Jägenstedt <philipj@opera.com>:
[...]
> What should the Big5 mapping be? If it is like the conservative Big5 that
> Opera currently supports, that really won't help Taiwan sites and users at
> all. What Firefox does is also not that great, so it would have to be a new
> mapping that no browser has ever supported so far.

Personally speaking, I’d say that Big5 has always been a mess and
still is, and the only sane way to solve this problem is to
expose the underlying variants of Big5 in the encoding selection menu.
Even if some sort of statistical AI technique were used, there would
still be occasions where the machine’s choice is wrong. Just
let the user choose if something doesn’t work.
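
As a rough sketch of what "let the user choose" could look like, the
snippet below decodes the same bytes with several Big5 variants and
returns every result so a person can pick the one that reads
correctly. The candidate list and the function name are assumptions
for illustration; the codec names are the ones Python ships, and UAO
is not among them as far as I know.

```python
from typing import Dict, Optional

# Assumed shortlist of Big5 variants available as built-in Python codecs.
CANDIDATES = ("big5", "big5hkscs", "cp950")

def decode_with_variants(raw: bytes) -> Dict[str, Optional[str]]:
    """Decode raw with each candidate; None marks a variant that failed."""
    results: Dict[str, Optional[str]] = {}
    for name in CANDIDATES:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

# Plain Big5 text decodes the same way under all three variants.
sample = "中文".encode("big5")
for name, text in decode_with_variants(sample).items():
    print(name, "->", text)
```

Where the variants disagree, or one fails outright, the menu entry
that yields readable text is the one the user would select.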

> --
> Philip Jägenstedt
> Core Developer
> Opera Software
>



-- 
cheers,
-ambrose <http://gniw.ca>
Received on Saturday, 21 April 2012 20:26:46 UTC