Re: Taiwan and Hong Kong: Big5 HKSCS vs UAO analysis and conclusions

On Sat, 21 Apr 2012 22:26:13 +0200, Ambrose LI <ambrose.li@gmail.com>  
wrote:

> One comment first: 亂碼 are not “random characters”; they are most often
> the symptom of an encoding or decoding failure, so while I have not
> tried to verify Kenny’s results, I am in complete agreement that how
> he attacked the problem is the correct way. (I used to have to do this
> on several occasions, and the way I did it was no different than how
> Kenny has done it.)

You are of course right. Perhaps I've used 亂碼 too broadly: I've used it
to mean something like "unrecoverable misencoding or typo". The comment
about "random characters" was meant from the user's point of view. Of
course the "迳" in <http://www.wintan.com.tw/service_06_08.htm> is not the
result of any random process, just an encoding mismatch.
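
To make the "not random" point concrete, here is a minimal Python sketch.
The sample text and the wrong codec (latin-1) are my own assumptions for
illustration; they are not the actual bytes or encodings involved on the
page above. It only shows that a mismatch is deterministic and, when no
bytes are lost, reversible:

```python
# Illustrative only: the sample string and the "wrong" codec are assumptions,
# not taken from the page discussed in this thread.
text = "中文"

# Bytes as a Big5-encoded page would serve them.
raw = text.encode("big5")

# Decoding with the wrong encoding gives garbage, but the same garbage
# every time: it is a deterministic function of the bytes.
garbled = raw.decode("latin-1")

# Because latin-1 maps every byte to exactly one character, the wrong step
# can be undone and the bytes decoded again with the right encoding.
recovered = garbled.encode("latin-1").decode("big5")

print(repr(garbled), recovered)
```

With a lossy wrong decoder (one that replaces or drops bytes) the last
step would no longer work, which is closer to what I meant by
"unrecoverable".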

> 2012/4/21 Philip Jägenstedt <philipj@opera.com>:
> [...]
>> What should the Big5 mapping be? If it is like the conservative Big5
>> that Opera currently supports, that really won't help Taiwan sites and
>> users at all. What Firefox does is also not that great, so it would
>> have to be a new mapping that no browser has ever supported so far.
>
> Personally speaking, I’d say that Big5 has always been a mess and it
> is still a mess, and the only sane way to solve this problem is to
> expose the underlying variants of Big5 in the encoding selection menu.
> Even if some sort of statistical AI technique were used there will
> still be occasions where what the machine chooses will be wrong. Just
> let the user choose if something doesn’t work.

Right, that will likely be required if we end up with more than one Big5  
variant.
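
As a rough sketch of what "let the user choose" amounts to, assuming
Python's built-in codecs as stand-ins for the variants (they do not
include UAO, and they are not the mappings any browser ships), one could
decode the same bytes under each candidate and present the results. The
names CANDIDATES and decode_with_each are hypothetical:

```python
# Hypothetical helper: the candidate list uses Python's built-in codecs as
# stand-ins for Big5 variants; it is not a browser's actual encoding menu.
CANDIDATES = ["big5", "cp950", "big5hkscs"]

def decode_with_each(raw, candidates=CANDIDATES):
    """Decode raw bytes under every candidate; None marks a failed decode."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

# Text from the common core decodes identically under every variant;
# the variants only disagree on the extension ranges.
for name, decoded in decode_with_each("中文".encode("big5")).items():
    print(name, decoded)
```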

-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Sunday, 22 April 2012 07:41:26 UTC