Re: About Big5 on PTT

Hi all,

I am afraid I cannot answer all of these questions, but I will try my
best.

On April 13, 2012 at 4:44 PM, Kang-Hao (Kenny) Lu <> wrote:
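> Hello 挺宇,
> Anne van Kesteren of Opera has recently been writing a specification
> on how browsers should handle encodings [1].
> Philip Jägenstedt of Opera (阿菲, who is on the Cc list) and Anne have
> raised the possibility of having browsers decode content labeled
> <meta charset=big5> as 'big5-hkscs'. However, because 'big5-hkscs' is
> incompatible with the "Unicode 補完計畫" (hereafter 'big5-uao'),
> several friends from Taiwan said on the W3C HTML5 Chinese Interest
> Group mailing list [2] that they do not support this idea. PTT is one
> of the largest users of 'big5-uao', so I have a few questions for you
> and would like to hear your thoughts. I would also appreciate it if
> you could provide some data: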

PTT uses big5-uao mainly for historical reasons.
The main protocol for accessing PTT is telnet, not the web, and the
most widely used client is PCMan, which has built-in big5-uao support.

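> 1. Philip would like to ask whether you could provide "all articles
> that decode differently under 'big5-hkscs' and 'big5-uao'" [3]?
> The differences between 'big5-hkscs' and 'big5-uao' are tabulated in
> [4]. If producing "all" of them is too difficult, here are some
> examples that differ. Could you help run a full-text search to see on
> which boards these byte sequences appear: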

I am not able to do this because the system is heavily loaded; running
such a full-text search and decoding would cause performance problems.
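
For illustration only, the kind of byte-level scan being asked for
could be sketched as follows in Python. This is a minimal sketch, not
PTT code: the names `UAO_SAMPLES` and `find_aligned` are hypothetical,
the table holds only the big5-uao byte pairs quoted below, and the
scan assumes that any byte >= 0x81 starts a two-byte character. It
walks the data character by character, so a match is reported only
when it starts on a character boundary.

```python
from typing import Dict, Iterator, Tuple

# big5-uao byte pairs from the examples quoted in this thread
# (illustrative subset only, not a complete table).
UAO_SAMPLES: Dict[bytes, str] = {
    b"\x92\xa8": "恵",
    b"\x94\x46": "咲",
    b"\x92\xd4": "実",
    b"\xa0\x41": "嘅",
    b"\x95\x4d": "着",
}


def find_aligned(data: bytes, needles: Dict[bytes, str]) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, pair) for each needle that starts on a character boundary.

    Assumption: every byte >= 0x81 is the lead byte of a two-byte character.
    """
    i = 0
    while i < len(data):
        if data[i] >= 0x81 and i + 1 < len(data):
            pair = data[i:i + 2]
            if pair in needles:
                yield i, pair
            i += 2  # skip lead and trail byte together
        else:
            i += 1  # ASCII byte, or a lone lead byte at the end


if __name__ == "__main__":
    article = b"abc\x92\xa8xyz\x94\x46"
    for offset, pair in find_aligned(article, UAO_SAMPLES):
        print(offset, pair, UAO_SAMPLES[pair])
```

Even a scan this simple has to read every stored article once, which
is where the load concern comes from.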

> == Japanese ==
> U+6075 恵 (as in 釘宮理恵)
> big5-uao: \x92\xa8
> big5-hkscs: \x93\x7a
> U+54b2 咲 (as in 天才麻將少女)
> big5-uao: \x94\x46
> big5-hkscs: \x83\x5a
> U+5b9f 実 (as in 真実)
> big5-uao: \x92\xd4
> big5-hkscs: \x89\x63
> == Hong Kong Chinese ==
> U+5605 嘅
> big5-uao: \xa0\x41
> big5-hkscs: \x9d\xef
> U+7740 着
> big5-uao: \x95\x4d
> big5-hkscs: \xfe\xd3
> The main point here is also to find out whether anyone on PTT uses
> 'big5-hkscs', especially on the Hong Kong-related boards.
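> 2. Philip would like to know what you think about the fact that "the
> Japanese characters on  cannot be displayed in any browser other than
> Firefox"?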

I am not sure what Firefox is doing; it seems to support part of UAO.
We plan to switch the web interface to UTF-8, after which this issue
should no longer exist.

> ggm told me that "ssh" has a UTF-8 version, so isn't it quite
> feasible to switch  to UTF-8? Why not do that?

As mentioned above, we plan to switch to UTF-8, but some work remains
to be done.
First, the "bbsu" mode is simply a translation table (big5-uao to
UTF-8) placed in front of the program's I/O, and it was written by a
single person. The web interface is a separate project, so the two
efforts would have to be merged.
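
To make the "translation table in front of the I/O" idea concrete,
here is a minimal Python sketch. It is not the actual "bbsu" code: the
names `UAO_TABLE` and `uao_to_utf8` are hypothetical, the table
contains only the big5-uao byte pairs quoted in this thread, and
Python's built-in "big5" codec is assumed as a stand-in for the
standard part of the mapping.

```python
# Illustrative subset of a big5-uao -> Unicode table (byte pairs quoted
# in this thread). A real table would cover the whole UAO mapping.
UAO_TABLE = {
    b"\x92\xa8": "恵",
    b"\x94\x46": "咲",
    b"\x92\xd4": "実",
    b"\xa0\x41": "嘅",
    b"\x95\x4d": "着",
}


def uao_to_utf8(data: bytes) -> bytes:
    """Translate a complete big5-uao buffer to UTF-8 (sketch only).

    Assumption: every byte >= 0x81 is the lead byte of a two-byte character.
    """
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x81 or i + 1 >= len(data):
            # ASCII passes through; anything else becomes U+FFFD.
            out.append(chr(byte) if byte < 0x80 else "\ufffd")
            i += 1
            continue
        pair = data[i:i + 2]
        if pair in UAO_TABLE:
            out.append(UAO_TABLE[pair])  # UAO-specific character
        else:
            # Fall back to the stock Big5 codec for everything else.
            out.append(pair.decode("big5", errors="replace"))
        i += 2
    return "".join(out).encode("utf-8")


if __name__ == "__main__":
    raw = "PTT 中文 ".encode("big5") + b"\x92\xa8"
    print(uao_to_utf8(raw).decode("utf-8"))
```

A filter on a live connection would also have to cope with a two-byte
character split across two reads, which this buffer-at-a-time sketch
ignores.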

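> (I noticed that these pages show up as square boxes in Google search
> results, for example [5]. So however this question turns out,
> wouldn't it be better to fix this?)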

I agree with you, but it will take time to fix.

> 3. What do you think about the question of "how a browser should
> handle <meta charset=big5>"?

The browser should use the most standard Big5 (CNS 11643), and may
notify the user if the document is suspected of using another
character set (that is, when a decoding error is detected).
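
The behaviour described above, decoding strictly and flagging the
document when an error occurs, could look roughly like this in Python.
This is only a sketch under stated assumptions: `decode_big5_strict`
is a hypothetical name, and Python's built-in "big5" codec stands in
for whatever "standard Big5" table a browser would actually ship.

```python
from typing import Tuple


def decode_big5_strict(data: bytes) -> Tuple[str, bool]:
    """Return (text, suspect).

    suspect is True when strict Big5 decoding failed, which hints that
    the document may really use an extension such as big5-uao or
    big5-hkscs. In that case the text is decoded with replacement
    characters so that something can still be shown.
    """
    try:
        return data.decode("big5"), False
    except UnicodeDecodeError:
        return data.decode("big5", errors="replace"), True


if __name__ == "__main__":
    samples = ["中文".encode("big5"), b"\x92\xa8"]  # second one is big5-uao only
    for sample in samples:
        text, suspect = decode_big5_strict(sample)
        print(repr(text), "suspect" if suspect else "ok")
```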

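> 4. Do you have any idea how many people in Taiwan have Unicode 補完
> installed? (Not counting the conversion tables built into BBS
> clients.) Could you also run a full-text search on PTT for the term
> "Unicode 補完" to get a sense of the current trend? :p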

I think very few people have it installed. Most websites use UTF-8
now, and users want things to work out of the box.
I searched Google for "Unicode 補完" and found many old posts, so I
would say that almost nobody is installing it now.

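> p.s. Thanks to ggm for the introduction.
> p.s.2 This email will be archived at [6]. If you don't mind, you are
> welcome to join the Web standards discussions, or to help translate
> the specifications [7]. :)
> p.s.3 Philip can read and write Chinese, but you may also write in
> English.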
> [1]
> [2]
> [3] See the conversation log
> [4]
> [5]
> [6]
> [7]
> Regards,
> Kenny

Robert Wang

Received on Friday, 27 April 2012 10:32:54 UTC