- From: Robert <robertabcd@gmail.com>
- Date: Tue, 24 Apr 2012 00:01:00 +0800
- To: "Kang-Hao (Kenny) Lu" <kennyluck@csail.mit.edu>
- Cc: W3C HTML5 中文興趣小組 <public-html-ig-zh@w3.org>, Philip Jägenstedt <philipj@opera.com>, ggm <godgunman@gmail.com>
Hi all,

I am afraid I cannot answer all of these questions, but I will try my best.

On 13 April 2012 at 4:44 PM, Kang-Hao (Kenny) Lu <kennyluck@csail.mit.edu> wrote:
> Hello 挺宇,
>
> Anne van Kesteren of Opera has recently been writing a specification of
> how browsers should handle encodings [1]. Philip Jägenstedt of Opera
> (阿菲, on the Cc line) and Anne have mentioned the possibility of having
> browsers decode content labeled <meta charset=big5> as 'big5-hkscs'.
> However, because 'big5-hkscs' is incompatible with the "Unicode 補完計畫"
> (Unicode-At-On, hereafter 'big5-uao'), several friends from Taiwan said
> on the W3C HTML5 Chinese Interest Group mailing list [2] that they do
> not support this idea. PTT is one of the largest users of 'big5-uao',
> so I have some questions about what you think, and I hope you can also
> provide some data:

PTT uses big5-uao mainly for historical reasons. The main protocol for
accessing PTT is telnet, not the web, and the most widely used client is
PCMan (http://pcman.openfoundry.org/), which has built-in big5-uao support.

> 1. Philip would like to ask whether you can provide "all articles that
> decode differently under 'big5-hkscs' and 'big5-uao'" [3].
>
> The differences between 'big5-hkscs' and 'big5-uao' are tabulated in
> [4]. If producing "all" of them is too difficult, here are some examples
> that differ. Could you check whether a full-text search can find which
> boards these byte sequences appear on?

I am not able to do this because the system is heavily loaded; a full-text
search with a decoding comparison would cause performance problems.

> == Japanese ==
>
> U+6075 恵 (釘宮理恵)
>
> big5-uao: \x92\xa8
> big5-hkscs: \x93\x7a
>
> U+54b2 咲 (天才麻將少女)
>
> big5-uao: \x94\x46
> big5-hkscs: \x83\x5a
>
> U+5b9f 実 (真実)
>
> big5-uao: \x92\xd4
> big5-hkscs: \x89\x63
>
> == Hong Kong Chinese ==
>
> U+560b 嘅
>
> big5-uao: \xa0\x41
> big5-hkscs: \x9d\xef
>
> U+7740 着
>
> big5-uao: \x95\x4d
> big5-hkscs: \xfe\xd3
>
> The main point here is also to find out whether anyone on PTT uses
> 'big5-hkscs', especially on Hong Kong-related boards.
>
> 2. Philip would like to ask what you think about the fact that
> "Japanese characters on http://www.ptt.cc/ cannot be displayed in any
> browser other than Firefox".

I am not sure what Firefox is doing; it seems to support part of UAO. We
plan to switch the web interface to UTF-8, after which this issue should
no longer exist.

> ggm told me that "ssh bbsu@ptt.cc" gives a UTF-8 version, so isn't it
> quite feasible to change http://www.ptt.cc to UTF-8? Why not do so?

As mentioned above, there are plans to switch to UTF-8, but some work
remains to be done. First, the "bbsu" mode is simply a translation table
(big5-uao to UTF-8) placed in front of the program's I/O, and it was done
by a single person. The web interface is a separate project, so the two
pieces of work have to be merged.

> (I noticed that Google search results for these pages show box
> characters, for example [5]. So however this question turns out,
> wouldn't it be better to fix this?)

I agree with you, but it will take time to fix.

> 3. What do you think about the question of "how a browser should handle
> <meta charset=big5>"?

The browser should use the most standard big5, CNS11643, and may notify
the user if the document is suspected of using another character set
(upon detecting a decoding error).

> 4. Do you have a sense of how many people in Taiwan have installed
> Unicode 補完 (not counting the conversion tables built into BBS
> clients)? Could you also run a full-text search for the term
> "Unicode 補完" on PTT to get a feel for the current trend :p

I think very few people have installed it. Most websites use UTF-8 now,
and users want things to work out of the box. I searched Google for
"Unicode 補完 site:ptt.cc" and found many old posts. So I would say that
almost nobody installs it manually.

> p.s. Thanks to ggm for the introduction.
>
> p.s.2 This email will be archived at [6]. If you don't mind, you are
> welcome to join the Web standards discussions, or to help translate
> specifications [7]. :)
>
> p.s.3 Philip can read and write Chinese, but you may also write in
> English.
>
> [1]
> http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html#legacy-multi-byte-chinese-%28traditional%29-encodings
> [2]
> http://lists.w3.org/Archives/Public/public-html-ig-zh/2012Apr/thread#msg1
> [3] See the chat log at http://krijnhoetmer.nl/irc-logs/whatwg/20120412#l-569
> [4] http://moztw.org/docs/big5/
> [5]
> https://www.google.com/webhp?hl=zh-tw#hl=zh-TW&site=webhp&q=ptt.cc+C_Chat+%E5%8F%B0%E6%B9%BE%E4%BA%BA&oq=ptt.cc+C_Chat+%E5%8F%B0%E6%B9%BE%E4%BA%BA&aq=f&aqi=&aql=&gs_l=serp.3...10350l12390l4l12884l12l12l0l0l0l0l441l1276l5j6j4-1l12l0.frgbld.&bav=on.2,or.r_gc.r_pw.r_cp.,cf.osb&fp=125eff699114053d&biw=1258&bih=661
> [6] http://lists.w3.org/Archives/Public/public-html-ig-zh/2012Apr/thread
> [7] http://www.w3.org/html/ig/zh/wiki/Translation
>
> Regards,
>
> Kenny

Robert Wang
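
For anyone who wants to try the byte-sequence search from question 1 on a small sample of raw article data, here is a minimal sketch in Python. It is only an illustration, not anything PTT runs. The names DIVERGENT, iter_double_bytes and find_divergent are made up for this example, the table contains only the five byte pairs quoted in this thread, and the lead-byte range 0x81-0xFE is an assumption about how the extended Big5 variants lay out two-byte characters. The scan walks the data on character boundaries so that the trail byte of one character followed by the lead byte of the next is not reported as a match.

```python
# Byte pairs copied from the examples quoted in this thread.
# Key: character; value: (big5-uao bytes, big5-hkscs bytes).
DIVERGENT = {
    "\u6075": (b"\x92\xa8", b"\x93\x7a"),  # 恵
    "\u54b2": (b"\x94\x46", b"\x83\x5a"),  # 咲
    "\u5b9f": (b"\x92\xd4", b"\x89\x63"),  # 実
    "\u560b": (b"\xa0\x41", b"\x9d\xef"),  # 嘅
    "\u7740": (b"\x95\x4d", b"\xfe\xd3"),  # 着
}


def iter_double_bytes(data: bytes):
    """Yield (offset, pair) for each two-byte sequence in data.

    Assumption: any byte in 0x81-0xFE is a lead byte and consumes the
    byte after it; every other byte is a single-byte character.
    """
    i = 0
    n = len(data)
    while i < n:
        if 0x81 <= data[i] <= 0xFE and i + 1 < n:
            yield i, data[i:i + 2]
            i += 2
        else:
            i += 1


def find_divergent(data: bytes):
    """Return (offset, character, variant) for each listed pair found."""
    lookup = {}
    for ch, (uao, hkscs) in DIVERGENT.items():
        lookup[uao] = (ch, "big5-uao")
        lookup[hkscs] = (ch, "big5-hkscs")
    return [
        (offset, *lookup[pair])
        for offset, pair in iter_double_bytes(data)
        if pair in lookup
    ]


if __name__ == "__main__":
    sample = b"abc" + b"\x92\xa8" + b"\xa4\x40" + b"\x9d\xef"
    for offset, ch, variant in find_divergent(sample):
        print(offset, f"U+{ord(ch):04X}", variant)
```

A hit only says that the pair is how that variant spells the character. It does not prove which encoding the article was written in, since the same two bytes may map to a different character under the other variant, so the results would still need a human look at the surrounding text.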
Received on Friday, 27 April 2012 10:32:54 UTC