- From: Kang-Hao (Kenny) Lu <kennyluck@w3.org>
- Date: Wed, 06 Jun 2012 10:49:10 +0800
- To: W3C HTML5 中文興趣小組 <public-html-ig-zh@w3.org>
Sogl (Twitter @jackmasa), a member of the QQ group, once asked in the group how "<!>" is parsed in HTML. At the time we found that the HTML specification says:

# Consume every character up to and including the first U+003E
# GREATER-THAN SIGN character (>) or the end of the file (EOF),
# whichever comes first. Emit a comment token whose data is the
# concatenation of all the characters starting from and including the
# character that caused the state machine to switch into the bogus
# comment state, up to and including the character immediately before
# the last consumed character (i.e. up to the character just before
# the U+003E or EOF character), but with any U+0000 NULL characters
# replaced by U+FFFD REPLACEMENT CHARACTER characters. (If the
# comment was started by the end of the file (EOF), the token is
# empty.)

The problematic part is this:

# Emit a comment token whose data is the concatenation of all the
# characters starting from and including the character that caused
# the state machine to switch into the bogus comment state, up to and
# including the character immediately before the last consumed
# character.

(That is, the data of the emitted comment token runs from "the character that caused the state machine to switch into the bogus comment state" to "the character immediately before the last consumed character".)

In the <!> example, "the character that caused the state machine to switch into the bogus comment state" is ">" (had it been "--", the tokenizer would have switched to the regular comment state instead), and "the character immediately before the last consumed character" is "!". The "from ... to" wording is somewhat problematic here, since the range ends before it starts, but common sense says the result is the empty string.

Unfortunately, the explanation in the final parentheses:

# (If the comment was started by the end of the file (EOF), the token
# is empty.)

mentions the "<!(EOF)" case but not "<!>". So Sogl filed a bug report [1] asking for a correction, and Hixie eventually changed the text to:

| (If the comment was started by the end of the file (EOF), the token
| is empty. Similarly, the token is empty if it was generated by the
| string "<!>".)

Hixie also asked whether Sogl wanted to be listed among the contributors, and Sogl was later added :)

As for the earlier contributions by 風之石 and winter [2], if they are interested, they could also consider contacting Hixie privately about being added to the contributors list.

[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=15634
[2] http://www.w3.org/html/ig/zh/wiki/Contributions#.E9.94.99.E8.AF.AF.E5.9B.9E.E6.8A.A5

That's all,
Kenny
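
For illustration, the bogus comment rule quoted above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not how any browser implements it: the helper name `bogus_comment_data` and its `start` parameter (the index of the character that caused the switch into the bogus comment state) are hypothetical and do not appear in the specification.

```python
from typing import Tuple

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def bogus_comment_data(text: str, start: int) -> Tuple[str, int]:
    """Hypothetical helper: apply the quoted bogus comment rule.

    `start` is the index of the character that caused the switch into the
    bogus comment state (len(text) if the switch was caused by EOF).
    Returns (comment data, index where tokenizing would resume).
    """
    end = text.find(">", start)   # first U+003E at or after `start`
    if end == -1:                 # no ">": consume everything up to EOF
        end = resume = len(text)
    else:                         # the ">" is consumed but is not part of the data
        resume = end + 1
    # Data stops just before the last consumed character; NULs are replaced.
    data = text[start:end].replace("\x00", REPLACEMENT)
    return data, resume


if __name__ == "__main__":
    print(bogus_comment_data("<!>", 2))        # ('', 3)
    print(bogus_comment_data("<!", 2))         # ('', 2)
    print(bogus_comment_data("<!foo>bar", 2))  # ('foo', 6)
```

For "<!>" the switching character is the ">" at index 2, so the slice from `start` to `end` is empty. This matches the common-sense reading described above and the clarification that was added to the specification.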
Received on Wednesday, 6 June 2012 02:49:38 UTC