Sogl has been added to the HTML specification's list of contributors!

Sogl (Twitter @jackmasa), a member of our QQ group, once asked in the
group how "<!>" is parsed in HTML. Looking it up at the time, we found
that the HTML specification says:

  # Consume every character up to and including the first U+003E
  # GREATER-THAN SIGN character (>) or the end of the file (EOF),
  # whichever comes first. Emit a comment token whose data is the
  # concatenation of all the characters starting from and including the
  # character that caused the state machine to switch into the bogus
  # comment state, up to and including the character immediately before
  # the last consumed character (i.e. up to the character just before
  # the U+003E or EOF character), but with any U+0000 NULL characters
  # replaced by U+FFFD REPLACEMENT CHARACTER characters. (If the
  # comment was started by the end of the file (EOF), the token is
  # empty.)

The problematic part is this:

  # Emit a comment token whose data is the concatenation of all the
  # characters starting from and including the character that caused
  # the state machine to switch into the bogus comment state, up to and
  # including the character immediately before the last consumed
  # character.

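(That is, emit a comment token whose data runs from "the character that
caused the state machine to switch into the bogus comment state" through
"the character immediately before the last consumed character".)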

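In the <!> example, "the character that caused the state machine to
switch into the bogus comment state" is ">" (had it been "--", the
tokenizer would have entered the regular comment state instead), while
"the character immediately before the last consumed character" is "!".
The range therefore starts after the point where it ends, so the
"from ... up to" wording does not really work here, although common
sense says the result is the empty string.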

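As a rough illustration of the common-sense reading, here is a minimal
Python sketch of the quoted paragraph. It is not taken from the spec or
from any real parser: the function name bogus_comment_data and the
convention of passing the index of the switching character as `start`
are assumptions made only for this example.

```python
def bogus_comment_data(source: str, start: int) -> str:
    """Return the comment token data for a bogus comment (illustrative only).

    `start` is the index of the character that caused the switch into the
    bogus comment state. Passing start == len(source) models the case
    where the switch was caused by the end of the file (EOF).
    """
    # Consume up to and including the first '>' at or after `start`.
    end = source.find(">", start)
    if end == -1:
        # No '>' found: consume everything up to EOF instead.
        end = len(source)
    # The data stops just before the '>' (or EOF), and NULL characters
    # are replaced by U+FFFD REPLACEMENT CHARACTER.
    return source[start:end].replace("\x00", "\ufffd")


# "<!>": the switching character is the ">" at index 2, so the data is empty.
print(repr(bogus_comment_data("<!>", 2)))
# "<!" followed by EOF: start equals len(source), so the data is empty too.
print(repr(bogus_comment_data("<!", 2)))
```

With a half-open slice, both "<!>" and "<!(EOF)" fall out as the empty
string without any special case, which matches the reading above.
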
Unfortunately, the explanation in the closing parentheses:

  # (If the comment was started by the end of the file (EOF), the token
  # is empty.)

mentions only the "<!(EOF)" case, not "<!>". So Sogl filed a bug
report [1] asking for a correction, and in the end Hixie changed the
text to:

  | (If the comment was started by the end of the file (EOF), the token
  | is empty. Similarly, the token is empty if it was generated by the
  | string "<!>".)

Hixie also asked whether Sogl would like to be added to the list of
contributors, and Sogl has since been added :)


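As for the earlier contributions from 風之石 and winter [2]: if you are
interested, you might also consider contacting Hixie privately about
being added to the list of contributors.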

[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=15634
[2] http://www.w3.org/html/ig/zh/wiki/Contributions#.E9.94.99.E8.AF.AF.E5.9B.9E.E6.8A.A5


That's all.

Kenny

Received on Wednesday, 6 June 2012 02:49:38 UTC