[whatwg] Unterminated comments in <textarea>, <script>, <style>, <title>, and <iframe>

Recently, I've been testing how browser parsers handle unterminated
comments (a <!-- with no matching -->) inside certain elements.
Internet Explorer 7, Firefox 3, Safari 3.1, and Opera 9.5 agree on the
following cases:

http://crypto.stanford.edu/~abarth/research/html5/comments/open-textarea.html
http://crypto.stanford.edu/~abarth/research/html5/comments/open-script.html
http://crypto.stanford.edu/~abarth/research/html5/comments/open-style.html

Essentially, they treat the <!-- as if it did not start a comment.
Ian pointed out on IRC that this might be a security vulnerability
because the result of parsing the stream depends on whether the parser
hung or terminated at the end of the stream.  (If the parser had hung,
it would be awaiting more characters for the comment.)
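
A minimal Python sketch of the two outcomes may make the difference
concrete. It is an illustration only, not how any of the browsers above
are implemented: the function name, the reparse_unterminated flag, and
the sample input are all hypothetical, and a real parser also has to
deal with attributes, case-insensitive tag names, and so on.

```python
def element_text(html, tag, reparse_unterminated):
    """Return the text content of the first <tag> element in html.

    A "<!--" ... "-->" pair inside the element hides any close tag
    between the two markers. The flag only controls what happens when
    the "-->" is missing. (Hypothetical helper, for illustration.)
    """
    open_tag = "<%s>" % tag
    close_tag = "</%s>" % tag
    start = html.index(open_tag) + len(open_tag)
    pos = start
    while True:
        close = html.find(close_tag, pos)
        if close == -1:
            return html[start:]        # no close tag left: rest of stream
        comment = html.find("<!--", pos)
        if comment == -1 or close < comment:
            return html[start:close]   # ordinary case: close tag ends element
        end = html.find("-->", comment + 4)
        if end == -1:
            if reparse_unterminated:
                # Behavior reported above: "<!--" is treated as text,
                # so the first close tag ends the element.
                return html[start:close]
            # Alternative: the comment swallows the rest of the stream.
            return html[start:]
        pos = end + 3                  # skip past the terminated comment


# Made-up input, not the content of the linked test pages.
sample = "<textarea>a <!-- b</textarea><p>c</p>"
print(repr(element_text(sample, "textarea", reparse_unterminated=True)))
print(repr(element_text(sample, "textarea", reparse_unterminated=False)))
```

With the flag set, the element ends at the first </textarea> and the
<p> is parsed as markup; with it cleared, the <p> ends up inside the
element's text.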

The above browsers almost agree on the behavior for <title>:

http://crypto.stanford.edu/~abarth/research/html5/comments/open-title.html

Internet Explorer 7, Firefox 3, and Opera 9.5 treat <!-- as if
it did not start a comment.  Safari 3.1 differs slightly and only uses
the portion before the <!-- as the title, but otherwise parses the
remainder of the document as if <!-- did not start a comment.

The above browsers differ in their handling of unterminated comments
for the <iframe> element:

http://crypto.stanford.edu/~abarth/research/html5/comments/open-iframe.html

Internet Explorer 7 and Safari 3.1 follow the spec and consume the
remainder of the document in the comment.  Firefox 3 and Opera 9.5
treat <!-- as if it did not start a comment.

As I understand it, browser behavior for <textarea>, <script>,
<style>, and <title> differs from the spec.  It is unclear whether
browsers will change to match the spec, especially because the
<script> element might contain <!-- sequences in string literals or
regular expressions (e.g.,
<http://crypto.stanford.edu/~abarth/research/html5/comments/open-script-in-string.html>).
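
To illustrate the string-literal concern, the Python sketch below looks
for a <!-- in a script body that has no later -->. The helper name and
the sample script are hypothetical (the sample is not the content of
the linked test page), and the check is purely textual: it does not
parse JavaScript.

```python
def first_unterminated_comment(script_text):
    """Return the offset of the first "<!--" with no later "-->",
    or None if every "<!--" is followed by a "-->".
    (Hypothetical helper, for illustration.)"""
    pos = 0
    while True:
        start = script_text.find("<!--", pos)
        if start == -1:
            return None
        end = script_text.find("-->", start + 4)
        if end == -1:
            return start
        pos = end + 3


# A made-up script body with "<!--" inside a string literal.
script_body = 'var s = "<!--"; document.title = s;'
print(first_unterminated_comment(script_body))
```

A parser that treated this <!-- as the start of a comment would never
see the script's close tag, which is why a script like this one is a
compatibility concern.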

Adam

Received on Thursday, 26 June 2008 22:12:26 UTC