- From: Philip Taylor <excors@gmail.com>
- Date: Sun, 21 Mar 2010 21:09:10 +0000
- To: Herman Venter <hermanv@microsoft.com>
- Cc: "public-html-comments@w3.org" <public-html-comments@w3.org>
On Sat, Mar 20, 2010 at 9:17 PM, Herman Venter <hermanv@microsoft.com> wrote:
> Hi
>
> I’m working on a prototype HTML5 parser for research purposes and recently
> bumped into this little bit of markup:
>
> <SCRIPT type=text/javascript><!-- site js --></SCRIPT>
>
> Trawling around the Web it seems that the expectation is that the script
> engine will ignore a (first?) line starting with <!--.
>
> The EcmaScript standard does not provide for this (at least not the last
> time I’ve read through it), so I thought perhaps the issue will be addressed
> in the new HTML5 standard.

http://wiki.whatwg.org/wiki/Web_ECMAScript#HTML_comments lists some
details of this. I think the idea is it has to be handled by the
scripting language spec (not by HTML5), e.g. changing the <script
type> in Firefox can change it from interpreting the <!--...--> as
comments to interpreting them as literal XML comment syntax in E4X,
and the <!-- comment thing works in external .js files too, and so if
ECMAScript doesn't specify this then it's a bug in ECMAScript.

> However, looking at the latter, I find myself hopelessly confused about the
> meaning of the “Script data escape start state” that is entered when <! is
> encountered inside a script tag body.
>
> As best as I can make things out, the end result is that the HTML comment in
> the above example is just passed through to the script engine, which leaves
> the question of what the script engine should do with the non compliant
> syntax dangling in the air.

It should always pass the text through unchanged - the purpose of this
is to handle cases like

  <script><!--
  document.write("<script>alert(1)</script>");
  alert(2);
  // --></script>

where the inner </script> is 'escaped' by the <!-- so that it doesn't
close the outer script element. The parsing algorithm has become
extremely complicated in order to maximise compatibility with legacy
content while avoiding the reparsing behaviour that most current
implementations have.
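To make the escaping a bit more concrete, here is a rough sketch of how a parser might decide where a script element ends. It is only a simplified model, assuming three modes (plain, escaped after <!--, and double-escaped after a nested <script inside the escaped part), and is not the tokenizer algorithm from the spec. The function name findScriptEnd and the helper tagAt are made up for illustration.

```javascript
// Simplified model (an assumption, not the spec's tokenizer): find the index
// of the "</script" that really closes a script element, given the text that
// follows the opening <script> tag. Returns -1 if no closing tag is found.
function findScriptEnd(text) {
  const lower = text.toLowerCase();
  let mode = "data"; // "data" | "escaped" | "doubleEscaped"
  let i = 0;

  // True if `prefix` occurs at `at` and is followed by whitespace, "/" or ">".
  const tagAt = (prefix, at) => {
    if (!lower.startsWith(prefix, at)) return false;
    const next = lower[at + prefix.length];
    return next !== undefined && /[\s/>]/.test(next);
  };

  while (i < text.length) {
    if (mode === "data" && lower.startsWith("<!--", i)) {
      mode = "escaped";
      i += 2; // skip only "<!" so the dashes can still be part of a "-->"
    } else if (mode !== "data" && lower.startsWith("-->", i)) {
      mode = "data"; // the escape ends here
      i += 3;
    } else if (mode === "escaped" && tagAt("<script", i)) {
      mode = "doubleEscaped"; // a nested <script> inside the escaped part
      i += 7;
    } else if (mode === "doubleEscaped" && tagAt("</script", i)) {
      mode = "escaped"; // this </script> only pairs with the nested one
      i += 8;
    } else if (mode !== "doubleEscaped" && tagAt("</script", i)) {
      return i; // this one really closes the element
    } else {
      i += 1;
    }
  }
  return -1;
}

// With the <!-- wrapper, the inner </script> is skipped.
const escaped =
  '<!--\ndocument.write("<script>alert(1)</script>");\nalert(2);\n// --></script>';
console.log(findScriptEnd(escaped) === escaped.lastIndexOf("</script>")); // true

// Without it, the first </script> (inside the string) closes the element.
const bare = 'document.write("<script>alert(1)</script>");</script>';
console.log(findScriptEnd(bare) === bare.indexOf("</script>")); // true
```

Either way the text itself is handed to the script engine unchanged; the modes only affect where the element is considered to end.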
http://wiki.whatwg.org/wiki/CDATA_Escapes has some of the earlier
attempts at finding a solution.

> I’m also left bemused by the purpose of the “Script data escape start
> state”. Some non-normative text that explains the rationale behind this
> state, amplified by some concrete examples, would be a great improvement to
> the standard.

This is definitely an area where greater clarity would be nice!

> Sincerely
>
> Herman Venter

--
Philip Taylor
excors@gmail.com
Received on Sunday, 21 March 2010 21:11:58 UTC