[whatwg] Comment Syntax and Parsing

On Mon, 23 Jan 2006, Lachlan Hunt wrote:
> 
> Well, for what it's worth, I still don't think you were being stupid, I think
> you were right all along and had this been implemented by more than just
> Mozilla 7 years ago, the result may have been different.

Authors find the "--" delimiter rule unbelievably confusing.

Why does:

  <!-- Hello
    -- World
    -- How does <comment> work?
    -- I don't know.
    -- Do you?
    -->

...work, but this:

  <!-- Hello World
    -- How does <comment> work?
    -- I don't know.
    -- Do you?
    -->

...or this:

  <!-- Hello
    -- World
    -- How does <comment> work?
    -- I don't know. Do you?
    -->

...not? Authors just don't get it.

It makes more sense when you have draconian error handling, but HTML 
doesn't.
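
To make the three cases above concrete, here is a minimal Python sketch of
the SGML-style reading, under a simplifying assumption: every "--" after
"<!" toggles between inside and outside a comment, and a ">" ends the
declaration only when it is seen outside a comment. The function name
sgml_declaration_end is made up for illustration; this is a rough model of
the rule, not any browser's actual code.

```python
def sgml_declaration_end(text, start=0):
    """Return the index just past the '>' that ends the declaration.

    Simplified model (an assumption, not a real parser): each "--" toggles
    the in-comment state, and '>' only terminates when outside a comment.
    Returns None if the declaration never closes.
    """
    if not text.startswith("<!--", start):
        raise ValueError("expected '<!--' at start")
    i = start + 2          # skip "<!", so the first "--" opens a comment
    in_comment = False
    while i < len(text):
        if text.startswith("--", i):
            in_comment = not in_comment
            i += 2
        elif text[i] == ">" and not in_comment:
            return i + 1
        else:
            i += 1
    return None


works = ("<!-- Hello\n  -- World\n  -- How does <comment> work?\n"
         "  -- I don't know.\n  -- Do you?\n  -->")
ends_early = ("<!-- Hello World\n  -- How does <comment> work?\n"
              "  -- I don't know.\n  -- Do you?\n  -->")
never_closes = ("<!-- Hello\n  -- World\n  -- How does <comment> work?\n"
                "  -- I don't know. Do you?\n  -->")

for sample in (works, ends_early, never_closes):
    end = sgml_declaration_end(sample)
    print("unclosed" if end is None else repr(sample[:end]))
```

Under this model the first example closes at its final ">", the second
ends early at the ">" of "<comment>", and the third never closes, because
its last "--" opens a new comment instead of closing one.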


> [...] all of those vendors have unanimously voted against implementing 
> proper comment handling in favour of quirks-mode-style parsing, there 
> really isn't a choice in the matter.

(What HTML5 says isn't really quirks mode comment parsing, it's even 
simpler.)


> > Probably the same as XML. Or maybe just "<!--" followed by zero or 
> > more characters other than U+0000, followed by "-->".
> 
> I vote for keeping it very similar to XML, it'll be easier for authors 
> only having to learn and remember one comment syntax.

Plus CSS's. Plus JavaScript's. So three syntaxes, at least.

...and this is assuming they'll ever use XML.


> > Yeah. The question is do we really want to confuse people by telling 
> > them that their comment is invalid when they write:
> > 
> >    <!----------------------------->
> 
> Yes, for backwards compatibility reasons.

Fair enough. We can always allow it later.


> Another question is, do we wish to continue allowing white space like this:
> <!-- comment --   >
> 
> I believe it's supported by all browsers without any difficulty

Actually, it isn't. In most browsers that I tested, the above gets treated 
as an unclosed comment, which is then re-parsed in "close at first >" mode. 
Since we're dropping the re-parse mode (see earlier mails), support for the 
whitespace goes away with it.

You can test whether or not it's really supported by comparing these:

   <!-- > --> --> EOF
   <!-- > -- > --> EOF
   <!-- > --> EOF
   <!-- > -- > EOF

...in my script:

   http://software.hixie.ch/utilities/js/live-dom-viewer/
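
For an offline approximation, here is a small Python sketch of the simple
rule quoted earlier in this mail ("<!--" followed by zero or more
characters, followed by "-->"), applied to the same four strings. The
helper name split_first_comment is hypothetical, and what happens to an
unclosed comment is deliberately left out; this shows what the simple rule
would do, not what any particular browser does.

```python
def split_first_comment(text):
    """Split text at the first comment under the simple rule.

    Returns (comment_body, rest). If there is no "<!--" or no later "-->",
    returns (None, text); recovery for unclosed comments is out of scope.
    """
    start = text.find("<!--")
    if start == -1:
        return None, text
    # Search for "-->" only after the opening "<!--" has been consumed.
    end = text.find("-->", start + 4)
    if end == -1:
        return None, text
    return text[start + 4:end], text[end + 3:]


cases = [
    "<!-- > --> --> EOF",
    "<!-- > -- > --> EOF",
    "<!-- > --> EOF",
    "<!-- > -- > EOF",
]
for case in cases:
    print(repr(case), "->", split_first_comment(case))
```

With this rule, "-- >" with whitespace never closes a comment: the second
string only ends at its later "-->", and the fourth is left unclosed.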

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Sunday, 22 January 2006 21:14:11 UTC