[whatwg] Parsing: comment tokenization

On Sat, 7 Apr 2007, Anne van Kesteren wrote:
>
> The tokenization section should also handle:
> 
>  <!-->
>  <!--->
> 
> as "correct" comments for compat with the web. This means that
> 
>  <!-->-->
> 
> shows "-->" and that
> 
>  <!--->-->
> 
> shows "-->".

These comments are now handled (though they are not conformant).
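
To make the behaviour above concrete, here is a rough sketch in Python. It 
is only an illustration, not the spec's tokenizer state machine: the 
function name split_leading_comment is hypothetical, it only looks at a 
comment at the very start of the input, and treating an unterminated 
comment as running to the end of the input is an assumption of this sketch.

```python
from __future__ import annotations


def split_leading_comment(markup: str) -> tuple[str, str]:
    """Return (comment_data, remaining_markup) for input starting with '<!--'.

    Hypothetical helper for illustration; not taken from the spec text.
    """
    opener = "<!--"
    if not markup.startswith(opener):
        raise ValueError("input does not start with '<!--'")
    rest = markup[len(opener):]

    # Compat cases: '<!-->' and '<!--->' end the comment immediately,
    # yielding an empty comment.
    for short_close in (">", "->"):
        if rest.startswith(short_close):
            return "", rest[len(short_close):]

    # Ordinary case: the comment data runs up to the first '-->'.
    end = rest.find("-->")
    if end == -1:
        # Assumption of this sketch: an unterminated comment swallows
        # everything up to the end of the input.
        return rest, ""
    return rest[:end], rest[end + len("-->"):]


if __name__ == "__main__":
    print(split_leading_comment("<!-->-->"))       # ('', '-->')
    print(split_leading_comment("<!--->-->"))      # ('', '-->')
    print(split_leading_comment("<!-- hi -->x"))   # (' hi ', 'x')
```

In both of the quoted examples the comment comes out empty and the 
trailing "-->" is left over as text, which is why it shows up on the page.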


On Sat, 7 Apr 2007, Nicholas Shanks wrote:
> 
> Why on earth is this a good idea?

IE7 does it. The assumption is that content therefore depends on it.

> AFAIK browsers and other HTML clients don't currently treat these as 
> comments

This seems to disagree with my research.


> [...] compelling them to do so will cause several problems:
> 
> 1) Web developers currently expect things like <!-->5?--> to result in 
> the comment "greater than five?". Changing such expectations on a whim 
> is harmful.

It is not clear to me that this is indeed true.


> 2) A double HYPHEN-MINUS delimits comments within tags, this provides 
> compatibility with XML and SGML and changing this needlessly in HTML5 
> will just complicate conversion.

This, unfortunately, is impractical. (I say this despite having personally 
pushed for this for years.)


> 3) You claim "compat with the web" but don't provide any evidence to 
> support that. Are there huge numbers of sites expecting <!--> to 
> represent a comment without content? Can such sites not be fixed instead 
> of polluting HTML with additional rules? I'd rather have a handful of 
> broken sites that their authors will fix than saying to the other 99% of 
> authors "hey, you can now do this" and ending up with millions of broken 
> sites. (I say broken, because they will not be backwards compatible with 
> current or previous UAs)

It seems that they will in fact be compatible; but I agree, we shouldn't 
encourage it. The spec makes them non-conforming.


On Sat, 7 Apr 2007, Nicholas Shanks wrote:
>
> Even you must (begrudgingly?) admit that "comments" formatted as in your 
> original post are not backwards compatible, even if they do reflect the 
> state of modern UAs as you say.

How can both those statements be true?


> I don't believe I am 'pretending' anything. Just stating that diverging 
> further from SGML for No Good Reason is pointless. (And yes, supporting 
> a few odd websites that do this already counts as not a Good Reason, 
> websites can always be fixed!)

Sadly, Web sites can't always be fixed. Many sites have been long 
abandoned and are no longer updated.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 19 June 2007 01:41:59 UTC