
Re: Some bug report on markup-validator

From: Rui del-Negro <w3validator@dvd-hq.info>
Date: Tue, 23 Jan 2007 23:14:56 -0000
To: www-validator@w3.org
Message-ID: <op.tmmqu6jejf0k3w@bigbang>

>> David Håsäther <hasather@gmail.com> wrote:
>>
>> Making such a common character sequence (--), which is often used as
>> an ASCII alternative to a "long dash", illegal inside comments  
>> definitely doesn't help.
>
> The reason for this is that in SGML, a comment _declaration_ can contain  
> more than one comment (actually, zero or more).
>
> Example:
>
>    <!-- inside the first comment  --
>      -- inside the second comment -->

And how is that useful? What else would exist _outside_ the comments (but  
still _inside_ the comment declaration)?

If a "comment declaration" were defined as a single block (starting with
<!-- and ending with -->), the example you gave would look and function in  
the same way, but any number of dashes would be perfectly legal at any  
point inside the comment, as long as they weren't immediately followed by  
">".

And regardless of the (lack of) benefits from allowing "multiple comments  
inside one comment declaration", "--" is still a very poorly picked  
sequence, because it's something commonly used in "human" text, not just  
as a replacement for a long dash, but also as a separator (ex., it's  
common to see comments in C / PHP / etc. that look like this: "//  
--------------------"). But you can't do the same in SGML (unless you make  
sure the number of dashes is a multiple of four).
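
To make the two rules concrete, here is a rough Python sketch. It is only
an illustration: the regular expression is my simplified reading of the
SGML rule (not the full grammar), and both function names are made up.
The first function accepts "<!", zero or more "-- ... --" comments, then
">". The second implements the single-block rule I'm proposing above.

```python
import re

# Simplified SGML-style rule (an assumption, not the full grammar): "<!", then
# zero or more comments, then ">". Each comment is "--", text that contains
# no "--", and a closing "--", optionally followed by whitespace.
SGML_COMMENT_DECL = re.compile(r"<!(?:--(?:(?!--).)*--\s*)*>", re.DOTALL)


def is_sgml_comment_declaration(text: str) -> bool:
    """Check text against the simplified SGML-style rule above."""
    return SGML_COMMENT_DECL.fullmatch(text) is not None


def is_single_block_comment(text: str) -> bool:
    """Check text against the hypothetical rule: '<!--' up to the first '-->'."""
    if len(text) < 7 or not text.startswith("<!--"):
        return False
    # The first "-->" after the opener must also be the very end of the text.
    return text.find("-->", 4) == len(text) - 3


if __name__ == "__main__":
    samples = [
        "<!-- first comment -- -- second comment -->",
        "<!-- http://i--i.com/ -->",
        "<!-- -------------------- -->",
        "<!-- ---------------------- -->",
    ]
    for sample in samples:
        print(is_sgml_comment_declaration(sample),
              is_single_block_comment(sample), sample)
```

Under the first rule the i--i.com sample is rejected, and a declaration
made of nothing but dashes ("<!" + dashes + ">") passes only when the
dash count is a multiple of four. The second rule accepts all four
samples.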

Finally, using exactly the same sequence to _start_ and _end_ a comment  
("--" in both cases) seems like another shot in the foot.

I'm not saying that's not the way SGML behaves, I'm just saying it's a  
stupid way to behave, for the three reasons mentioned above.

I can't say it's ever caused me any problems (I rarely use HTML comment  
blocks; most of my pages are generated by some script, so I put the  
comments there), but Jiku's example is a glaring one. If you have, in your  
code, a link (or any reference) to "http://i--i.com/", you can't comment  
out that section of the markup. The same goes for any text where that  
sequence (--) appears as part of the actual page. You comment out a  
section (to see what the page looks like without it), and you've just  
created invalid code.
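
As a rough illustration, a pre-check like the one below could tell you
whether a fragment is safe to wrap in <!-- ... --> before you do it. The
function name is hypothetical, and it assumes the strict rule discussed
in this thread: no "--" anywhere in the comment body, and no "-" right
before the closing "-->".

```python
def can_comment_out(fragment: str) -> bool:
    """Return True if '<!--' + fragment + '-->' keeps '--' out of the body."""
    # Assumed rule: the comment body may not contain "--" and may not end
    # with "-" (which would turn the closing delimiter into "--->").
    return "--" not in fragment and not fragment.endswith("-")
```

With this check, a fragment containing a link to http://i--i.com/ is
reported as unsafe, so you know in advance that commenting it out will
produce invalid markup.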

I remember that back in the late 90s there were some initiatives to
correct this in XML, and I had assumed that (in XHTML) everything between
<!-- and --> was treated as part of one comment. Guess not.

RMN
~~~
Received on Tuesday, 23 January 2007 23:15:13 GMT
