Re: Comments in HTML (fwd)

S.N.Brodie@ecs.soton.ac.uk
Wed, 12 Jun 1996 17:20:12 +0100 (BST)


From: S.N.Brodie@ecs.soton.ac.uk
Message-Id: <9908.9606121620@strachey.ecs.soton.ac.uk>
Subject: Re: Comments in HTML (fwd)
To: megazone@livingston.com, www-html@w3.org
Date: Wed, 12 Jun 1996 17:20:12 +0100 (BST)
In-Reply-To: <no.id> from "snb94r" at Jun 11, 96 11:36:26 pm

snb94r wrote:
> 
> MegaZone wrote:

[We were discussing whether comment parsing could be terminated at a >
character in order to support broken HTML.  I had a major memory fault
regarding how my HTML parser handles finding a > without a preceding --
whilst parsing a comment.]

I have now tested a variety of test files with Netscape and verified that
my own browser matches its behaviour, apart from a hack I had added to
allow --!> to terminate comments (which I have now removed) and the
behaviour when no > is found in the document.

The behaviour is:

When you find <!-- remember where you are, and search onwards for the
character sequence -- followed by any amount of whitespace followed by the
> terminator.  This corresponds to the specification, and is adhered to.

However, whilst skipping over the comment, the location of the *first* >
character is noted (and discarded later if a correct terminator is found).

If no correct terminator is found (i.e. you reach EOF), the input stream
is rewound to the location of that '>' character and parsing is restarted
from the next character.  If you never found a >, then return to the
original < and treat it as if it had been written &lt;
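
For anyone who wants to experiment, the rules above can be sketched roughly
as follows.  This is only an illustration in Python, not the code from my
browser or from Netscape; the names COMMENT_END, skip_comment and
strip_comments are made up for the example, and it deals with comments
only, passing every other character through untouched.

```python
import re

# Correct terminator: "--", any amount of whitespace, then ">".
COMMENT_END = re.compile(r"--\s*>")


def skip_comment(text, start):
    """text[start:] begins with "<!--".

    Returns (resume_index, literal_lt): where parsing should continue,
    and whether the original "<" must be treated as if written &lt;.
    """
    body = start + 4  # first character after "<!--"

    # Case 1: a correct terminator exists, so resume just after it.
    match = COMMENT_END.search(text, body)
    if match:
        return match.end(), False

    # Case 2: EOF reached without a terminator, but a ">" was seen.
    # Rewind to the first ">" and restart from the next character.
    first_gt = text.find(">", body)
    if first_gt != -1:
        return first_gt + 1, False

    # Case 3: no ">" at all.  Go back to the original "<" and emit it
    # as literal text; parsing continues with the character after it.
    return start + 1, True


def strip_comments(text):
    """Return the text that remains once comments are skipped."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("<!--", i):
            i, literal_lt = skip_comment(text, i)
            if literal_lt:
                out.append("<")
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this sketch, "a<!-- x > y" falls into the second case and resumes
after the first >, while "Start <!-- Comment" falls into the third case
and comes out unchanged.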

In fact Netscape 1.22 renders the following as:    Start <!C- Comment

<title>test<body>Start <!-- Comment

so that's obviously broken, but hardly a major fault and it's in an old
version.

-- 
Stewart Brodie, Electronics & Computer Science, Southampton University.
http://www.ecs.soton.ac.uk/~snb94r/      http://delenn.ecs.soton.ac.uk/