Re: Tidy of Script contents desc += "</td></table>";

On Mon, 4 Sep 2000, Rzepa, Henry wrote:

> The August  7 Tidy does the following
> 
> <script type="text/javascript" language="JavaScript">
> desc += "</td></table>";
> </script>
> 
> with the warning
> 
>  line 7 column 16 - Warning: '<' + '/' + letter not allowed here
> 
> and writes out ie
> 
> desc += "<\/td><\/table>";
> 
> Because of course it does not detect e.g. the <td>, formally it's correct,
> but in practice of course the <td> is written out elsewhere.
> 
> Can one suppress this behaviour?  Is it in fact correct? 

The relevant section in the HTML specification (4.01) is at
<http://www.w3.org/TR/html4/types.html#type-cdata>:

| Although the STYLE and SCRIPT elements use CDATA for their data model, for
| these elements, CDATA must be handled differently by user agents. Markup and
| entities must be treated as raw text and passed to the application as is. The
| first occurrence of the character sequence "</" (end-tag open delimiter) is
| treated as terminating the end of the element's content. In valid documents,
| this would be the end tag for the element.

The problem here is the sequence </ itself, which some programs will treat
as the end of the element whether or not the name that follows is actually
SCRIPT. It is not specifically that the end tag appears to have no matching
start tag.
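
To make that concrete, here is a rough sketch of my own (the function name
cdataEndIndex is made up; this is not code from Tidy or from any browser).
It assumes a parser that applies the quoted rule literally and reports where
such a parser would stop reading the SCRIPT content:

```javascript
// Hypothetical helper (not from Tidy or any browser): report where a parser
// that follows the HTML 4.01 rule literally would stop reading SCRIPT content.
// It ends the content at the first "</", regardless of what follows.
function cdataEndIndex(scriptText) {
  return scriptText.indexOf("</"); // -1 means no premature terminator
}

const unescaped = 'desc += "</td></table>";';
const escaped = 'desc += "<\\/td><\\/table>";'; // source text containing <\/

console.log(cdataEndIndex(unescaped)); // 9: content would be cut before </td>
console.log(cdataEndIndex(escaped));   // -1: nothing to trip over
```

Real browsers are generally more forgiving than this and look for </script
specifically, but a document should not rely on that.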

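Tidy's behaviour is therefore correct for HTML 4.01 documents: writing <\/
keeps the sequence </ out of the SCRIPT content, and JavaScript reads \/ in
a string literal as a plain /, so the script itself is unchanged. (XHTML is
stricter: there the content of script and style is #PCDATA, so < and & have
to be written as &lt; and &amp;, or the content wrapped in a CDATA section.)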
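
If you want to apply the same escaping by hand, something like the following
would do it. This is only a sketch under my own assumptions: the helper name
escapeEndTagOpen is invented, it is not how Tidy is implemented, and it
blindly rewrites every </ in the text, which is harmless inside ordinary
string literals but has not been checked against every JavaScript construct.

```javascript
// Hypothetical helper, not Tidy's code: rewrite every "</" in a script's
// source text as "<\/" so the text no longer contains the end-tag open
// delimiter.
function escapeEndTagOpen(scriptText) {
  return scriptText.replace(/<\//g, "<\\/");
}

const originalSource = 'desc += "</td></table>";';
const rewrittenSource = escapeEndTagOpen(originalSource);

console.log(rewrittenSource); // desc += "<\/td><\/table>";

// Inside a JavaScript string literal, \/ is just /, so the two spellings
// produce the same string at run time.
console.log("<\/td>" === "</td>"); // true
```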

-- 
Daniel Biddle <deltab@osian.net>

Received on Monday, 4 September 2000 04:12:18 UTC