
Re: Tidy changes javascript code

From: <html-tidy@war-of-the-worlds.org>
Date: Sun, 25 Jun 2000 21:15:27 -0500
Message-Id: <p04320400b57c6a2512c8@[]>
To: html-tidy@w3.org
"Thomas Appel" <thomas.appel@arcormail.de> wrote:

>Dear Sir,
>would you please tell me, why tidy (version of 13th january 2000) changes
>the following java script line
>parent.FRAME2.document.write("</HEAD><BODY><H1>" + Titel +
>to
>parent.FRAME2.document.write("<\/HEAD><BODY><H1>" + Titel +
>adding a backslash in front of every slash in the html-tags. I could not
>find this syntax rule in any of my html-books.

Here we go again.

SCRIPT content is declared as CDATA.  As a general SGML rule, the parsing
of CDATA content is terminated by an ETAGO (End Tag Open) sequence: "<" +
"/" + an alphabetic character.  Otherwise, how would a parser know, in a
general sense, where the CDATA ends and where to look for an end tag?
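
To make the rule concrete, here is a rough sketch of the ETAGO test applied
to the line from the question.  findEtago is a name made up for this
illustration; it is not part of Tidy or of any SGML parser, and it only
models the "<" + "/" + letter rule described above.

```javascript
// Hypothetical helper: index of the first ETAGO ("</" followed by a letter)
// in a run of CDATA, or -1 if there is none.
function findEtago(cdata) {
  return cdata.search(/<\/[A-Za-z]/);
}

// The script line from the question, as raw text handed to the parser.
const line = 'parent.FRAME2.document.write("</HEAD><BODY><H1>" + Titel +';

// Everything before the first ETAGO is all a strict parser would treat
// as script content.
const cut = findEtago(line);
console.log(line.slice(0, cut)); // parent.FRAME2.document.write("
```

Under that rule the script content stops at the opening quote, well before
the author intended.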

A SCRIPT must only be terminated by </SCRIPT>.  That end tag itself starts
with an ETAGO, which is why ETAGO has to be recognized in general for any
CDATA element, including both SCRIPT and STYLE.  The presence of any other
ETAGO sequence inside the element is therefore by definition an error.  The
easiest way to eliminate premature ETAGO sequences is to escape an interior
character of the sequence.  The slash is a natural choice.  Thus the
introduction of the backslashes before the slashes.
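
As an illustration only, the escaping step could be sketched like this.
escapeEtago is a hypothetical function written for this message, not Tidy's
actual code, and it assumes it is handed just the text between <SCRIPT> and
</SCRIPT>, never the real end tag itself.

```javascript
// Hypothetical helper, not Tidy's implementation: insert a backslash between
// "<" and "/" wherever the pair is followed by a letter (an ETAGO).
function escapeEtago(scriptSource) {
  return scriptSource.replace(/<\/(?=[A-Za-z])/g, "<\\/");
}

const before = 'document.write("</HEAD><BODY><H1>" + Titel + "</H1>");';
const after = escapeEtago(before);

console.log(after);
// document.write("<\/HEAD><BODY><H1>" + Titel + "<\/H1>");
```

Start tags such as <BODY> and <H1> are left alone, since only "</" followed
by a letter forms an ETAGO.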

Consider what would happen if you had:


Just as a script would be terminated by the instance of </SCRIPT>, so it
would be terminated in a general sense by any ETAGO sequence due to the
general SGML rule.  A general parser doesn't know anything about the syntax
of scripting languages, so won't care that it is in quotes.

Tidy's changes are necessary for compliance, and function is not altered:
inside a JavaScript string literal, "\/" is read as a plain "/".
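
A quick check that the script behaves the same, assuming the ETAGO sits
inside a JavaScript string literal as in the line from the question:

```javascript
// In a JavaScript string literal, the escape "\/" yields a plain "/".
const original = "</HEAD><BODY><H1>";
const tidied = "<\/HEAD><BODY><H1>";

console.log(original === tidied); // true
```

Both literals produce the same string at run time, so document.write
receives identical markup either way.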
Received on Sunday, 25 June 2000 22:15:47 UTC
