Re: Cougar DTD: Do not use CDATA declared content for SCRIPT

From: "Christopher R. Maden" <>
Date: Fri, 26 Jul 1996 22:09:02 GMT
Message-Id: <199607262209.WAA04231@phaser.EBT.COM>
In-reply-to: <> (message from Joe English on Fri, 26 Jul 1996 14:52:25 PDT)
Subject: Re: Cougar DTD: Do not use CDATA declared content for SCRIPT

Joe English:
> Not in a sensible implementation...

Ah, but that's the key, isn't it?

We *must* keep in mind (or else the work of the W3C has little
relevance to its members) that HTML must be parseable by SGML
parsers *and* by heuristic parsers.

If every HTML parser were SGML-based, our problems would be trivial.
Sufficiently sophisticated users could write their own DTDs, declare
their own entities, etc.

> In a structure-controlled SGML implementation, the application never
> sees the "<![ CDATA [" and "]]>" markup; these get swallowed by the
> parser, which would hand the content of the SCRIPT element to the
> application unscathed.  The application would then pass it to an
> appropriate script interpreter based on the value of the LANGUAGE
> attribute.
>     <SCRIPT><![ CDATA [
> 	whatever.whichever("Here goes nothing: ]]>]]&gt;<![ CDATA [");
>     ]]></SCRIPT>
> which will yield:
> 	whatever.whichever("Here goes nothing: ]]>");
> 	_______________________________________^^&___

It's true, a heuristic parser could be trained to discard marked
section boundaries before feeding the contents to any client
processor.  But I think you know how likely that is, coming from
the manufacturers that put scripts inside comments...
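
For concreteness, here is a rough sketch of what "trained to discard
marked section boundaries" might amount to in a non-SGML parser.  It
is an illustration only, not taken from any real browser or parser:
the names strip_marked_sections and decode_entities are hypothetical,
and it assumes that only CDATA marked sections and the three basic
entity references (&lt;, &gt;, &amp;) need handling.

```python
import re

# Assumption: the open delimiter may carry optional whitespace, as in the
# quoted "<![ CDATA [" example; the close delimiter is always "]]>".
MS_OPEN = re.compile(r"<!\[\s*CDATA\s*\[")
MS_CLOSE = "]]>"


def decode_entities(text: str) -> str:
    """Resolve the three basic entity references; &amp; is handled last."""
    return (text.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&amp;", "&"))


def strip_marked_sections(raw: str) -> str:
    """Return SCRIPT content with CDATA marked-section delimiters removed.

    Text inside a marked section is passed through literally; text outside
    one has its entity references resolved, as ordinary parsed content would.
    """
    out = []
    pos = 0
    while True:
        m = MS_OPEN.search(raw, pos)
        if m is None:
            # No further marked sections: the remainder is ordinary content.
            out.append(decode_entities(raw[pos:]))
            break
        # Ordinary content before the marked section.
        out.append(decode_entities(raw[pos:m.start()]))
        end = raw.find(MS_CLOSE, m.end())
        if end == -1:
            # Unterminated section: treat the remainder as literal text.
            out.append(raw[m.end():])
            break
        # Literal content of the marked section, delimiters dropped.
        out.append(raw[m.end():end])
        pos = end + len(MS_CLOSE)
    return "".join(out)


if __name__ == "__main__":
    raw = ('<![ CDATA [\n\twhatever.whichever("Here goes nothing: '
           ']]>]]&gt;<![ CDATA [");\n]]>')
    print(strip_marked_sections(raw).strip())
```

Fed the content of Joe's SCRIPT example, this should hand the script
interpreter the same single line he shows.  The point stands that
someone has to bother writing it.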

<!ENTITY crism PUBLIC "-//EBT//NONSGML Christopher R. Maden//EN" SYSTEM
"<URL> <TEL>+1.401.421.9550 <FAX>+1.401.521.2030
<USMAIL>One Richmond Square, Providence, RI 02906 USA" NDATA SGML.Geek>