Scripts as Comments [Was: Comments in markup?]

Murray Altheim (murray@spyglass.com)
Tue, 30 Jul 1996 11:51:01 -0500


Message-Id: <v0211010aae23e1f64bb5@[140.186.34.50]>
Date: Tue, 30 Jul 1996 11:51:01 -0500
To: Foteos Macrides <MACRIDES@SCI.WFBR.EDU>
From: murray@spyglass.com (Murray Altheim)
Subject: Scripts as Comments [Was: Comments in markup?]
Cc: connolly@w3.org, www-html@w3.org

[This thread seems to have been spawned by the use of comment declarations
as script containers. An ugly solution if ever there was one, IMO.]

Foteos Macrides <MACRIDES@SCI.WFBR.EDU> writes:
>murray@spyglass.com (Murray Altheim) wrote:
>>[...]
>>Comments are only allowed in declarations. And the only declarations within
>>HTML document instances you'll find are comment declarations -- all the
>>rest is considered markup. You'll note that you can't put <!ENTITY ..>
>>declarations within the document instance either.
>
>        Could you elaborate on these issues in relation to marked
>sections and the scripting issue?

I'm assuming you're asking me to elaborate on the following three paragraphs:

>        <!-- is OK in a document instance, and ultimately should end
>with a --> or --white> but can't reliably hide a script which might
>contain -- as a decrementer, or --, -->, etc., as strings in script
>statements.

I'll again point to my documentation on HTML comments to clarify the
correct structure of a comment declaration, as your wording is somewhat
misleading, though on the whole true. Regarding the actual issue here (using
comment declarations to hide scripts), yes, you're correct. I think it
unwise to build a structure that requires authors to keep the syntax of one
language (SGML/HTML) in mind while writing a script in another, unrelated
language (Javascript, perl, etc.); i.e., the occurrence of double hyphens in
HTML is not something one should be thinking about when writing in
Javascript. It is not very elegant and seems quite a hack.
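
To make the double-hyphen problem concrete, here is a minimal sketch in
Python. It is not taken from any browser, and the function names are made
up for illustration. It compares a naive scan for "-->" with a simplified
reading of the SGML rule: a comment declaration is "<!", then zero or more
"-- ... --" comments, each optionally followed by white space, then ">".

```python
def naive_comment_end(text):
    """Index just past the first "-->", or -1 if there is none."""
    pos = text.find("-->")
    return -1 if pos == -1 else pos + 3


def sgml_comment_end(text):
    """Index just past a comment declaration at the start of text.

    Simplified rule (an assumption of this sketch): "<!", then zero or
    more "-- ... --" comments, each optionally followed by white space,
    then ">". Returns -1 when the text does not fit that pattern.
    """
    if not text.startswith("<!"):
        return -1
    i = 2
    while i < len(text):
        if text.startswith("--", i):
            close = text.find("--", i + 2)  # end of this comment
            if close == -1:
                return -1
            i = close + 2
            while i < len(text) and text[i].isspace():
                i += 1
        elif text[i] == ">":
            return i + 1
        else:
            return -1  # anything else between comments is malformed
    return -1


plain = "<!-- x = 1; -->"
decrement = "<!-- i--; -->"
in_string = '<!-- x = "-- >"; y -->'

for sample in (plain, decrement, in_string):
    print(naive_comment_end(sample), sgml_comment_end(sample))
# 15 15   both agree on the plain case
# 13 -1   the decrement makes the declaration malformed
# 22 14   the "-- >" inside the string closes the declaration early
```

Under these assumptions, the script author has to avoid "--" anywhere in
the script for the two readings to agree, which is the burden described
above.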

>        <SCRIPT ...>...</SCRIPT> is almost OK with "ad hoc" parsers
>that just look for the end tag, but those still could get tripped
>up if </SCRIPT> were in a script statement and not really the end
>tag, and it seems basically to have the same problem for real SGML
>parsers as do PLAINTEXT and XMP.
>
>        <![[ also has the problem that a script statement might include
>what looks like its terminator.

You'll always be dealing with backward compatibility. This should properly
be handled by server-side content negotiation.

I think it's simply a matter here of requiring script-capable browsers to
understand SGML marked section syntax. Vendors are busy modifying browsers
to handle Javascript; this is simply part of that solution. Marked sections
are not difficult to parse, and supporting them would enable their use for
other legitimate SGML structures as well. And it is, at least in theory,
something HTML should allow if it truly calls itself 8879-compliant. I think
the incidence of "]]>" occurring in Javascript or perl is sufficiently rare
not to be much of a problem. Has ANYBODY run across a real-world need to
ever use those three characters in a script?
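
As a rough illustration of what such a parse involves, here is a minimal
Python sketch of locating a CDATA marked section and its terminator. The
function name and the sample scripts are invented for this example, and the
opener pattern is a simplification of the SGML syntax.

```python
import re

# Assumed delimiters: "<![" CDATA "[" opens, the first "]]>" closes.
MS_OPEN = re.compile(r"<!\[\s*CDATA\s*\[")
MS_CLOSE = "]]>"


def marked_section_body(text):
    """Return the text between the opener and the first "]]>", or None."""
    opener = MS_OPEN.search(text)
    if opener is None:
        return None
    end = text.find(MS_CLOSE, opener.end())
    if end == -1:
        return None
    return text[opener.end():end]


safe = "<![CDATA[if (a > b) { i--; }]]>"
tricky = "<![CDATA[if (a[b[0]]>1) { i--; }]]>"

print(marked_section_body(safe))    # if (a > b) { i--; }
print(marked_section_body(tricky))  # if (a[b[0
```

The second sample is a constructed case (a nested array index followed
directly by a greater-than comparison) and says nothing about how often it
turns up in practice. Writing it as "a[b[0]] > 1" keeps the three characters
apart, while a decrement such as "i--" needs no special care at all.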

>        So isn't it in fact true that the ONLY way to include script
>code "100%" safely in an HTML document instance is as an encoded (hex
>or BASE64) attribute value?

No, I think marked sections are much simpler. And Dan's suggestion of
properly labelling the NOTATION is right on target. On a related note, I do
think the TYPE attribute on STYLE should be changed to NOTATION (as in
other SGML applications), and here as well this seems appropriate. You
might find documents containing both VB and Javascript scripts, and each
container needs to be marked. Using comments to contain scripts allows only
a hack for container labelling as well.
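
For comparison, the encoded-attribute route raised in the question can be
sketched in a few lines of Python. The attribute name CODE and the helper
names are hypothetical, chosen only for this example. The point is that the
standard BASE64 alphabet contains none of the characters "-", "<", ">", "]"
or the double quote, so no delimiter can appear in the value, at the cost of
a script nobody can read in the source.

```python
import base64


def encode_script(script):
    """Encode script text as BASE64 suitable for an attribute value."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def decode_script(value):
    """Recover the original script text from the attribute value."""
    return base64.b64decode(value).decode("utf-8")


script = 'if (a[b[0]]>1) { i--; document.write("-->"); }'
encoded = encode_script(script)

# CODE is a hypothetical attribute name, not one defined by any HTML DTD.
tag = '<SCRIPT CODE="' + encoded + '"></SCRIPT>'
print(tag)
print(decode_script(encoded) == script)  # True
```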

Daniel W. Connolly <connolly@w3.org> writes:
>An alternative would be to define some NOTATIONS in the HTML DTD, and
>only refer to them in the instances. But then the HTML DTD becomes a
>centralized list of script languages -- it would need to be modified
>every time a new scripting language was deployed.

Given that the vendors involved in providing support for those scripting
languages are part of W3C, this list is pretty fixed for now. By the time a
new NOTATION is needed, I'd think a new DTD could be out in support of it.
New scripting languages also require *hundreds* of confusing how-to books,
and these take at least a few weeks to write...  :-)

I think graceful deployment will continue to elude us until we fully
support much more of 8879. I can think of other places this is true as
well.

Murray

```````````````````````````````````````````````````````````````````````````````
     Murray Altheim, Program Manager
     Spyglass, Inc., Cambridge, Massachusetts
     email: <mailto:murray@spyglass.com>
     http:  <http://www.stonehand.com/murray/murray.html>
            "Give a monkey the tools and he'll eventually build a typewriter."