Re: Running "Zombie" Script Elements from Jonas Sicking on 2009-06-11 (public-html@w3.org from June 2009)

From: Jonas Sicking <jonas@sicking.cc>
Date: Thu, 11 Jun 2009 16:52:42 -0700
To: Ian Hickson <ian@hixie.ch>
Cc: Travis Leithead <Travis.Leithead@microsoft.com>, "public-html@w3.org" <public-html@w3.org>, Harley Rosnow <Harley.Rosnow@microsoft.com>, Kirk Sykora <ksykora@microsoft.com>
Message-ID: <63df84f0906111652k7204fe01j1ea91e851be25d10@mail.gmail.com>

On Thu, Jun 11, 2009 at 12:03 PM, Ian Hickson<ian@hixie.ch> wrote:
> On Sat, 23 May 2009, Jonas Sicking wrote:
>> On Fri, May 22, 2009 at 11:36 PM, Ian Hickson <ian@hixie.ch> wrote:
>> > On Fri, 22 May 2009, Travis Leithead wrote:
>> >>
>> >> <body>
>> >>  <div>
>> >>   <span>
>> >>    <script>
>> >>     var d = document.querySelector("body > div");
>> >>     d.parentNode.removeChild(d);
>> >>    </script>
>> >>    <code>
>> >>     <script>
>> >>      alert("a running zombie script?");
>> >>     </script>
>> >>    </code>
>> >>   </span>
>> >>  </div>
>> >> </body>
>> >
>> > The second script gets executed by the "Run the script." sentence in
>> > 9.2.5.11 The "in CDATA/RCDATA" insertion mode, under "An end tag whose
>> > tag name is "script"".
>> >
>> > Basically when a <script> element is handled by the parser, it gets
>> > parsed regardless of what the DOM looks like.
>>
>> Is there a reason for things to be designed this way?
>
> It's done this way because when this element is inserted into the DOM,
> it's empty, so we can't execute it yet. So it has to be special-cased --
> either by making the element not be inserted into the document until the
> end tag is seen or implied, or by making the script handling be a special
> case. It turns out that for a variety of reasons, the latter is
> significantly easier and helps with other things as well (such as
> defining exactly how document.write() interacts with the parser, which
> requires a special case here anyway).

So implementation-wise, the script element will need some special
flag that tells it not to execute when it otherwise normally would.
The implementation works something like this:

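// Every change that could make the script runnable goes through
// the same check.
scriptElement::onTextContentChange()
{
  maybeEvalScript();
}

scriptElement::onDocumentChange()
{
  maybeEvalScript();
}

scriptElement::onAttrChange()
{
  maybeEvalScript();
}

scriptElement::maybeEvalScript()
{
  // Run at most once, never while the parser is still building the
  // element, and only when in a document with something to run.
  if (!didEvaluate && !wasCreatedByParser && isInDocument() &&
      (getTextContent() != "" || hasSrcAttribute)) {
    didEvaluate = true;
    <evaluate script>;
  }
}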

Once the parser is done creating the script, it'll call some function
on the script to tell it to go. Whether that function simply unsets
the wasCreatedByParser flag and calls maybeEvalScript, or evaluates
the script directly, doesn't really make a difference in terms of
implementation complexity.
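
To make the comparison concrete, here is a minimal C++ sketch of the
two options. It is only an illustration under stated assumptions: the
names ScriptElement, parserDoneRecheck, parserDoneForceRun and
runCount are made up for this example and are not taken from any
browser, and evaluate() just counts runs instead of executing
anything.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a script element; not any real browser's class.
class ScriptElement {
public:
    explicit ScriptElement(bool createdByParser)
        : wasCreatedByParser_(createdByParser) {}

    // Each mutation re-runs the same check, as in the pseudocode above.
    void setTextContent(const std::string& text) {
        text_ = text;
        maybeEvalScript();
    }
    void setInDocument(bool inDocument) {
        inDocument_ = inDocument;
        maybeEvalScript();
    }
    void setHasSrcAttribute(bool hasSrc) {
        hasSrc_ = hasSrc;
        maybeEvalScript();
    }

    // Option A: the parser clears the flag and the normal checks apply,
    // so a script that was removed from the document does not run.
    void parserDoneRecheck() {
        assert(wasCreatedByParser_);
        wasCreatedByParser_ = false;
        maybeEvalScript();
    }

    // Option B: the parser runs the script directly, whether or not it
    // is still in the document (the "zombie" case from this thread).
    // Simplification: this sketch does not check for an empty script.
    void parserDoneForceRun() {
        assert(wasCreatedByParser_);
        wasCreatedByParser_ = false;
        if (!didEvaluate_) {
            evaluate();
        }
    }

    int runCount() const { return runCount_; }

private:
    void maybeEvalScript() {
        if (!didEvaluate_ && !wasCreatedByParser_ && inDocument_ &&
            (!text_.empty() || hasSrc_)) {
            evaluate();
        }
    }

    // Stand-in for real script execution: record that a run happened.
    void evaluate() {
        didEvaluate_ = true;
        ++runCount_;
    }

    bool wasCreatedByParser_;
    bool didEvaluate_ = false;
    bool inDocument_ = false;
    bool hasSrc_ = false;
    std::string text_;
    int runCount_ = 0;
};
```

In this sketch the two hooks differ only in their last step. For a
parser-created script that an earlier script has already removed from
the document, parserDoneRecheck() leaves runCount() at 0, while
parserDoneForceRun() brings it to 1.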

So we'd never need to keep the script out of the DOM until its end
tag is seen (that would be more complex all around, I'd think).

Implementation-wise the difference is very small: it comes down to
choosing which of two functions to call.

As stated before, I don't have a strong preference either way. Good
consistency arguments can be made for both, and implementation-wise
both solutions are equally simple.

/ Jonas

Received on Thursday, 11 June 2009 23:53:40 UTC