- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Mon, 17 Nov 2008 17:56:07 +0200
- To: Ian Hickson <ian@hixie.ch>
- Cc: "public-html@w3.org WG" <public-html@w3.org>
On Jun 17, 2008, at 23:58, Ian Hickson wrote:

>> Aside: I find the concept of "insertion point" in a stream to be harder
>> to track than a concept of a stack of pending streams where each
>> document.write() pushes a new stream onto the pending stack.
>
> I don't know if a stack can be equivalent to the insertion point
> concept.

Right. A stack is insufficient.

> It depends whether you keep track of how much you have tokenised for each
> item in your stack, and whether you can append to an item on the stack.
>
> Consider:
>
>    <script>
>     document.write("a<script src=b><\/script>c");
>     document.write("d");
>    </script>...
>
> When the inline script is about to be done executing, the input stream
> looks like:
>
>                                     v v
>    ...ript>a<script src=b></script> cd ...
>                                     ^ ^
>                                     T I
>
> ...where T is the tokeniser's position ("c" is the "next input character")
> and I is the insertion point. However as soon as it is done executing the
> UA will pause for 'b', and if b does a document.write() it'll go where "T"
> is, not where "I" is.

OK. Would the following work?

There's a queue of UTF-16 buffers and keyed placeholders. That is, there's one queue that contains an interleaving of objects that are UTF-16 buffers or objects holding a magic key value. The buffers have a start position that the tokenizer advances. A buffer can be partially consumed, have its start position advanced accordingly and be left in the queue for further consumption later.

The normal tokenization process consumes data from the front of the queue. When a buffer is empty, it is dequeued and the next buffer is consumed. Objects holding magic key values count as empty buffers for the purpose of dequeuing. Exception: There's always at least one buffer object in the queue and the last buffer is never dequeued. Instead, it is left in the queue when it is empty.

The network stream always adds data to the last buffer or appends a new buffer to the queue.
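To make the queue described above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Gecko's or any parser's actual code: the names Buffer, KeyHolder, InputQueue, append_from_network and next_char are all hypothetical, and a Python str stands in for a UTF-16 buffer.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    """Hypothetical stand-in for a UTF-16 buffer with a start position."""
    text: str
    start: int = 0  # advanced by the tokenizer as characters are consumed

    def is_empty(self) -> bool:
        return self.start >= len(self.text)


@dataclass
class KeyHolder:
    """Hypothetical placeholder object holding a magic key value."""
    key: object


class InputQueue:
    def __init__(self) -> None:
        # Assumption: the parser initializer puts one (empty) buffer in
        # the queue, so there is always at least one buffer object.
        self.items: deque = deque([Buffer("")])

    def append_from_network(self, data: str) -> None:
        # The network stream adds data to the last buffer. (Appending a
        # new Buffer to the end would be equally valid.)
        self.items[-1].text += data

    def next_char(self) -> Optional[str]:
        """Consume one character from the front of the queue, or return
        None when nothing is available yet."""
        while True:
            head = self.items[0]
            if isinstance(head, Buffer) and not head.is_empty():
                ch = head.text[head.start]
                head.start += 1  # partial consumption: advance the start
                return ch
            if len(self.items) == 1:
                # The last buffer is never dequeued, even when empty.
                return None
            # Empty buffers are dequeued; key holders count as empty.
            self.items.popleft()
```

For example, if a key holder and a one-character buffer sit in front of network data "ab", repeated calls to next_char() yield the buffer's character, skip the key holder, yield "a" and "b", and then return None while leaving the exhausted last buffer in place for the network to append to.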
Each document.write call to the parser comes with a magic key value. The magic key is guaranteed to be the same for all document.write calls from a given script and different for different scripts within a document.

On document.write, if there is a pending external script, the queue is searched for a magic key holder with the same key value as the document.write call. If there is such an object in the queue, the text of the document.write call is inserted as a UTF-16 buffer into the queue immediately before the key holder object. If there's no such object in the queue, a key holder with the key for this document.write call is inserted at the front of the queue and then the text is inserted as a UTF-16 buffer in front of that key holder.

If there's no pending external script, the tokenization of the text argument is attempted immediately with parser suspension for event loop spins disabled. If the tree builder causes the parser to block and there are untokenized characters in the text argument, the untokenized tail of the argument is treated as in the previous paragraph.

Invariant: The last buffer of the queue is always a buffer that was put in the queue by the parser initializer or by the method that appends data from the network. The last object in the queue never holds a magic key value.

(The motivation for not using the same concepts as the spec is that magic keys are the mechanism Gecko already provides for managing the context of document.writes, and this queuing mechanism never requires moving UTF-16 data once it has been written into a buffer.)

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Monday, 17 November 2008 15:57:00 UTC