- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Thu, 3 Nov 2011 13:21:32 +0200
On Thu, Nov 3, 2011 at 1:57 AM, David Flanagan <dflanagan at mozilla.com> wrote:
> Firefox, Chrome and Safari all seem to do the right thing: wait for the next
> character before tokenizing the CR.

See http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1247

Firefox tokenizes the CR immediately, emits an LF and then skips over the
next character if it is an LF. When I designed the solution Firefox uses, I
believed it was more correct and more compatible with legacy than whatever
the spec said at the time.

Chrome seems to wait for the next character before tokenizing the CR.

> And I think this means that the description of document.write needs to be changed.

All along, I've felt that having U+0000 and CRLF handling as a stream
preprocessing step was bogus and that both should happen upon tokenization.
So far, I've managed to convince Hixie about U+0000 handling.

> Similarly, what should the tokenizer do if the document.write emits half of
> a UTF-16 surrogate pair as the last character?

The parser operates on UTF-16 code units, so a lone surrogate is emitted.

--
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/
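
To illustrate the Firefox behaviour described above (tokenize the CR immediately, emit an LF, then skip the next character if it is an LF), here is a minimal Python sketch. The class and method names are hypothetical and are not taken from any browser's source. It models only the newline handling across separate writes, not the tokenizer itself.

```python
class EagerNewlineNormalizer:
    """Hypothetical model: turn CR and CRLF into LF without lookahead."""

    def __init__(self):
        # True when the previous character was a CR already emitted as LF.
        self.last_was_cr = False

    def feed(self, chunk: str) -> str:
        """Process one chunk (for example, one document.write call)."""
        out = []
        for ch in chunk:
            if ch == "\r":
                # Emit LF right away; do not wait for the next character.
                out.append("\n")
                self.last_was_cr = True
                continue
            if ch == "\n" and self.last_was_cr:
                # Second half of a CRLF pair: the LF was already emitted.
                self.last_was_cr = False
                continue
            self.last_was_cr = False
            out.append(ch)
        return "".join(out)
```

Because the only state kept is a single flag, a CR at the end of one chunk followed by an LF at the start of the next still yields exactly one LF, and no input has to be held back while waiting for more data.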
Received on Thursday, 3 November 2011 04:21:32 UTC