- From: Sam Ruby <rubys@us.ibm.com>
- Date: Thu, 11 Dec 2008 05:39:05 -0500
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: HTML WG <public-html@w3.org>
- Message-ID: <OF03AFCFB8.F1E1B6F3-ON8525751C.003937F4-8525751C.003A82B4@us.ibm.com>
Henri Sivonen <hsivonen@iki.fi> wrote on 12/10/2008 06:45:13 PM:

>
> document.write is not the problem here. There's a problem with
> speculating past <svg> ... <style> even when no document.writes occur.
>
> If we arrive at <style> without seeing <svg> or <math> before it, we
> know for sure that the tokenizer goes into CDATA variant of the data
> state next. However, if we see a <style> start tag after having seen
> <svg> or <math>, we don't (trivially) know if actually performing the
> tree building would have bailed out of foreign content before reaching
> <style>. Therefore, we don't know if the tokenizer should go into the
> CDATA or PCDATA variant of the data state for continued speculation.
>
> The obvious course of action is to stop saving the tokens from that
> point onwards even if still looking for more src values to GET with
> less accuracy, but it would be nice to be able to do better.

Speculative evaluation of instruction streams on a modern CPU, given the
presence of conditional branch instructions, doesn't mean determining with
certainty the correct path every time; it simply means getting it right
enough of the time to make a difference.

Even if you can't reliably determine if you "would have bailed out", you
might be able to do better than the rather pessimistic approach mentioned
above. Considerably better.

The design of HTML 5 is focused on robustness, even in the face of errors,
and even if those errors are relatively infrequent.

A simple approximation (<svg> or <math> starts foreign content; </svg> and
</math> stop foreign content) may be right enough of the time to make a
difference.

You still would have to decide what to do with nesting, and how to detect
whether the prediction was incorrect (i.e., any time after the tree builder
bails even once, it must stop trusting the token stream at the point it
encounters a <style> tag).

> --
> Henri Sivonen
> hsivonen@iki.fi
> http://hsivonen.iki.fi/

- Sam Ruby
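
The simple approximation described in the reply could be sketched roughly as follows. This is only an illustration under stated assumptions, not how any browser's speculative parser works: the names guess_style_modes, FOREIGN_ROOTS and TAG_RE are hypothetical, a regular expression stands in for a real tokenizer, and nesting is handled with a plain depth counter, which is one possible answer to the nesting question. Detecting a wrong prediction after the tree builder bails out is not covered.

```python
import re

# Hypothetical names throughout; a regex scan stands in for a real tokenizer.
FOREIGN_ROOTS = {"svg", "math"}
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>")


def guess_style_modes(html):
    """Guess the tokenizer mode that follows each <style> start tag.

    Approximation: <svg> or <math> starts foreign content, and </svg> or
    </math> stops it. A <style> seen while the counter is above zero is
    guessed to be followed by PCDATA; otherwise by CDATA.
    """
    depth = 0   # assumed nesting policy: count open <svg>/<math> elements
    modes = []
    pos = 0
    while True:
        match = TAG_RE.search(html, pos)
        if match is None:
            break
        pos = match.end()
        closing = match.group(1) == "/"
        name = match.group(2).lower()
        self_closing = match.group(3) == "/"

        if name in FOREIGN_ROOTS:
            if closing:
                depth = max(0, depth - 1)
            elif not self_closing:
                depth += 1
        elif name == "style" and not closing:
            if depth > 0:
                modes.append("PCDATA")
            else:
                modes.append("CDATA")
                # In the CDATA guess the style body is raw text, so skip
                # ahead to the matching end tag instead of scanning it.
                end = html.lower().find("</style", pos)
                pos = len(html) if end == -1 else end
    return modes


if __name__ == "__main__":
    sample = "<svg><style>a{}</style></svg><style>b{}</style>"
    print(guess_style_modes(sample))
```

A depth counter is the simplest policy that copes with nested <svg> elements; a speculative parser using a guess like this would still need to discard its saved tokens whenever the real tree builder disagrees.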
Received on Thursday, 11 December 2008 11:08:24 UTC