- From: Dimitri Glazkov <dglazkov@chromium.org>
- Date: Thu, 9 Feb 2012 10:35:14 -0800
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: public-webapps <public-webapps@w3.org>, Adam Barth <w3c@adambarth.com>, Ian Hickson <ian@hixie.ch>, Rafael Weinstein <rafaelw@google.com>
On Wed, Feb 8, 2012 at 11:25 PM, Henri Sivonen <hsivonen@iki.fi> wrote:
> On Thu, Feb 9, 2012 at 12:00 AM, Dimitri Glazkov <dglazkov@chromium.org> wrote:
>> == IDEA 1: Keep template contents parsing in the tokenizer ==
>
> Not this!
>
> Here's why:
> Making something look like markup but then not tokenizing it as markup
> is confusing. The confusion leads to authors not having a clear mental
> model of what's going on and where stuff ends. Trying to make things
> just work for authors leads to even more confusing "here be dragons"
> solutions. Check out
> http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#script-data-double-escaped-dash-dash-state
>
> Making something that looks like markup but isn't tokenized as markup
> also makes the delta between HTML and XHTML greater. Some people may
> be ready to throw XHTML under the bus completely at this point, but
> this also goes back to the confusion point. Apart from namespaces, the
> mental model you can teach for XML is remarkably sane. Whenever HTML
> deviates from it, it's a complication in the understandability of
> HTML.
>
> Also, multi-level parsing is in principle bad for perf. (How bad
> really? Dunno.) I *really* don't want to end up writing a single-pass
> parser that has to be black-box indistinguishable from something
> that's defined as a multi-pass parser.
>
> (There might be a longer essay about how this sucks in the public-html
> archives, since the SVG WG proposed something like this at one point,
> too.)

This makes sense. As an aside, this is also why implementing templates
as a script tag is a bad idea.

>
>> == IDEA 2: Just tweak insertion modes ==
>
> I think a DWIM insertion mode that switches to another mode and
> reprocesses the token upon the first start tag token *without* trying
> to return to the DWIM insertion mode when the matching end tag is seen
> for the start tag that switched away from the DWIM mode is something
> that might be worth pursuing. If we do it, I think we should make it
> work for a fragment parsing API that doesn't require context beyond
> assuming HTML, too. (I think we shouldn't try to take the DWIM so far
> that a contextless API would try to guess HTML vs. SVG vs. MathML.)

Sounds like a good direction to explore. I'll play with this.

> The violation of the Degrade Gracefully principle and tearing the
> parser spec open right when everybody converged on the spec worry me,
> though. I'm still hoping for a design that doesn't require parser
> changes at all and that doesn't blow up in legacy browsers (even
> better if the results in legacy browsers were sane enough to serve as
> input for a polyfill).

Yeah, Adam expressed a similar concern. I must admit, before digging
into HTML parsing, I was a bit more bullish about making this "just
work". It turns out there's a delicate balance between reality and the
proper solution. I am still optimistic we can find something that both
doesn't look like a gross hack and degrades well in most cases.

:DG<

>
> --
> Henri Sivonen
> hsivonen@iki.fi
> http://hsivonen.iki.fi/
Received on Thursday, 9 February 2012 18:35:42 UTC