
Re: [webcomponents] HTML Parsing and the <template> element

From: Dimitri Glazkov <dglazkov@chromium.org>
Date: Mon, 2 Apr 2012 15:21:08 -0700
Message-ID: <CADh5Ky2_C0ZBtQxOE0jmGsr6yAKKV5LMFyXqnmdF=78hDqZ-9w@mail.gmail.com>
To: Henri Sivonen <hsivonen@iki.fi>
Cc: public-webapps <public-webapps@w3.org>, Adam Barth <w3c@adambarth.com>, Ian Hickson <ian@hixie.ch>, Rafael Weinstein <rafaelw@google.com>, Erik Arvidsson <arv@google.com>, Yehuda Katz <wycats@gmail.com>
On Wed, Feb 8, 2012 at 11:25 PM, Henri Sivonen <hsivonen@iki.fi> wrote:
> On Thu, Feb 9, 2012 at 12:00 AM, Dimitri Glazkov <dglazkov@chromium.org> wrote:
>> == IDEA 1: Keep template contents parsing in the tokenizer ==
> Not this!
> Here's why:
> Making something look like markup but then not tokenizing it as markup
> is confusing. The confusion leads to authors not having a clear mental
> model of what's going on and where stuff ends. Trying to make things
> just work for authors leads to even more confusing "here be dragons"
> solutions. Check out
> http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#script-data-double-escaped-dash-dash-state
> Making something that looks like markup but isn't tokenized as markup
> also makes the delta between HTML and XHTML greater. Some people may
> be ready to throw XHTML under the bus completely at this point, but
> this also goes back to the confusion point. Apart from namespaces, the
> mental model you can teach for XML is remarkably sane. Whenever HTML
> deviates from it, it's a complication in the understandability of
> Also, multi-level parsing is in principle bad for perf. (How bad
> really? Dunno.) I *really* don't want to end up writing a single-pass
> parser that has to be black-box indistinguishable from something
> that's defined as a multi-pass parser.
> (There might be a longer essay about how this sucks in the public-html
> archives, since the SVG WG proposed something like this at one point,
> too.)
>> == IDEA 2: Just tweak insertion modes ==
> I think a DWIM insertion mode that switches to another mode and
> reprocesses the token upon the first start tag token *without* trying
> to return to the DWIM insertion mode when the matching end tag is seen
> for the start tag that switched away from the DWIM mode is something
> that might be worth pursuing. If we do it, I think we should make it
> work for a fragment parsing API that doesn't require context beyond
> assuming HTML, too. (I think we shouldn't try to take the DWIM so far
> that a contextless API would try to guess HTML vs. SVG vs. MathML.)
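
To make the IDEA 2 behavior concrete, here is a minimal sketch of a one-way
"DWIM" insertion mode. Everything in it is an assumption for illustration:
the class name, the "dwim" mode label, and the tag-to-mode table are
hypothetical and are not taken from the draft or from the WebKit patch. The
target mode names are borrowed from the existing HTML tree construction
algorithm. The sketch only shows the mode decision; it does not build a tree
or model the mode changes the normal rules would make afterwards.

```python
# Hypothetical label for the "do what I mean" starting mode.
DWIM = "dwim"

# Assumed mapping: first start tag seen -> insertion mode to switch to.
# Anything not listed falls back to "in body".
MODE_FOR_FIRST_TAG = {
    "caption": "in table",
    "colgroup": "in table",
    "thead": "in table",
    "tbody": "in table",
    "tfoot": "in table",
    "col": "in column group",
    "tr": "in table body",
    "td": "in row",
    "th": "in row",
}


class DwimFragmentParser:
    """Tracks only the insertion mode for a contextless HTML fragment."""

    def __init__(self):
        self.mode = DWIM
        self.log = []  # (mode, token kind, tag name) for each token seen

    def start_tag(self, name):
        name = name.lower()
        if self.mode == DWIM:
            # The first start tag picks the real mode; the token is then
            # treated as if it had arrived in that mode ("reprocessed").
            self.mode = MODE_FOR_FIRST_TAG.get(name, "in body")
        self.log.append((self.mode, "start", name))

    def end_tag(self, name):
        # Deliberately no switch back to DWIM when the matching end tag
        # arrives: the mode chosen by the first start tag sticks.
        self.log.append((self.mode, "end", name.lower()))
```

For example, feeding `start_tag("td")`, `end_tag("td")`, `start_tag("div")`
leaves the parser in "in row" throughout: the `div` does not get a fresh
guess, which is the "without trying to return" part of the proposal. Note
that the table covers HTML only, in line with the suggestion not to guess
between HTML, SVG, and MathML.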

Just to connect the threads. A few weeks back, I posted an update
about the HTML Templates spec:

Perhaps lost among other updates was the fact that I've gotten the
first draft of HTML Templates spec out:


The draft is roughly two parts: motivation for the spec and deltas to
the HTML specification to allow serialization and parsing of the
<template> element. To be honest, after finishing the draft, I
wondered if we should just merge the whole thing into the HTML

As a warm-up exercise for the draft, I first implemented the changes
to the tree construction algorithm here in WebKit
(https://bugs.webkit.org/show_bug.cgi?id=78734). The patch
includes new parsing tests, and should be fairly intuitive to read for
those familiar with the test format.

The interesting bit here is that all parser changes are additive: we
are only adding what are effectively extension points -- well, that
and a new contextless parsing mode for when inside of the <template>

> The violation of the Degrade Gracefully principle and tearing the
> parser spec open right when everybody converged on the spec worry me,
> though. I'm still hoping for a design that doesn't require parser
> changes at all and that doesn't blow up in legacy browsers (even
> better if the results in legacy browsers were sane enough to serve as
> input for a polyfill).

I agree with your concern. It's bugging me too -- that's why I am not
being an arrogant jerk yelling at people and trying to shove this
through. In general, it's difficult to justify making changes to
anything that's stable -- especially considering how long and painful
the road to getting stable was. However, folks like Yehuda, Erik, and
Rafael spent years tackling this problem, and I tend to trust their
steady hand... hands?

Received on Monday, 2 April 2012 22:21:37 UTC
