Re: [webcomponents] HTML Parsing and the <template> element from Ryosuke Niwa on 2012-02-08 (public-webapps@w3.org from January to March 2012)

From: Ryosuke Niwa <rniwa@webkit.org>
Date: Wed, 8 Feb 2012 14:41:37 -0800
To: Dimitri Glazkov <dglazkov@chromium.org>
Cc: public-webapps <public-webapps@w3.org>, Henri Sivonen <hsivonen@iki.fi>, Adam Barth <w3c@adambarth.com>, Ian Hickson <ian@hixie.ch>, Rafael Weinstein <rafaelw@google.com>
Message-ID: <CABNRm61CVC4XOkN74x_mR1Tb=p7NN2XzD0tE4=VTcUibi+axgg@mail.gmail.com>

On Wed, Feb 8, 2012 at 2:00 PM, Dimitri Glazkov <dglazkov@chromium.org> wrote:

> == IDEA 1: Keep template contents parsing in the tokenizer ==
>
> PRO: if we could come up with a way to perceive the stuff between
> <template> and </template> as a character stream, we enable a set of
> use cases where the template contents does not need to be a complete
> HTML subtree. For example, I could define a template that sets up a
> start of a table, then a few that provide repetition patterns for
> rows/cells, and then one to close out a table:
>
> <template id="head"><table><caption>Nyan-nyan</caption><thead> ...
> <tbody></template>
> <template id="row"><tr><template><td> ... </td></template></tr></template>
> <template id="foot"></tbody></table></template>
>
> Then I could slam these templates together with some API and produce
> an arbitrary set of tables.
>

But that could be done with the second approach as well, right? All you need
to do is replace "..." with <span class="placeholder"></span>, and then
replace that element later through some API.
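
To make the placeholder idea concrete, here is a minimal sketch. It assumes
the template is a complete subtree, and it uses Python's ElementTree as a
stand-in for the DOM, so HTML-specific rules such as foster-parenting are not
modelled. The instantiate() helper and the TEMPLATE string are hypothetical,
not part of any proposed API.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical template: a complete table subtree whose variable part is
# marked with a placeholder element instead of being left open-ended.
TEMPLATE = (
    '<table><caption>Nyan-nyan</caption>'
    '<tbody><span class="placeholder"></span></tbody></table>'
)


def instantiate(template_root, rows):
    """Clone the template and swap its placeholder for one <tr> per row."""
    root = copy.deepcopy(template_root)
    # ElementTree has no parent pointers, so look for the parent of the
    # placeholder rather than for the placeholder itself.
    for parent in root.iter():
        for index, child in enumerate(list(parent)):
            if child.tag == "span" and child.get("class") == "placeholder":
                parent.remove(child)
                for offset, cells in enumerate(rows):
                    tr = ET.Element("tr")
                    for text in cells:
                        ET.SubElement(tr, "td").text = text
                    parent.insert(index + offset, tr)
                return root
    raise ValueError("template has no placeholder element")


template_root = ET.fromstring(TEMPLATE)
table = instantiate(template_root, [["a", "b"], ["c", "d"]])
markup = ET.tostring(table, encoding="unicode")
print(markup)
```

The template itself is left untouched, so it can be instantiated again with
different rows.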


> CON: Tokenizer needs to be really smart and will start looking a lot
> like a specialized parser. At first glance, <template> behaves much
> like a <textarea> -- any tags inside will just be treated as
> characters. It works until you realize that templates sometimes need
> to be nested. Any use case that involves building a
> larger-than-one-dimensional data representation (like tables) will
> involve nested templates.


I think we should first discuss and agree on whether we want nested
template elements at all and, if so, how they should behave.
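
As a rough illustration of the nesting problem quoted above, the sketch below
compares a textarea-like rule (contents end at the first </template>) with a
depth-counting rule. Both functions are hypothetical helpers written for this
example; neither is how any proposal specifies tokenization, and the regular
expression ignores comments, attribute values containing ">", and similar
edge cases.

```python
import re

# Nested example in the style of the "row" template quoted above.
SOURCE = (
    '<template id="row"><tr><template><td> ... </td></template></tr></template>'
)

# Matches <template ...> start tags and </template> end tags.
TAG = re.compile(r"<(/?)template\b[^>]*>", re.IGNORECASE)


def naive_contents(source):
    """Textarea-like rule: contents end at the first </template>."""
    start = source.index(">") + 1
    end = source.index("</template>", start)
    return source[start:end]


def nested_contents(source):
    """Depth-counting rule: contents end at the matching </template>."""
    depth = 0
    start = None
    for match in TAG.finditer(source):
        if match.group(1) == "":  # start tag
            depth += 1
            if depth == 1:
                start = match.end()
        else:  # end tag
            depth -= 1
            if depth == 0:
                return source[start:match.start()]
    raise ValueError("unbalanced <template> tags")


print(naive_contents(SOURCE))   # cut short at the inner </template>
print(nested_contents(SOURCE))  # includes the whole inner template
```

The naive rule drops the inner </template> and the closing </tr>, which is
why the behavior of nested templates needs to be decided up front.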

> It could be argued that--while pursuing the tokenizer algorithm
> perfection--we could just stop at some point of complexity and issue a
> stern warning for developers to not get too crazy, because stuff will
> break -- akin to including "</script>" string in your Javascript code.
>

I don't think we want to introduce a new variant of </script>. It's way too
complicated as is.

> PRO: It's a lot less intrusive to the parser -- just adjust insertion
> modes to allow <template> tags in places where they would ordinary be
> ignored or foster-parented, and add a new insertion for template
> contents to let all tags in. I made a quick sketch here:
>
> http://dvcs.w3.org/hg/webcomponents/raw-file/c96f051ca008/spec/templates/index.html#parsing
> (Note: more massaging is needed to make it really work)
>
> CON: You can't address fun partial-tree scenarios.
>

Could you elaborate on this point? This approach seems much more manageable
to implement, and its behavior will be much less surprising.

- Ryosuke

Received on Wednesday, 8 February 2012 22:42:28 UTC