Re: [webcomponents] Template element parser changes => Proposal for adding DocumentFragment.innerHTML from Tab Atkins Jr. on 2012-05-11 (public-webapps@w3.org from April to June 2012)

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Fri, 11 May 2012 12:09:34 +0200
To: Henri Sivonen <hsivonen@iki.fi>
Cc: Rafael Weinstein <rafaelw@google.com>, Webapps WG <public-webapps@w3.org>, Yehuda Katz <wycats@gmail.com>
Message-ID: <CAAWBYDDzijHhJu5dSJuHKeR88J0TQ2SZzseJDFTZsyr4A9oTyg@mail.gmail.com>
On Fri, May 11, 2012 at 10:55 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
> On Wed, May 9, 2012 at 7:45 PM, Rafael Weinstein <rafaelw@google.com> wrote:
>> I'm very much of a like mind with Henri here, in that I'm frustrated
>> with the situation we're currently in WRT SVG & MathML & parsing
>> foreign content in HTML, etc... In particular, I'm tempted to feel
>> like SVG and MathML made this bed for themselves and they should now
>> have to sleep in it.
>
> I think that characterization is unfair to MathML.  The math working
> group tried hard to avoid local name collisions with HTML.  They
> didn't want to play namespace games.  As I understand it, they were
> forced into a different namespace by W3C strategy tax arising from the
> "NAMESPACE ALL THE THINGS!" attitude.
>
> SVG is the language that introduced collisions with both HTML and
> MathML and threw unconventional camel casing into the mix.

Tangent: It appears to be a very difficult uphill battle to get SVG
into the HTML namespace or un-namespaced entirely, but how difficult
might it be to do that for MathML?  We can branch this to a separate
thread if the answer is anything other than "Don't even think about
it."

> On Fri, May 11, 2012 at 1:44 AM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
>> The innerHTML API is convenient.  It lets you set the entire
>> descendant tree of an element, creating elements and giving them
>> attributes, in a single call, using the same syntax you'd use if you
>> were writing it in HTML (modulo some extra quote-escaping maybe).
>
> I'm less worried about magic in an API that's meant for representing
> tree literals in JavaScript as a sort of E4H without changing the
> JavaScript itself than I am about magic in APIs that are meant for
> parsing arbitrary potentially user-supplied content.

I'm not sure what you mean by the latter.  If this translates as
"improving innerHTML is okay with me", then cool.  ^_^


> If we are designing an API for the former case rather than the latter
> case, I'm OK with the following magic:
>  * Up until the first start tag, the parser behaves as in "in body" (Tough
> luck if you want to use <![CDATA[  or U+0000 before the first tag,
> though I could be convinced that the parser should start in a mode
> that enables <![CDATA[.)

Ah, so this would mean you just *always* emit the characters before
the first start tag?  That seems perfectly acceptable.  In the
contexts where text-before-the-tag is valid, that's exactly what you
want to do.  In the contexts where it's not, you're doing something
wrong anyway, and there's only so much reasonable fixup we can do.
Might as well just emit the text.

Don't know why we didn't think of that earlier.


>  * if the first start tag is any MathML 3 element name except "set" or
> "image", start behaving as if setting innerHTML on <math> (details of
> that TBD) before processing the start tag token further and then
> continue to behave like when setting innerHTML on <math>.
>  * otherwise, if the first start tag is any SVG 1.1 element name
> except "script", "style", "font" or "a", start behaving as if setting
> innerHTML on <svg> (details of that TBD) before processing the start
> tag token further and then continue to behave like when setting
> innerHTML on <svg>.
>  * otherwise, set the insertion mode per HTML-centric <template> rules
> proposed so far.

I think that SVG should get priority over MathML on the conflicts -
<set> and especially <image> are useful SVG elements that can
reasonably be the first tag of a fragment.  It seems more useful to
get an svg fragment for an <image> than a mathml fragment.
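
To make the ordering concrete, here's a rough sketch of the selection
step as a pure function over the first start tag's name, with SVG
checked before MathML.  The name sets are tiny illustrative subsets,
not the full SVG 1.1 / MathML 3 lists, and pickFragmentContext is a
made-up name.  The four names Henri excludes are simply resolved to
HTML here, which is an assumption rather than a settled answer for <a>.

```javascript
// Illustrative subsets only; real lists would come from SVG 1.1 and MathML 3.
const SVG_NAMES = new Set(["svg", "g", "circle", "rect", "path", "image", "set"]);
const MATHML_NAMES = new Set(["math", "mi", "mo", "mn", "mrow", "mfrac", "set", "image"]);
// Names that collide with HTML; this sketch resolves them in HTML's favor.
const HTML_WINS = new Set(["script", "style", "font", "a"]);

// Hypothetical helper: returns "svg", "math" or "html" for the first tag name.
function pickFragmentContext(firstTagName) {
  const name = firstTagName.toLowerCase();
  if (HTML_WINS.has(name)) return "html";
  if (SVG_NAMES.has(name)) return "svg";      // SVG gets priority on <set> and <image>
  if (MATHML_NAMES.has(name)) return "math";
  return "html";                              // fall back to HTML-centric <template> rules
}
```

With this ordering, "<image ...>" and "<set ...>" select the svg
context, while "<mfrac>" still selects math.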


> Open question: Should it be possible to use a magic attribute on the
> first tag token to disambiguate it as MathML or SVG? xmlns="..." would
> be an obvious disambiguator, but the values are unwieldy.  Should
> xlink:href be used as a disambiguator for <a>? If the use case is
> putting tree literals in code, it probably doesn't make sense to use
> <script> or <style> (either HTML or SVG) in that kind of context
> anyway. And SVG <font> has been rejected by Mozilla and Microsoft
> anyway.

SVG is chucking xlink, so that won't work.  In SVG2, <a> will just use
@href.  That conflict is just going to remain hard to resolve.

<font> is less troublesome.  Based on discussion in the SVGWG, we're
okay with resolving the conflict always in HTML's favor.

If others disagree with this, using attributes to disambiguate should
be *really* easy.  The only overlap in attributes is @id, @class, and
@color.  It should definitely be okay to resolve ambiguous cases like
"<font color=red>..." in HTML's favor.

If we did want to use a magic attribute, I suggest @svg and @math as
boolean attributes with no effect other than this disambiguation (and
then just resolving all conflicts in HTML's favor by default).
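
A minimal sketch of how that could work for a first tag whose name
is ambiguous (say <a> or <font>).  resolveAmbiguousTag is a
hypothetical name, and letting @svg win when both attributes are
present is an arbitrary choice made only for this sketch.

```javascript
// Hypothetical helper: given the attribute names on an ambiguous first tag,
// decide which parsing context to use.
function resolveAmbiguousTag(attrNames) {
  const attrs = new Set(attrNames.map((n) => n.toLowerCase()));
  if (attrs.has("svg")) return "svg";   // boolean @svg attribute present
  if (attrs.has("math")) return "math"; // boolean @math attribute present
  return "html";                        // default: conflicts go to HTML
}
```

So "<a svg href=...>" would parse as an SVG link, while a bare
"<a href=...>" or "<font color=red>" stays HTML.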


> I still think that having to create a DocumentFragment first and then
> set innerHTML on it is inconvenient and we should have a method on
> document that takes a string to parse and returns the resulting
> DocumentFragment, e.g. document.parse(string) to keep it short.

I agree.  We should definitely *have* DocumentFragment.innerHTML, but
we should also have a shorter way to parse a string straight into a
DocumentFragment, like the document.parse() you suggest.
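
Roughly, the shorthand would just wrap the two-step version.  This
sketch assumes the DocumentFragment.innerHTML setter proposed in
this thread exists, which it doesn't today: assigning to it now
only creates a plain property and parses nothing.  parseToFragment
is a hypothetical name, and the document is passed in explicitly.

```javascript
// Hypothetical helper standing in for the suggested document.parse(string).
function parseToFragment(doc, markup) {
  const fragment = doc.createDocumentFragment();
  // Proposed setter; under the proposal this would parse markup into child nodes.
  fragment.innerHTML = markup;
  return fragment;
}
```

Usage would be something like
parseToFragment(document, "<li>one</li><li>two</li>").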

~TJ
Received on Friday, 11 May 2012 10:10:27 UTC