Re: [whatwg] Another issue in 12.2.5.5 parsing tokens in foreign content from Ian Hickson on 2013-07-04 (public-whatwg-archive@w3.org from July 2013)

From: Ian Hickson <ian@hixie.ch>
Date: Thu, 4 Jul 2013 00:45:38 +0000 (UTC)
To: Michael Day <mikeday@yeslogic.com>
Cc: whatwg@whatwg.org, Adam Barth <w3c@adambarth.com>
Message-ID: <Pine.LNX.4.64.1307040040360.20404@ps20323.dreamhostps.com>

On Thu, 4 Jul 2013, Michael Day wrote:
> > 
> > We don't have any data that says that we need to support this for 
> > innerHTML. I think it's a win if we can drop the hack from innerHTML.
> 
> Okay, so allowing some HTML elements to break out of foreign content is 
> a hack added for historical reasons, that will surprise authors and 
> complicate implementations and is thus regrettable, but necessary.
> 
> Then there are two possibilities for fragment parsing:
> 
> (1) The hack can be left out of fragment parsing, as there is no 
> historical justification for it. Since the hack is bad, removing it from 
> as many situations as possible is good.
> 
> (2) The hack can apply to fragment parsing in the same way as it applies 
> to regular parsing. This makes parsing behaviour more consistent across 
> different situations, which is good.
> 
> I'm strongly in favour of (2), as it seems that omitting the hack from 
> some rare situations doesn't save authors any trouble, and doesn't 
> follow the principle of least surprise.

The problem is that we can't do (2) in _all_ cases. For example, setting 
innerHTML on an <svg> can't possibly break out of the <svg> when it sees 
one of these tags, since the <svg> is the "root" of what is being parsed.

Given that, it's not clear that (2) is better than (1). (I agree that if 
we could actually always be consistent, it would be.)
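
To make the "root" problem concrete, here is a rough sketch of the "pop 
until we are back in HTML" step. It is a toy model under stated 
assumptions, not the spec algorithm: the stack is a plain array with the 
root of what is being parsed at index 0, the tag list is a small 
illustrative subset, and BREAKOUT_TAGS, popForBreakout and the 
{ name, ns } records are all names invented for this example.

```javascript
// Illustrative subset only; the real list of break-out tags is longer.
const BREAKOUT_TAGS = new Set(['b', 'div', 'p', 'table']);

// Hypothetical helper. `stack` is a non-empty array of { name, ns } records;
// index 0 is the root of what is being parsed and is never popped.
function popForBreakout(stack, tagName) {
  const result = stack.slice(); // leave the caller's array untouched
  if (!BREAKOUT_TAGS.has(tagName)) {
    return { stack: result, brokeOut: false };
  }
  // Pop foreign elements until the current node is an HTML element,
  // stopping at the root because there is nothing above it to return to.
  while (result.length > 1 && result[result.length - 1].ns !== 'html') {
    result.pop();
  }
  // True only if the current node ended up in the HTML namespace.
  const brokeOut = result[result.length - 1].ns === 'html';
  return { stack: result, brokeOut };
}

// Regular parsing: the <svg> sits inside HTML, so a <div> can escape it.
const inPage = popForBreakout(
  [{ name: 'html', ns: 'html' }, { name: 'body', ns: 'html' },
   { name: 'svg', ns: 'svg' }, { name: 'g', ns: 'svg' }],
  'div');

// Fragment parsing rooted at the <svg>: popping stops at the root, which
// is still foreign, so the <div> has nowhere to break out to.
const inFragment = popForBreakout(
  [{ name: 'svg', ns: 'svg' }, { name: 'g', ns: 'svg' }],
  'div');

console.log(inPage.brokeOut, inFragment.brokeOut);
```

In this model the first call ends with the <body> as the current node, 
while the second is stuck on the <svg> root, which is the case where (2) 
cannot be applied.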

Note that this isn't the only place where fragment parsing and regular 
parsing give different results.

   <table>
    <div>
   </table>

...and:

   document.createElement('table').innerHTML = '<div>';

...result in very different DOMs (in the first, the <div> and the 
<table> are siblings; in the second, the <div> is a child of the <table>).
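
For anyone who wants to check the two cases above in a browser console, 
something along these lines should do. It is only a sketch: relation() is 
a helper name made up for this example, and using DOMParser to stand in 
for regular document parsing is my assumption about the simplest way to 
reproduce the first case. The DOM part is skipped where no DOM exists.

```javascript
// Hypothetical helper: classify how `div` relates to `table` using only
// parentNode links, so it also works on plain objects.
function relation(table, div) {
  if (div.parentNode === table) return 'child';
  if (div.parentNode != null && div.parentNode === table.parentNode) {
    return 'sibling';
  }
  return 'other';
}

// Browser-only part; skipped in environments without a DOM.
if (typeof DOMParser !== 'undefined' && typeof document !== 'undefined') {
  // Case 1: the markup parsed as (part of) a whole document.
  const doc = new DOMParser().parseFromString('<table><div></table>', 'text/html');
  console.log('document:', relation(doc.querySelector('table'), doc.querySelector('div')));

  // Case 2: the same <div> parsed as a fragment with the table as context.
  const table = document.createElement('table');
  table.innerHTML = '<div>';
  console.log('fragment:', relation(table, table.querySelector('div')));
}
```

Going by the description above, the first line should report a sibling 
and the second a child.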


> In an ideal world it would be possible to grab any subsection of a 
> document, parse that in isolation as a fragment, and get the same result 
> as if it was parsed in its original document context. This is possible 
> in XML, but not HTML, due to the existing "author-friendly" hacks, and 
> making the parsing behaviour even more context sensitive doesn't seem 
> like a good thing.

I think we're _so_ far beyond this ideal world that I'm not sure it's 
worth even looking for it, to be honest. :-)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Thursday, 4 July 2013 00:46:03 UTC