Re: Implied Context Parsing (DocumentFragment.innerHTML, or similar) proposal details to be sorted out

On Wed, May 16, 2012 at 4:52 PM, Rafael Weinstein <rafaelw@google.com> wrote:
> On Wed, May 16, 2012 at 4:49 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> On Wed, May 16, 2012 at 4:29 PM, Rafael Weinstein <rafaelw@google.com> wrote:
>>> Ok. I think I'm convinced on all points.
>>>
>>> I've uploaded a WebKit patch which implements what we've agreed on here:
>>>
>>> https://bugs.webkit.org/show_bug.cgi?id=84646
>>>
>>> I'm happy to report that this patch is nicer than the queued-token
>>> approach. Good call, Henri.
>>>
>>> On Tue, May 15, 2012 at 9:39 PM, Yehuda Katz <wycats@gmail.com> wrote:
>>>>
>>>> Yehuda Katz
>>>> (ph) 718.877.1325
>>>>
>>>>
>>>> On Tue, May 15, 2012 at 6:46 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
>>>>>
>>>>> On Fri, May 11, 2012 at 10:04 PM, Rafael Weinstein <rafaelw@google.com>
>>>>> wrote:
>>>>> > Issue 1: How to handle tokens which precede the first start tag
>>>>> >
>>>>> > Options:
>>>>> > a) Queue them, and then later run them through tree construction once
>>>>> > the implied context element has been picked
>>>>> >
>>>>> > b) Create a new insertion mode like "waiting for context element", which
>>>>> > probably ignores end tags and doctype and inserts character tokens and
>>>>> > comments. Once the implied context element is picked, reset the
>>>>> > insertion mode appropriately, and proceed normally.
>>>>>
>>>>> I prefer b).
>>>>
>>>>
>>>> I like b as well. I assume it means that the "waiting for context element"
>>>> insertion mode would keep scanning until the ambiguity was resolved, and
>>>> then enter the appropriate insertion mode. Am I misunderstanding?
>>>
>>> I think what Yehuda is getting at here is that there are a handful of
>>> tags which are allowed to appear anywhere, so it doesn't make sense to
>>> "resolve the ambiguity" based on their identity.
>>>
>>> I talked with Tab about this, and happily, that set seems to be
>>> <style>, <script>, <meta>, & <link>. That is convenient, because it
>>> means the new "ImpliedContext" insertion mode can handle start tags
>>> as follows (code from the above patch):
>>>
>>> if (token.name() == styleTag
>>>    || token.name() == scriptTag
>>>    || token.name() == metaTag
>>>    || token.name() == linkTag) {
>>>    // "process following the rules for the "in head" insertion mode"
>>>    processStartTagForInHead(token);
>>>    return;
>>> }
>>>
>>> // "set the context element"
>>> m_fragmentContext.setContextTag(getImpliedContextTag(token.name()));
>>> // "reset the insertion mode appropriately"
>>> resetInsertionModeAppropriately();
>>> // "reprocess the token"
>>> processStartTag(token);
>>
>> So if I understand things correctly, that would mean that:
>>
>> document.parse("parsed as text<script>parsed as script content</script>" +
>>                "<tr><td>table content</td></tr>");
>>
>> would return a fragment like:
>> #fragment
>>  #text "parsed as text"
>>  script
>>    #text "parsed as script content"
>>  tr
>>    td
>>      #text "table content"
>>
>> Is this correct? In particular, are the contents of the <script>
>> element parsed according to the rules that normally apply when
>> parsing scripts?
>>
>> (That of course leaves the terrible situation that <script> parsing is
>> vastly different in HTML and SVG, but that's a bad problem that
>> already exists)
>
> Yes. Exactly.
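
For anyone skimming the thread, the behaviour agreed above can be sketched
in JavaScript roughly as follows. This is only an illustration, not the
WebKit patch: the token shape, the function name pickImpliedContext, the
tag-to-context table, and the "body" fallback are all assumptions made up
for the example.

```javascript
// Hypothetical token shape (an assumption for this sketch):
//   { type: "text" | "comment" | "startTag" | "endTag" | "doctype", name?: string }

// Start tags that are processed with the "in head" rules and therefore
// never resolve the ambiguity about the context element.
const CONTEXT_NEUTRAL_TAGS = new Set(["style", "script", "meta", "link"]);

// Assumed mapping from the first "real" start tag to an implied context
// element. Illustrative only; the real table is whatever the proposal defines.
const IMPLIED_CONTEXT = new Map([
  ["tr", "tbody"],
  ["td", "tr"],
  ["th", "tr"],
  ["tbody", "table"],
  ["thead", "table"],
  ["tfoot", "table"],
]);

// Returns the implied context tag name for a token stream, or null if the
// stream ends before any start tag resolves the ambiguity.
function pickImpliedContext(tokens) {
  for (const token of tokens) {
    if (token.type !== "startTag") {
      // Text and comments get inserted; end tags and doctypes are ignored.
      continue;
    }
    if (CONTEXT_NEUTRAL_TAGS.has(token.name)) {
      // Handled via the "in head" rules; keep waiting for a context.
      continue;
    }
    // Assumed fallback: anything not in the table implies a <body> context.
    return IMPLIED_CONTEXT.has(token.name)
      ? IMPLIED_CONTEXT.get(token.name)
      : "body";
  }
  return null;
}
```

Fed the tokens for the document.parse() example quoted above (text, the
<script> start tag, text, the </script> end tag, then the <tr> start tag),
this sketch skips past the script and picks "tbody" from the <tr>.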

That leaves the question of whether the contents of the <script> should
be parsed as an HTML script or an SVG script. The same question applies
to <style>.

Of course, ideally we would make the two parse the same way, but so
far I've not been successful in convincing people here that that's a
good idea.
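
To make the difference concrete, here is a rough JavaScript sketch of the
tokenizer state that applies to the element's content in each case, as I
understand the HTML parsing algorithm. The function name contentStateFor
is hypothetical; the state names are the ones the HTML spec uses.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";

// Returns the tokenizer state used for the content that follows a <script>
// or <style> start tag, depending on the namespace of the element.
function contentStateFor(namespace, tagName) {
  if (namespace === HTML_NS) {
    // HTML <script>: content is script data, so "<" does not open a tag
    // and character references are not decoded.
    if (tagName === "script") return "script data";
    // HTML <style>: content is raw text.
    if (tagName === "style") return "RAWTEXT";
  }
  // Foreign elements (e.g. SVG) do not switch the tokenizer: "<" still
  // starts a tag and character references are still decoded.
  return "data";
}
```

So contentStateFor(HTML_NS, "script") gives "script data", while
contentStateFor(SVG_NS, "script") stays in "data", which is why the same
source text can produce different trees.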

/ Jonas

Received on Wednesday, 16 May 2012 23:55:49 UTC