Re: Proposal: Document.parse() [AKA: Implied Context Parsing]

On Fri, May 25, 2012 at 12:32 AM, Simon Pieters <simonp@opera.com> wrote:
> On Fri, 25 May 2012 09:01:43 +0200, Rafael Weinstein <rafaelw@google.com>
> wrote:
>
>> Ok, so from consensus on earlier threads, here's the full API & semantics.
>>
>> Now's the time to raise objections to UAs adding support for this
>> feature.
>>
>> -----
>>
>> 1) The Document interface is extended to include a new method:
>>
>> DocumentFragment parse (DOMString markup);
>>
>> which:
>> -Invokes the fragment parsing algorithm with markup and an empty
>> context element,
>> -Unmarks all scripts in the returned fragment node as "already started",
>> -Returns the fragment node
>>
>> 2) The fragment parsing algorithm's context element is now optional.
>>
>> Its behavior is similar to the case of a known context element, but
>> the tokenizer is simply set to the data state.
>>
>> 3) Resetting the insertion mode appropriately now sets the mode to
>> "Implied Context" if parsing a fragment and no context element is set,
>> and aborts.
>>
>> 4) A new "Implied Context" insertion mode is defined which
>>
>> -Ignores doctype, end tag tokens
>> -Handles comment & character tokens as if "in body"
>> -Handles the following start tags as if "in body" (which is as if "in
>> head"): <style>, <script>, <link>, <meta>
>> -Handles any other start tag by selecting a context element, resetting
>> the insertion mode appropriately and reprocessing the token.
>>
>> 5) A new "selecting a context element" algorithm is defined which
>> takes a start tag as input and outputs an element. The element's
>> identity is as follows:
>>
>> -If start tag is tbody, thead, tfoot, caption or colgroup
>>  return <table>
>> -if start tag is tr,
>>  return <tbody>
>> -if start tag is col
>>  return <colgroup>
>> -if start tag is td or th
>>  return <tr>
>> -if start tag is head or body
>>  return <html>
>> -if start tag is rp or rt
>>  return <ruby>
>
>
> I think <ruby> is better handled by always making <rp> and <rt> generate
> implied end tags in the fragment case (maybe even when parsing normally,
> too). Making the context element <ruby> still doesn't make <rt> parse right,
> because the spec currently looks for ruby on the *stack* (and the context
> element isn't on the stack).
>
> Also, the ruby base is allowed to include markup, so this would fail:
>
> ruby.appendChild(document.parse('<span>foo</span><rt>bar<rt>baz'));
>
>
>
>> -if start tag is a defined SVG localName (case insensitive)
>>  return <svg>
>
>
> Except those that conflict with HTML?

Yes. Thank you. Item 5 should be:

5) A new "selecting a context element" algorithm is defined which
takes a start tag as input and outputs an element. The element's
identity is as follows:

-If start tag is tbody, thead, tfoot, caption or colgroup
 return <table>
-if start tag is tr,
 return <tbody>
-if start tag is col
 return <colgroup>
-if start tag is td or th
 return <tr>
-if start tag is head or body
 return <html>
-if start tag is rp or rt
 return <ruby>

-if start tag is a defined HTML localName (case insensitive)
 return <body>

-if start tag is a defined SVG localName (case insensitive)
 return <svg>

-if start tag is a defined MathML localName (case insensitive)
 return <math>

-otherwise, return <body>
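
For illustration only, here is a rough JavaScript sketch of the lookup
order in item 5. The function name selectContextElementName and the
three name sets are hypothetical, and the sets are tiny placeholders,
not the real HTML, SVG or MathML element lists. It returns the context
element's tag name rather than an element, and it assumes the HTML
check runs before the SVG and MathML checks so that conflicting names
such as "a" and "title" resolve to <body>.

```javascript
// Hypothetical sketch of "selecting a context element" (item 5).
// These sets are small illustrative subsets, NOT the full element lists.
const HTML_NAMES = new Set(['a', 'div', 'p', 'span', 'title']);
const SVG_NAMES = new Set(['a', 'circle', 'foreignobject', 'g', 'title']);
const MATHML_NAMES = new Set(['mfrac', 'mi', 'mn', 'mo', 'mrow']);

// Start tags whose context element is fixed by the first six rules.
const FIXED_CONTEXTS = new Map([
  ['tbody', 'table'],
  ['thead', 'table'],
  ['tfoot', 'table'],
  ['caption', 'table'],
  ['colgroup', 'table'],
  ['tr', 'tbody'],
  ['col', 'colgroup'],
  ['td', 'tr'],
  ['th', 'tr'],
  ['head', 'html'],
  ['body', 'html'],
  ['rp', 'ruby'],
  ['rt', 'ruby'],
]);

// Takes a start tag name, returns the tag name of the context element.
function selectContextElementName(startTagName) {
  // All comparisons are case insensitive, so normalize once.
  const name = startTagName.toLowerCase();

  if (FIXED_CONTEXTS.has(name)) {
    return FIXED_CONTEXTS.get(name);
  }
  // HTML is checked first, so names shared with SVG stay in HTML.
  if (HTML_NAMES.has(name)) {
    return 'body';
  }
  if (SVG_NAMES.has(name)) {
    return 'svg';
  }
  if (MATHML_NAMES.has(name)) {
    return 'math';
  }
  // Unknown names fall back to <body>.
  return 'body';
}
```

With these placeholder sets, selectContextElementName('TH') gives 'tr',
'circle' gives 'svg', and 'a' gives 'body' because the HTML rule wins
over the SVG one.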


>
>
>> -if start tag is a defined MathML localName (case insensitive)
>>  return <math>
>
>
> (Making the context element svg or math doesn't do anything currently:
> https://www.w3.org/Bugs/Public/show_bug.cgi?id=16635 )
>
>> -otherwise, return <body>
>
>
>
> --
> Simon Pieters
> Opera Software

Received on Friday, 25 May 2012 12:35:34 UTC