In body other end tag handling convoluted to get at most one error from Henri Sivonen on 2008-04-04 (public-html@w3.org from April 2008)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Fri, 4 Apr 2008 11:34:10 +0300
To: HTML WG <public-html@w3.org>
Message-Id: <9E766843-52B0-45EF-B02A-4F2DD69E34CD@iki.fi>

> An end tag token not covered by the previous entries
>
>     Run the following algorithm:
>
>        1. Initialise node to be the current node (the bottommost  
> node of the stack).
>        2. If node has the same tag name as the end tag token, then:
>              1. Generate implied end tags.
>              2. If the tag name of the end tag token does not match  
> the tag name of the current node, this is a parse error.
>              3. Pop all the nodes from the current node up to node,  
> including node, then stop this algorithm.
>        3. Otherwise, if node is in neither the formatting category  
> nor the phrasing category, then this is a parse error. Stop this  
> algorithm. The end tag token is ignored.
>        4. Set node to the previous entry in the stack of open  
> elements.
>        5. Return to step 2.
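
The quoted steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the error counter, and the two category sets are hypothetical stand-ins, not the spec's real element categories or any parser's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class SearchThenPopSketch {
    // Hypothetical stand-ins; the real category lists are much longer.
    static final Set<String> IMPLIED_END = Set.of("dd", "dt", "li", "p");
    static final Set<String> FORMATTING_OR_PHRASING = Set.of("a", "b", "i", "em", "span");

    /** Handles an end tag; returns the number of parse errors (0 or 1). */
    static int endTag(List<String> stack, String name) {
        // Steps 1, 4, 5: walk from the current node towards the root.
        for (int i = stack.size() - 1; i >= 0; i--) {
            String node = stack.get(i);
            if (node.equals(name)) {
                // Step 2.1: generate implied end tags. The index guard is a
                // simplification so that node itself is never popped here.
                while (stack.size() - 1 > i
                        && IMPLIED_END.contains(stack.get(stack.size() - 1))) {
                    stack.remove(stack.size() - 1);
                }
                // Step 2.2: one error if the current node still differs.
                int errors = stack.get(stack.size() - 1).equals(name) ? 0 : 1;
                // Step 2.3: batch-pop everything from the current node up to node.
                stack.subList(i, stack.size()).clear();
                return errors;
            }
            // Step 3: stray end tag; report once and ignore the token.
            if (!FORMATTING_OR_PHRASING.contains(node)) {
                return 1;
            }
        }
        return 1; // assumption: an exhausted stack is treated as a stray end tag
    }
}
```

With the stack html, body, span, b, i and an end tag for span, this sketch counts a single error even though two elements (b and i) are left unclosed.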

I think the above formulation is confusing, because it runs through
complicated steps in the simple case: when the node on the stack indeed
matches the token. It seems weird to generate implied end tags if the
node has the same name as the end tag token, until you realize you
are supposed to reach a point where node isn't the top (a.k.a. bottom)
of the stack.

It seems to me that the whole purpose of the complication (searching
the stack first and then batch-popping instead of popping as the search
proceeds) is to give one error about a premature end tag instead of
giving many errors, one for each unclosed element.

I've implemented the latter.

>                         if (isCurrent(name)) {
>                             pop();
>                             return;
>                         }
>                         for(;;) {
>                             generateImpliedEndTags();
>                             if (isCurrent(name)) {
>                                 pop();
>                                 return;
>                             }
>                             StackNode<T> node = stack[currentPtr];
>                             if (!(node.scoping || node.special)) {
>                                 err("Unclosed element \u201C" + node.name
>                                         + "\u201D.");
>                                 pop();
>                             } else {
>                                 err("Stray end tag \u201C" + name
>                                         + "\u201D.");
>                                 return;
>                             }
>                         }
>


It seems to me that in order to convert this to emit at most one  
error, a boolean willWhine flag would make the algorithm clearer than  
what the spec has now.
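
A minimal sketch of what such a conversion might look like, keeping the pop-as-you-search loop quoted above. Only willWhine, isCurrent, pop, generateImpliedEndTags, and the error strings echo names from this message; the class, the String-based stack, the errors list, the combined error message, and the two category sets are assumptions made so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class WillWhineSketch {
    // Hypothetical stand-ins for the real element categories.
    static final Set<String> IMPLIED_END = Set.of("dd", "dt", "li", "p");
    static final Set<String> SCOPING_OR_SPECIAL = Set.of("html", "body", "div", "table", "td");

    final List<String> stack = new ArrayList<>();
    final List<String> errors = new ArrayList<>();

    WillWhineSketch(String... open) {
        stack.addAll(List.of(open));
    }

    boolean isCurrent(String name) {
        return !stack.isEmpty() && stack.get(stack.size() - 1).equals(name);
    }

    void pop() {
        stack.remove(stack.size() - 1);
    }

    void generateImpliedEndTags() {
        while (!stack.isEmpty() && IMPLIED_END.contains(stack.get(stack.size() - 1))) {
            pop();
        }
    }

    void endTag(String name) {
        // Simple case first: the current node matches, nothing to report.
        if (isCurrent(name)) {
            pop();
            return;
        }
        boolean willWhine = false; // set once any unclosed element is popped
        while (!stack.isEmpty()) {
            generateImpliedEndTags();
            if (isCurrent(name)) {
                if (willWhine) {
                    // One error for the whole batch of unclosed elements.
                    errors.add("Unclosed elements before end tag \u201C" + name + "\u201D.");
                }
                pop();
                return;
            }
            if (stack.isEmpty()) {
                break;
            }
            String node = stack.get(stack.size() - 1);
            if (!SCOPING_OR_SPECIAL.contains(node)) {
                willWhine = true; // remember the problem instead of reporting now
                pop();
            } else {
                errors.add("Stray end tag \u201C" + name + "\u201D.");
                return;
            }
        }
        // Assumption: running out of stack also counts as a stray end tag.
        errors.add("Stray end tag \u201C" + name + "\u201D.");
    }
}
```

Each call to endTag adds at most one entry to errors. Like the loop it is derived from, the sketch pops while it searches, so in the stray case some elements have already been popped by the time the error is reported.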

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Friday, 4 April 2008 08:34:51 UTC