Non-sensical catch-all handling of end tags in "in body" (detailed review of parsing algorithm)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Thu, 5 Jul 2007 13:03:13 +0300
Message-Id: <711A4DE4-D7B1-42E8-8C00-E45B81D5CB93@iki.fi>
To: "public-html@w3.org WG" <public-html@w3.org>

(This is part of my detailed review of the parsing algorithm.)

The spec says:
> An end tag token not covered by the previous entries
>
>     Run the following algorithm:
>
>        1. Initialise node to be the current node (the bottommost  
> node of the stack).
>        2. If node has the same tag name as the end tag token, then:
>              1. Generate implied end tags.
>              2. If the tag name of the end tag token does not match  
> the tag name of the current node, this is a parse error.
>              3. Pop all the nodes from the current node up to node,  
> including node, then stop this algorithm.
>        3. Otherwise, if node is in neither the formatting category  
> nor the phrasing category, then this is a parse error. Stop this  
> algorithm. The end tag token is ignored.
>        4. Set node to the previous entry in the stack of open  
> elements.
>        5. Return to step 2.
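
For reference, the quoted steps can be transcribed roughly as follows. This is only a sketch: the stack is modelled as a list of tag names with the current node last, the category sets and the list of elements with implied end tags are small illustrative assumptions rather than the spec's full definitions, and the function name is hypothetical.

```python
# Illustrative subsets only (assumptions, not the spec's full lists).
FORMATTING = {"a", "b", "em", "font", "i", "s", "strong", "u"}
PHRASING = {"span", "code", "abbr", "sub", "sup"}
IMPLIED_END = {"dd", "dt", "li", "p", "td", "th", "tr"}


def generate_implied_end_tags(stack):
    """Pop elements whose end tags may be implied off the top of the stack."""
    while stack and stack[-1] in IMPLIED_END:
        stack.pop()


def spec_catch_all_end_tag(stack, tag):
    """Apply the quoted steps to `stack` in place; return the parse error count."""
    errors = 0
    i = len(stack) - 1                        # step 1: node = current node
    while i >= 0:
        node = stack[i]
        if node == tag:                       # step 2
            generate_implied_end_tags(stack)  # step 2.1
            if stack and stack[-1] != tag:    # step 2.2
                errors += 1
            del stack[i:]                     # step 2.3: pop up to and including node
            return errors
        if node not in FORMATTING and node not in PHRASING:
            return errors + 1                 # step 3: parse error, token ignored
        i -= 1                                # steps 4 and 5
    return errors


stack = ["html", "body", "span", "b"]
print(spec_catch_all_end_tag(stack, "span"), stack)
```

With these assumed sets, a </span> seen while <b> is the current node pops both elements and reports one parse error.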

The sublist doesn't make sense. If the current node has the same tag  
name as the token, the stack should be popped. Generating implied end  
tags makes no sense.

The algorithm should probably read as follows:
1. If the current node has the same tag name as the end tag token,  
pop the current node off the stack and then stop this algorithm.
2. Generate implied end tags.
3. If the current node has the same tag name as the end tag token,  
pop the current node off the stack and then stop this algorithm.
4. If the current node is in the formatting category or in the  
phrasing category, then this is a parse error. Pop the current node  
off the stack and then return to step 2.
5. Otherwise, stop this algorithm.
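
The five steps above can be sketched in the same spirit. Again, the stack is a plain list of tag names with the current node last, the category sets and the implied-end-tag list are illustrative assumptions, and the function name is hypothetical.

```python
# Illustrative subsets only (assumptions, not the spec's full lists).
FORMATTING = {"a", "b", "em", "font", "i", "s", "strong", "u"}
PHRASING = {"span", "code", "abbr", "sub", "sup"}
IMPLIED_END = {"dd", "dt", "li", "p", "td", "th", "tr"}


def generate_implied_end_tags(stack):
    """Pop elements whose end tags may be implied off the top of the stack."""
    while stack and stack[-1] in IMPLIED_END:
        stack.pop()


def proposed_catch_all_end_tag(stack, tag):
    """Apply the five proposed steps in place; return the parse error count."""
    errors = 0
    if stack and stack[-1] == tag:            # step 1
        stack.pop()
        return errors
    while True:
        generate_implied_end_tags(stack)      # step 2
        if stack and stack[-1] == tag:        # step 3
            stack.pop()
            return errors
        if stack and (stack[-1] in FORMATTING or stack[-1] in PHRASING):
            errors += 1                       # step 4: parse error, pop, loop
            stack.pop()
            continue
        return errors                         # step 5


stack = ["html", "body", "span", "b"]
print(proposed_catch_all_end_tag(stack, "span"), stack)
```

As written, step 5 stops without flagging a parse error, and any implied end tags generated in step 2 stay applied even when no matching element is found.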

Note that both formulations seem to cause a stray </td> in "in body"  
not to be silently ignored but instead to close open formatting or  
phrasing elements. Is this right? Should popping phrasing or  
formatting elements first check whether there's an element in scope  
with the same tag name as the token?
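
To make the question concrete, here is a sketch of the proposed steps with an optional scope check in front. The scope test is a simplification that treats only table and html as scope boundaries, which is an assumption. The category sets are illustrative, and the function names and the check_scope flag are hypothetical.

```python
# Illustrative subsets only (assumptions, not the spec's full lists).
FORMATTING = {"a", "b", "em", "font", "i", "s", "strong", "u"}
PHRASING = {"span", "code", "abbr", "sub", "sup"}
IMPLIED_END = {"dd", "dt", "li", "p", "td", "th", "tr"}
SCOPE_BOUNDARY = {"table", "html"}  # simplified notion of "in scope"


def has_element_in_scope(stack, tag):
    """Walk from the current node upwards, stopping at a scope boundary."""
    for name in reversed(stack):
        if name == tag:
            return True
        if name in SCOPE_BOUNDARY:
            return False
    return False


def close_end_tag(stack, tag, check_scope=False):
    """Proposed steps, optionally guarded; returns the parse error count."""
    if check_scope and not has_element_in_scope(stack, tag):
        return 1                              # parse error, token ignored
    errors = 0
    if stack and stack[-1] == tag:
        stack.pop()
        return errors
    while True:
        while stack and stack[-1] in IMPLIED_END:
            stack.pop()                       # generate implied end tags
        if stack and stack[-1] == tag:
            stack.pop()
            return errors
        if stack and (stack[-1] in FORMATTING or stack[-1] in PHRASING):
            errors += 1
            stack.pop()
            continue
        return errors


unguarded = ["html", "body", "div", "b", "i"]
guarded = list(unguarded)
print(close_end_tag(unguarded, "td"), unguarded)
print(close_end_tag(guarded, "td", check_scope=True), guarded)
```

Without the check, the stray </td> pops <i> and <b> before stopping at <div>. With the check, the token is dropped and the stack is left untouched.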

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Thursday, 5 July 2007 10:03:51 UTC
