Even numbers of nested CM_INLINE tags get removed (both)

Anyone else notice that HTML Tidy is removing paired instances of nested
presentational style tags (CM_INLINE)?  E.g.

<HTML>
<HEAD>
<TITLE>Paired nested inline styles lost</TITLE>
</HEAD>
<BODY>
<DL>
<DT>Bold test (other CM_INLINE tags affected)</DT>
<DD>
<B>This is bold</B><BR>
<B><B>This is 2x bold</B></B> which becomes normal<BR>
<B><B><B>This is 3x bold</B></B></B> which stays bold<BR>
<B><B><B><B>This is 4x bold</B></B></B></B> which becomes normal
</DD>

<DT>BIG test (also exists with SMALL)</DT>
<DD>
Normal size text (size = 3)<BR>
<BIG>Big text (size = 4)</BIG><BR>
<BIG><BIG>2x big text (size = 5)</BIG></BIG> which becomes normal size (3)<BR>
<BIG><BIG><BIG>3x big text (size = 6)</BIG></BIG></BIG> which becomes big
(size 4)<BR>
<BIG><BIG><BIG><BIG>4x big text (size = 7)</BIG></BIG></BIG></BIG> which
becomes normal size (3)
</DD>
</DL>
</BODY>
</HTML>

It doesn't matter whether the tags are directly nested or not.

I have a partial patch for this.  In parser.c, in the function ParseInline,
replace the if statement that follows the comment shown here with:

        /* <u>...<u>  map 2nd <u> to </u> if 1st is explicit */
        /* otherwise emphasis nesting is probably unintentional */
        if (node->type == StartTag
            && IsPushed(lexer, node)
            && node->tag && (node->tag->model & CM_INLINE)
            && node->tag != tag_a
            && node->tag != tag_font
            && node->tag != tag_big     /* big has a cumulative effect */
            && node->tag != tag_small)  /* small has a cumulative effect */
        {

in html.h, add:

extern Dict *tag_big;
extern Dict *tag_small;

and in tags.c, add:

Dict *tag_big;
Dict *tag_small;

and, again in tags.c, in InitTags, add:

    tag_big = lookup("big");
    tag_small = lookup("small");

With those changes, BIG and SMALL tags are no longer removed.  As the
comments say, they have a cumulative effect, so repeating them may not be a
mistake, only bad style (a FONT SIZE with the appropriate relative
adjustment for the level of nesting would be more efficient).  The patch
does not fix the erroneous removal of even numbers of the other CM_INLINE
tags when nested.

Received on Wednesday, 5 January 2000 17:29:43 UTC