
Bug: Error in if statement in parser.c, ParseTableTag()

From: Randy Waki <rwaki@flipdog.com>
Date: Tue, 29 Aug 2000 20:10:49 -0600
To: <dsr@w3.org>, <html-tidy@w3.org>
Message-ID: <002801c01227$84133650$51eee13f@rwaki>
I believe I have a test case that validates a fix previously reported by
Andy Quick.  This fix has not yet been incorporated in either the C or
Java sources.

Andy noted that ParseTableTag() in parser.c has a C coding error and
suggested the following fix, paraphrased below.

*** parser.c    Fri Aug 04 16:32:04 2000
--- new\parser.c        Tue Aug 29 18:42:10 2000
***************
*** 2082,2088 ****
                  ReportWarning(lexer, table, node, TAG_NOT_ALLOWED_IN);
                  lexer->exiled = yes;

!                 if (!node->type == TextNode)
                      ParseTag(lexer, node, IgnoreWhitespace);

                  lexer->exiled = no;
--- 2082,2088 ----
                  ReportWarning(lexer, table, node, TAG_NOT_ALLOWED_IN);
                  lexer->exiled = yes;

!                 if (node->type != TextNode)
                      ParseTag(lexer, node, IgnoreWhitespace);

                  lexer->exiled = no;

Note that since TextNode is defined to be 4, the original if expression
always evaluates to false because of C operator precedence: ! binds more
tightly than ==, so the expression parses as (!node->type) == TextNode,
and !node->type can only be 0 or 1.  Andy's change looks reasonable
given what seems to be happening there in the code, plus it matches the
other two if statements in parser.c that also use lexer->exiled (search
for "lexer->exiled = yes").
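To make the precedence point concrete, here is a minimal sketch that isolates the two conditions from the diff above.  It is not taken from the Tidy sources: the function names and the bare enum are hypothetical, and only the value TextNode = 4 comes from this message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for Tidy's node type codes; only the value
   TextNode = 4 is taken from the message above. */
enum { TextNode = 4 };

/* The original condition.  Unary ! binds more tightly than ==, so this
   is (!type) == TextNode.  !type is always 0 or 1, so the comparison
   with 4 can never be true. */
static bool original_condition(int type)
{
    return !type == TextNode;
}

/* The suggested fix: true for every node type except TextNode. */
static bool fixed_condition(int type)
{
    return type != TextNode;
}
```

Calling original_condition() with any type value returns false, so the ParseTag() call guarded by it would never run; fixed_condition() returns false only for TextNode.  Some compilers may warn about the first expression, which is another hint that it is not what was intended.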

This error causes some inline and block elements, when appearing
illegally as direct children of a table element, to get separated from
their content.  For example, in the document below, the error causes the
text to no longer be enclosed by anchor, font, or bold elements.  This
is contrary to how IE 5 and Netscape 4.5 render it.  Andy's change fixes
this.

I've seen an occasional example where this error actually causes Tidy to
produce *better* results, but I think it's a case of one bug cancelling
out another.

Note that for some reason, this bug stopped occurring in the 4-Aug-2000
tidy.exe (on Windows 2000, at least).  It's a mystery because the C
source still contains the coding error, and there appear to be no other
source differences between 8-Jul-2000 and 4-Aug-2000 that would account
for the bug being fixed.  Perhaps Dave's C compiler has some bizarre
bug, or the C sources are out-of-sync with the executable (I don't have
a C compiler to pursue this; I verified the fix in Java).

------------------------ Example HTML document -------------------------
<html>
<head><title></title></head>
<body>
  <table border="1" summary="">
    <a href="111"><font size="+3"><b>Big and bold</b> Big</font></a>
  </table>
</body>
</html>
------------------------------------------------------------------------

Randy
Received on Tuesday, 29 August 2000 22:11:57 GMT