Bug+Fix for inserted nodes

From: Gary L Peskin <garyp@firstech.com>
Date: Wed, 20 Dec 2000 01:17:18 -0800
Message-ID: <3A40791E.E135F0DD@firstech.com>
To: Html-Tidy <html-tidy@w3.org>

Inserted nodes are being created with incorrect node->end values in
certain cases.

The following Java example program (provided by
dglo@users.sourceforge.net) illustrates the problem:

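import java.io.ByteArrayInputStream;
import org.w3c.tidy.Tidy;

public class NodeBug {
  public static final void main(String[] args) {
    // Malformed input that forces Tidy to insert nodes.
    String badHTML = "<html><font><center></center></p>\n\n</html>";

    Tidy tidy = new Tidy();
    // The archived message is cut off after the first argument;
    // System.out as the output stream is an assumption.
    tidy.parseDOM(new ByteArrayInputStream(badHTML.getBytes()),
                  System.out);
  }
}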

Similar results are obtained with the parse() method.  I don't have a C
compiler, so I can't reliably produce C code that causes the same
problem, but it should look approximately the same.

The problem occurs, I believe, at istack.c line 242 in the function
InsertedToken.  The line

  node->end = lexer->txtstart;

should be changed to read

  node->end = lexer->txtend;
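
To make the effect of the one-line change above easier to see, here is a
minimal, self-contained sketch.  The Lexer and Node structs are hypothetical
stand-ins that model only the fields named in this message, not Tidy's real
definitions, and it assumes that node->start is taken from lexer->txtstart
and that txtend points just past the current text.  Under those assumptions
the old assignment leaves the inserted node with an empty range, while the
new one makes it span the lexer's current text.

```c
#include <stddef.h>

/* Hypothetical stand-ins for Tidy's structures; only the fields
   named in the message are modelled. */
typedef struct {
    size_t txtstart;  /* assumed: offset where the current text begins */
    size_t txtend;    /* assumed: offset just past the current text */
} Lexer;

typedef struct {
    size_t start;
    size_t end;
} Node;

/* Behaviour before the fix: end copies txtstart. */
static Node inserted_before_fix(const Lexer *lexer)
{
    Node node;
    node.start = lexer->txtstart;  /* assumption: start comes from txtstart */
    node.end = lexer->txtstart;
    return node;
}

/* Behaviour after the fix: end copies txtend. */
static Node inserted_after_fix(const Lexer *lexer)
{
    Node node;
    node.start = lexer->txtstart;  /* same assumption as above */
    node.end = lexer->txtend;
    return node;
}
```

For a lexer with txtstart = 10 and txtend = 14, the first function returns a
node with end == 10 and the second a node with end == 14.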

Received on Wednesday, 20 December 2000 04:17:47 UTC
