- From: Victor Wagner <vitus@ice.ru>
- Date: Mon, 29 Nov 1999 13:00:30 +0300 (MSK)
- To: html-tidy@w3.org
- Message-ID: <Pine.LNX.4.10L0.9911291249260.6631-300000@zinc.fe.msk.ru>
Dear sir,
I've downloaded the latest version of html-tidy to use as a preprocessor in
our web-based publishing system, but found that it segfaults
on some ill-formed HTML. The particular piece of HTML was produced by
the RTF::Parser Perl module, version 1.07, but I suspect that other converters
from office formats would give similar results.
When linked with the Electric Fence debugging library, it gives the following
in gdb:
(gdb) run bad.html
Starting program: /usr/local/src/tidy24nov99/tidy bad.html
Electric Fence 2.0.5 Copyright (C) 1987-1995 Bruce Perens.
Tidy (vers 24th November 1999) Parsing "bad.html"
line 8 column 35 - Warning: missing </b> before <p>
line 8 column 35 - Warning: missing </i> before <p>
line 8 column 37 - Warning: <i> is probably intended as </i>
line 8 column 37 - Warning: trimming empty <p>
Program received signal SIGSEGV, Segmentation fault.
0x804f26f in GetToken (lexer=0x40933f80, mode=0) at lexer.c:1195
1195 if (lexer->token->type != TextNode || (!lexer->insert && !lexer->inode))
(gdb) bt
#0 0x804f26f in GetToken (lexer=0x40933f80, mode=0) at lexer.c:1195
#1 0x804c5a6 in ParseBody (lexer=0x40933f80, body=0x4094afc8, mode=0)
at parser.c:2411
#2 0x8049f77 in ParseTag (lexer=0x40933f80, node=0x4094afc8, mode=0)
at parser.c:357
#3 0x804cf45 in ParseHTML (lexer=0x40933f80, html=0x4093cfc8, mode=0)
at parser.c:2908
#4 0x804d027 in ParseDocument (lexer=0x40933f80) at parser.c:2955
#5 0x805811e in main (argc=2, argv=0xbffffca4) at tidy.c:983
(gdb)
Without debugging info and -lefence, tidy still
crashes on this file, but in some obscure place inside malloc.
Disabling optimization doesn't help.
My platform is Linux x86, glibc 2.0.7, gcc 2.7.2.
A piece of the ill-formed HTML and my tidy_config.txt are attached.
--------------------------------------------------
Victor Wagner             vitus@ice.ru
Programmer                Office: 7-(095)-203-50-60
Institute for Commerce    Home: 7-(095)-135-46-61
Engineering               http://www.ice.ru/~vitus
Attachments
- TEXT/PLAIN attachment: Piece of html which crashes tidy
- TEXT/PLAIN attachment: Tidy config
Received on Monday, 29 November 1999 05:00:35 UTC