- From: Victor Wagner <vitus@ice.ru>
- Date: Mon, 29 Nov 1999 13:00:30 +0300 (MSK)
- To: html-tidy@w3.org
- Message-ID: <Pine.LNX.4.10L0.9911291249260.6631-300000@zinc.fe.msk.ru>
Dear sir,

I've downloaded the latest version of html-tidy to use as a preprocessor in our web-based publishing system, but found that it segfaults on some ill-formed HTML. The particular piece of HTML was produced by the RTF::Parser Perl module, version 1.07, but I suspect that other converters from office formats would give similar results.

When linked with the Electric Fence debugging library, it gives the following in gdb:

    (gdb) run bad.html
    Starting program: /usr/local/src/tidy24nov99/tidy bad.html

      Electric Fence 2.0.5 Copyright (C) 1987-1995 Bruce Perens.

    Tidy (vers 24th November 1999) Parsing "bad.html"
    line 8 column 35 - Warning: missing </b> before <p>
    line 8 column 35 - Warning: missing </i> before <p>
    line 8 column 37 - Warning: <i> is probably intended as </i>
    line 8 column 37 - Warning: trimming empty <p>

    Program received signal SIGSEGV, Segmentation fault.
    0x804f26f in GetToken (lexer=0x40933f80, mode=0) at lexer.c:1195
    1195        if (lexer->token->type != TextNode || (!lexer->insert && !lexer->inode))
    (gdb) bt
    #0  0x804f26f in GetToken (lexer=0x40933f80, mode=0) at lexer.c:1195
    #1  0x804c5a6 in ParseBody (lexer=0x40933f80, body=0x4094afc8, mode=0) at parser.c:2411
    #2  0x8049f77 in ParseTag (lexer=0x40933f80, node=0x4094afc8, mode=0) at parser.c:357
    #3  0x804cf45 in ParseHTML (lexer=0x40933f80, html=0x4093cfc8, mode=0) at parser.c:2908
    #4  0x804d027 in ParseDocument (lexer=0x40933f80) at parser.c:2955
    #5  0x805811e in main (argc=2, argv=0xbffffca4) at tidy.c:983
    (gdb)

Without debugging info and -lefence, tidy still crashes on this file, but in some obscure place inside malloc. Disabling optimization doesn't help.

My platform is:

    Linux x86
    glibc 2.0.7
    gcc 2.7.2

The piece of ill-formed HTML and my tidy_config.txt are attached.

--------------------------------------------------
Victor Wagner              vitus@ice.ru
Programmer                 Office: 7-(095)-203-50-60
Institute for Commerce     Home:   7-(095)-135-46-61
Engineering                http://www.ice.ru/~vitus
Attachments
- TEXT/PLAIN attachment: Piece of html which crashes tidy
- TEXT/PLAIN attachment: Tidy config
Received on Monday, 29 November 1999 05:00:35 UTC
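
For readers following the backtrace, the faulting statement at lexer.c:1195 dereferences `lexer->token` before testing anything else, so an invalid token pointer would fault exactly there. The report does not establish the root cause, and the sketch below is only an illustration under stated assumptions: the `Node` and `Lexer` types are hypothetical, cut-down stand-ins for Tidy's real structures, and `token_condition` is an invented helper, not part of Tidy. It also guards only against a NULL pointer; a pointer to already-freed memory, which Electric Fence is designed to expose, would pass this check and still crash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Tidy's types (assumption:
   the real definitions carry many more fields). */
typedef enum { TextNode, StartTag, EndTag } NodeType;

typedef struct Node {
    NodeType type;
} Node;

typedef struct Lexer {
    Node *token;   /* most recently lexed token; may be NULL in this sketch */
    Node *insert;  /* pending inline element to re-insert, if any */
    Node *inode;   /* position in the inline stack, if any */
} Lexer;

/* Evaluates the same boolean condition as the faulting line, but
   refuses to dereference a NULL lexer or token.
   Returns 1 if the condition holds, 0 if it does not,
   and -1 if there is no token to inspect. */
int token_condition(const Lexer *lexer)
{
    if (lexer == NULL || lexer->token == NULL)
        return -1;
    return lexer->token->type != TextNode
        || (!lexer->insert && !lexer->inode);
}

/* Small usage example exercising each outcome. */
void token_condition_examples(void)
{
    Node text = { TextNode };
    Lexer lx = { NULL, NULL, NULL };

    assert(token_condition(&lx) == -1);  /* no token: guarded, no fault */

    lx.token = &text;
    assert(token_condition(&lx) == 1);   /* text node, nothing to insert */

    lx.insert = &text;
    assert(token_condition(&lx) == 0);   /* text node with a pending insert */
}
```

Confirming whether the token was NULL or already freed would require inspecting `lexer->token` in gdb at the point of the fault, for example with `print lexer->token`.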