
doctype loop in tidy7jul99

From: Peter Lewerin <peter.lewerin@krax.pp.se>
Date: Wed, 21 Jul 1999 23:52:00 +0200
Message-Id: <3.0.6.32.19990721235200.007f02f0@mailbox.swipnet.se>
To: html-tidy@w3.org

There seems to be a problem with the doctype handling in the July 7
release.  My limited testing shows that if 1) I specify "doctype: strict"
(or loose or transitional, but not omit, auto or "-//W3C//DTD HTML
4.0//EN") in a configuration file, and 2) the HTML document to be tidied
begins with a <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN"> line, then
Tidy writes the normal messages to STDERR but writes only an "endless"
repetition of doctype declarations to STDOUT.  (I became suspicious when my
index.html file contained 2,695,007 doctype declarations (137,445,376
bytes) but no HTML... ;-) )
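
For anyone trying to reproduce this, a rough way to check a redirected
output file for the symptom is to count the doctype declarations and see
whether anything else was written.  This is only a sketch: the function
names are made up for illustration, it is not part of Tidy, and it reads
the whole text into memory, so it is best tried on a truncated copy of a
runaway file.

```python
import re

# Matches any doctype declaration, case-insensitively.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\b[^>]*>", re.IGNORECASE)


def summarize_output(text):
    """Return (doctype_count, other_line_count) for tidied output text."""
    doctype_count = len(DOCTYPE_RE.findall(text))
    # Remove the declarations, then count the non-blank lines that remain.
    remainder = DOCTYPE_RE.sub("", text)
    other_lines = sum(1 for line in remainder.splitlines() if line.strip())
    return doctype_count, other_lines


def looks_like_doctype_loop(text):
    """True if the text holds repeated doctype declarations and nothing else."""
    doctypes, other = summarize_output(text)
    return doctypes > 1 and other == 0
```

To use it, read the file that STDOUT was redirected to (for example with
open("index.html").read()) and pass the text to looks_like_doctype_loop().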

There is, of course, the possibility that I am doing something wrong.
With four-month-old twins, I never seem to get enough sleep...

Also baffling (to me, at least):

When I tidy the following simple HTML:

<title>foo</title>

like this:

tidy -config tidy.cfg try.html >try2.html

and tidy.cfg contains:

clean: yes
doctype: strict
char-encoding: latin1

I get:

Tidy (vers 7th July 1999) Parsing "try.html"

"try.html" appears to be HTML 2.0
no warnings or errors were found

and try2.html looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html>
<head>
<title>foo</title>
</head>
</html>

If I then tidy try2.html, like this:

tidy -config tidy.cfg try2.html

I get:

Tidy (vers 7th July 1999) Parsing "try2.html"
line 6 column 1 - Warning: discarding unexpected </html>

"try2.html" appears to be HTML 2.0
1 warnings/errors were found!

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">
<l>
<l>
<l>foo</l>
</l>
</l>

HTML & CSS specifications are available from http://www.w3.org/
To learn more about Tidy see http://www.w3.org/People/Raggett/tidy/
Please send bug reports to Dave Raggett care of <html-tidy@w3.org>
Lobby your company to join W3C, see http://www.w3.org/Consortium

???

The tag names in that output seem to come from dereferenced dangling
pointers (the string actually written is "l\1\3").  I am using the Win32
console .exe from the archive on a Win98 system.  Recompiling with DJGPP
as a DOS protected-mode executable gives the same results.
Received on Wednesday, 21 July 1999 17:53:59 GMT
