Tidy - bug report

Dave:

Though I find your program invaluable, I believe I have just 
discovered a bug, or at least an apparent inconsistency between it 
and the W3C HTML Validation Service.

An example of the problem occurs with:

URI: http://www.cs.ncl.ac.uk/genuki/DEV/Census1851searching.html

Checking this with Tidy (used as a plugin to BBEdit on an Apple Mac), I got:

   BBTidy (vers 4th August 2000) Parsing "Census1851searching.html"

   Census1851searching.html: Doctype given is "-//W3C//DTD HTML 4.0 Transitional//EN"
   Census1851searching.html: Document content looks like HTML 4.01 Transitional
   no warnings or errors were found

But the HTML Validator on the W3C site tells me that:

   Below are the results of attempting to parse this document with an SGML parser.
   *	Line 13, column 48:
   <a name="top"><a href="http://www.genuki.org.uk"><img border="0"
                                                   ^
   Error: document type does not allow element "A" here
   *	Line 16, column 12:
   "index.html"><img align="BOTTOM" alt="up" src="../u_arrow.gif"
               ^
   Error: document type does not allow element "A" here
   ------------------------------------------------------------------------
   Sorry, this document does not validate as HTML 4.0 Transitional.

I'd be grateful if you could let me know whether the problem lies 
with me, or whether either Tidy or the W3C Validator needs to be 
updated.
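
For reference, both Validator errors point at an A element opened 
inside another A element, which HTML 4 does not allow. The following 
is only a minimal sketch of how such nesting could be spotted, written 
in Python with the standard html.parser module. The names 
NestedAnchorFinder and find_nested_anchors are made up for 
illustration, and nothing here reflects how Tidy or the Validator 
work internally.

```python
from html.parser import HTMLParser


class NestedAnchorFinder(HTMLParser):
    """Hypothetical helper: note each <a> opened while another <a> is open."""

    def __init__(self):
        super().__init__()
        self.open_anchors = 0   # number of <a> elements currently open
        self.nested = []        # (line, column) of each offending <a> tag

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "A" and "a" both arrive as "a".
        if tag == "a":
            if self.open_anchors > 0:
                # getpos() gives (1-based line, 0-based column) of this tag.
                self.nested.append(self.getpos())
            self.open_anchors += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.open_anchors > 0:
            self.open_anchors -= 1


def find_nested_anchors(html_text):
    """Return the positions of all <a> start tags nested inside another <a>."""
    finder = NestedAnchorFinder()
    finder.feed(html_text)
    finder.close()
    return finder.nested


if __name__ == "__main__":
    # Shortened, made-up sample in the spirit of line 13 of the page.
    sample = '<a name="top"><a href="http://www.genuki.org.uk">GENUKI</a></a>'
    for line, column in find_nested_anchors(sample):
        print(f"nested <a> at line {line}, column {column}")
```

An empty list from find_nested_anchors would mean that no A element 
was opened while another was still open; it says nothing about any 
other kind of validity problem.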

There is one other aspect of Tidy that I found surprising: even 
after receiving a report that a file is OK, i.e.:

    BBTidy (vers 4th August 2000) Parsing "Census1851searching.html"

    Census1851searching.html: Doctype given is "-//W3C//DTD HTML 4.0 Transitional//EN"
    Census1851searching.html: Document content looks like HTML 4.01 Transitional
    no warnings or errors were found

if the output file is then re-checked with Tidy, one can on occasion 
still get further warnings about empty <p></p> elements in the 
file.

cheers

Brian

-- 
Dept. of Computing Science, University of Newcastle, Newcastle upon Tyne,
NE1 7RU, UK
EMAIL = Brian.Randell@newcastle.ac.uk   PHONE = +44 191 222 7923
FAX = +44 191 222 8232  URL = http://www.cs.ncl.ac.uk/~brian.randell/

Received on Tuesday, 2 January 2001 16:41:09 UTC