W3C home > Mailing lists > Public > html-tidy@w3.org > October to December 2001

Re: BUG: Multiple unterminated <meta> tags in <head> block confuse parser.

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Sat, 13 Oct 2001 02:08:41 +0200
To: lee@novonyx.com
Cc: "html-tidy@w3.org" <html-tidy@w3.org>
Message-ID: <550fstc1nqrfuq776an5a6bjdhg4ou5ja5@4ax.com>
* Lee Passey wrote:
>I am trying to tidy one of those bloated MS-Word .htm files.  In the
><head> block there are multiple untermintated <meta> tags, as follows:
>
><head>
><meta http-equiv=Content-Type content="text/html; charset=windows-1252">
><meta name=ProgId content=Word.Document>
><meta name=Generator content="Microsoft Word 9">
><meta name=Originator content="Microsoft Word 9">
><link rel=File-List href="./filelist.xml">
><title>MS Word document</title>
><!--[if gte mso 9]><xml>
> <o:DocumentProperties>
>... etc.

We have a bug here, Tidy puts out

[...]
  <!--[if gte mso 9]><xml>
   <o:DocumentProperties>
  >
  </head>
  <body>
  </body>
  </html>

For this sole fragment, i.e. it inserts some weired '>' and it doesn't
close the comment. However, as of today, it only complains about

  line 8 column 1 - Warning: adjacent hyphens within comment

(which is again a bug, a related bug I guess) I can't reproduce your

>The August 2000 version of tidy seems to deal with this fine, but my
>build based on CVS of about September 25, 2001 prints the error:
>
>line 7 column 73 - Warning: <meta> element not empty or not closed
>line 7 column 73 - Warning: inserting missing 'title' element

In fact, in your example only line 2 has 72 columns.

Could you please provide either a better test case or if the test case
is fine a configuration file to reproduce the problem?

regards,
-- 
Björn Höhrmann { mailto:bjoern@hoehrmann.de } http://www.bjoernsworld.de
am Badedeich 7 } Telefon: +49(0)4667/981028 { http://bjoern.hoehrmann.de
25899 Dagebüll { PGP Pub. KeyID: 0xA4357E78 } http://www.learn.to/quote/
Received on Friday, 12 October 2001 20:09:18 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:46 GMT