
Re: new to jTidy, need help urgently

From: Richard A. O'Keefe <ok@cs.otago.ac.nz>
Date: Mon, 4 Mar 2002 14:47:48 +1300 (NZDT)
Message-Id: <200203040147.OAA125085@atlas.otago.ac.nz>
To: html-tidy@w3.org, qpoint@bigfoot.com
"Raymond Tan" <qpoint@bigfoot.com> wrote:
	Hello everyone.  I am new to jTidy.  I am using jTidy to
	convert parse a HTML from a URL to XML.  but I am getting a
	lot of errors.  I urgently need this function for my school
	project.  Appreciate any help rendered!
	
You don't show us the page you are converting.

	the following are the generated error log from jTidy.
	
	Tidy (vers 4th August 2000) Parsing "InputStream"
	line 11 column 1 - Warning: <style> isn't allowed in <html> elements

This means you have
    <html>
      <style> ... </style>
      <head> ... </head>
      <body> ... </body>
    </html>
but <style> is NOT ALLOWED as a child of <html>.  Instead you should have
    <html>
      <head>
        <style> ... </style>
        ...
      </head>
      <body> ... </body>
    </html>
If you don't have an explicit <head> element, better put one in.
<head> is where <style>, <meta>, <link>, <title>, and a few
other things can go.

	line 56 column 1 - Warning: <table> lacks "summary" attribute

What it says:  each <table> element should have a summary="text" attribute
saying what the table is about.
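For instance (a sketch only; the summary wording here is made up, since I
haven't seen your table):
    <table summary="Exam timetable: one row per paper, with date and room">
      <tr> ... </tr>
    </table>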

	line 67 column 1 - Warning: <div> isn't allowed in <table> elements

What it says:  <div> is NOT allowed inside a <table>.
<table> may contain these and ONLY these:
 <caption>	- a label for the whole table
 <col>		- a way of specifying attributes for an entire column
 <colgroup>	- a way of specifying attributes for a group of columns
 <thead>	- a group of rows to appear at the front, and if the table
		  is split across pages, at the top of each following part
		  of the table
 <tfoot>	- a group of rows to appear at the end, and if the table
		  is split across pages, at the foot of each preceding part
		  of the table
 <tbody>	- a group of rows, not to be repeated.  The main part of the
		  table.
 <tr>		- a single row.
No other elements whatsoever are allowed as direct children of a <table>
element; they would make no sense there.
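As a sketch of a legal layout (the structure is invented, since I haven't
seen your page): a <div> may go inside a cell, or around the whole table,
but not directly between <table> and <tr>.
    <table summary="...">
      <caption> ... </caption>
      <thead>
        <tr> <th> ... </th> </tr>
      </thead>
      <tbody>
        <tr> <td> <div> ... </div> </td> </tr>
      </tbody>
    </table>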

	line 69 column 5 - Warning: <td> repeated attribute

This means you have an element <td foo="X" foo="Y" ...>
where some attribute (foo in my example) occurs twice.  This is not allowed
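in any SGML document, and still isn't allowed in XML.  (Although thanks to
the great "namespaces" cock-up, a document that is well-formed XML 1.0 _can_
have two differently prefixed attributes with the same URI and local name;
the Namespaces recommendation forbids it, but a parser that isn't
namespace-aware will never notice.)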
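For example, if the offending tag looked something like this (align is only
my guess at the attribute; the log above doesn't say which one it was)
    <td align="left" align="center"> ... </td>
then keep whichever value you actually meant and delete the other:
    <td align="center"> ... </td>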

	line 222 column 1 - Warning: discarding unexpected </div>
	
Probably caused by all those illegal <div> start-tags:  once a start-tag
has been rejected, its </div> has nothing left to close.
Received on Sunday, 3 March 2002 20:47:52 GMT
