- From: Richard A. O'Keefe <ok@cs.otago.ac.nz>
- Date: Mon, 4 Mar 2002 14:47:48 +1300 (NZDT)
- To: html-tidy@w3.org, qpoint@bigfoot.com
"Raymond Tan" <qpoint@bigfoot.com> wrote:
Hello everyone. I am new to jTidy. I am using jTidy to
convert parse a HTML from a URL to XML. but I am getting a
lot of errors. I urgently need this function for my school
project. Appreciate any help rendered!
You don't show us the page you are converting.
the following are the generated error log from jTidy.
Tidy (vers 4th August 2000) Parsing "InputStream"
line 11 column 1 - Warning: <style> isn't allowed in <html> elements
This means you have
    <html>
      <style> ... </style>
      <head> ... </head>
      <body> ... </body>
    </html>
but <style> is NOT ALLOWED as a child of <html>. Instead you should have
    <html>
      <head>
        <style> ... </style>
        ...
      </head>
      <body> ... </body>
    </html>
If you don't have an explicit <head> element, better put one in.
<head> is where <style>, <meta>, <link>, <title>, and a few
other things can go.
line 56 column 1 - Warning: <table> lacks "summary" attribute
What it says: each <table> element should have a summary="text" attribute
saying what the table is about.
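For instance, a minimal sketch (the summary text and the cell contents
are made up for illustration; they are not taken from your page):

```html
<table summary="Opening hours for each day of the week">
  <tr><th>Day</th><th>Hours</th></tr>
  <tr><td>Monday</td><td>9am - 5pm</td></tr>
</table>
```

The attribute is optional in HTML 4.01, so this warning is about
accessibility rather than validity.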
line 67 column 1 - Warning: <div> isn't allowed in <table> elements
What it says: <div> is NOT allowed inside a <table>.
<table> may contain these and ONLY these:
    <caption>  - a label for the whole table
    <col>      - a way of specifying attributes for an entire column
    <colgroup> - a way of specifying attributes for a group of columns
    <thead>    - a group of rows to appear at the front, and if the table
                 is split across pages, at the top of each following part
                 of the table
    <tfoot>    - a group of rows to appear at the end, and if the table
                 is split across pages, at the foot of each preceding part
                 of the table
    <tbody>    - a group of rows, not to be repeated. The main part of the
                 table.
    <tr>       - a single row.
No other elements whatsoever are allowed as direct children of a <table>
element; they would make no sense there at all.
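A sketch of a table that obeys this content model, assuming the <div>
in your page was meant to wrap some cell content (a guess, since the
page itself wasn't shown; the class name and text are made up).
<div> is allowed inside a <td> or <th>, so that is where it can go:

```html
<table summary="Example layout">
  <caption>Example</caption>
  <thead>
    <tr><th>Name</th><th>Notes</th></tr>
  </thead>
  <tbody>
    <tr>
      <td>First</td>
      <td><div class="note">A div is fine here, inside a cell.</div></td>
    </tr>
  </tbody>
</table>
```

If the <div> was instead meant to wrap the whole table, put the <table>
inside the <div> rather than the other way round.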
line 69 column 5 - Warning: <td> repeated attribute
This means you have an element <td foo="X" foo="Y" ...>
where some attribute (foo in my example) occurs twice. This is not allowed
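in any SGML document, and still isn't allowed in XML. (Although thanks to
the great "namespaces" cock-up, a parser that ignores namespaces _will_
accept two attributes with different prefixes but the same URI and local
name, even though the Namespaces recommendation itself forbids that...)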
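As an illustration, with a made-up attribute and values (the real ones
are whatever is on line 69 of your page):

```html
<!-- Not allowed: align appears twice on the same start-tag -->
<td align="left" align="center">...</td>

<!-- Allowed: each attribute appears once; keep the value you meant -->
<td align="center">...</td>
```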
line 222 column 1 - Warning: discarding unexpected </div>
Probably caused by all those illegal <div> start-tags.
Received on Sunday, 3 March 2002 20:47:52 UTC