- From: Richard A. O'Keefe <ok@cs.otago.ac.nz>
- Date: Mon, 4 Mar 2002 14:47:48 +1300 (NZDT)
- To: html-tidy@w3.org, qpoint@bigfoot.com
"Raymond Tan" <qpoint@bigfoot.com> wrote: Hello everyone.=20 I am new to jTidy. I am using jTidy to convert parse a HTML from a URL = to XML. but I am getting a lot of errors. I urgently need this function = for my school project. Appreciate any help rendered!=20 You don't show us the page you are converting. the following are the generated error log from jTidy.=20 Tidy (vers 4th August 2000) Parsing "InputStream"=20 line 11 column 1 - Warning: <style> isn't allowed in <html> elements=20 This means you have <html> <style> ... </style> <head> ... </head> <body> ... </body> </html> but <style> is NOT ALLOWED as a child of <html>. Instead you should have <html> <head> <style> ... </style> ... </head> <body> ... </body> </html> If you don't have an explicit <head> element, better put one in. <head> is where <style>, <meta>, <link>, <title>, and a few other things can go. line 56 column 1 - Warning: <table> lacks "summary" attribute=20 What it says: each <table> element should have a summary="text" attribute saying what the table is about. line 67 column 1 - Warning: <div> isn't allowed in <table> elements=20 What it says: <div> is NOT allowed inside a <table>. <table> may contain these and ONLY these: <caption> - a label for the whole table <col> - a way of specifying attributes for an entire column <colgroup> - a way of specifying attributes for a group of columns <thead> - a group of rows to appear at the front, and if the table is split across pages, at the top of each following part of the table <tfoot> - a group of rows to appear at the end, and if the table is split across pages, at the foot of each preceding part of the table <tbody> - a group of rows, not to be repeated. The main part of the table. <tr> - a single row. No other elements whatsoever are allowed as direct children of a <table> element, they would make no sense at all. 
> line 69 column 5 - Warning: <td> repeated attribute

This means you have an element <td foo="X" foo="Y" ...> where some
attribute (foo in my example) occurs twice. This is not allowed in
any SGML document, and still isn't allowed in XML. (Although thanks
to the great "namespaces" cock-up, it _is_ possible to have two
attributes with the same URI and local name...)

> line 222 column 1 - Warning: discarding unexpected </div>

Probably caused by all those illegal <div> start-tags.
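
To see why a repeated attribute matters when the goal is XML output, here is a minimal sketch using the XML parser bundled with the JDK rather than jTidy. The class name RepeatedAttributeDemo and the method isWellFormed are made up for this illustration. It shows that an element carrying the same attribute name twice is rejected as not well-formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, NOT part of jTidy.
final class RepeatedAttributeDemo {
    // Returns true if the string parses as well-formed XML, false if
    // the JDK's parser rejects it.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of letting the
            // parser's default handler print them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness violations arrive as SAXParseException,
            // a subclass of SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<td foo=\"X\" bar=\"Y\"/>"));
        System.out.println(isWellFormed("<td foo=\"X\" foo=\"Y\"/>"));
    }
}
```

The first call parses cleanly and the second is rejected, so the duplicate has to be removed from the source page before the result can be valid XML.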
Received on Sunday, 3 March 2002 20:47:52 UTC