- From: Cory Nelson <phrosty@gmail.com>
- Date: Wed, 7 Jul 2004 13:27:31 -0700
- To: Paul Reger <paulr@olivetree.com>
- Cc: html-tidy@w3.org
Tidy isn't meant to be an HTML library; you should probably use a tag soup parser for that. You could use Tidy to convert your HTML to XML, then run that through XSLT to produce whatever you want.

The -xml switch makes Tidy treat the input as XML. To convert from HTML to XML, use -asxhtml instead.

The "unexpected </head> in <link>" error, and the others like it, occur because <link ...> is not well-formed XML; it needs to be <link ... />.

----- Original Message -----
From: Paul Reger <paulr@olivetree.com>
Date: Thu, 1 Jul 2004 11:37:22 -0700
Subject: Help with tidy?
To: html-tidy@w3.org

Hi,

I am a new user of Tidy. I wish to use it as the basis for a parser of HTML documents. The parser will be part of a conversion tool that converts HTML to another markup language that is proprietary to our company.

I have some questions, and any help would be most appreciated. If you could point me to documents or other code, that would be most helpful.

Tidy is reporting errors in a sample file that I am feeding it. With the -xml switch, Tidy reports 4 errors in the document; without the -xml switch, it reports 1,481 errors.

When I do not include the -xml switch, Tidy reports this one error (several times):

line 1275 column 7 - Error: <o:p> is not recognized!

When I do include the -xml switch, Tidy reports the following 4 errors:

line 1268 column 1 - Error: unexpected </head> in <link>
line 15717 column 1 - Error: unexpected </div> in <hr>
line 15719 column 1 - Error: unexpected </body> in <hr>
line 15721 column 1 - Error: unexpected </html> in <hr>

Thanks in advance for any help,

Paul Reger (paulr@olivetree.com)
Senior Software Engineer
Olivetree Bible Software
Got a PDA? Want a free Bible? Go to: www.olivetree.com
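
To illustrate the point in the reply about <link ...> versus <link ... />, here is a minimal sketch. It does not use Tidy; it uses Python's standard xml.etree.ElementTree parser as a stand-in for any strict XML parser. The fragments and the helper name is_well_formed are made up for illustration and are not taken from the poster's file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments for illustration only.
# HTML style: <link> and <hr> are left unclosed.
UNCLOSED_LINK = '<head><link rel="stylesheet" href="style.css"></head>'
UNCLOSED_HR = '<div><hr></div>'

# XML/XHTML style: the same elements written as empty-element tags.
CLOSED_LINK = '<head><link rel="stylesheet" href="style.css" /></head>'
CLOSED_HR = '<div><hr /></div>'


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # An unclosed <link> is still open when </head> arrives, so the
        # parser sees a mismatched end tag, much like Tidy's
        # "unexpected </head> in <link>" in -xml mode.
        return False
    return True


if __name__ == "__main__":
    for name, markup in [
        ("unclosed <link>", UNCLOSED_LINK),
        ("unclosed <hr>", UNCLOSED_HR),
        ("closed <link />", CLOSED_LINK),
        ("closed <hr />", CLOSED_HR),
    ]:
        print(name, "->", "well-formed" if is_well_formed(markup) else "not well-formed")
```

The unclosed forms are ordinary HTML, which is why the file should be read as HTML and converted with -asxhtml rather than parsed as XML with -xml.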
Received on Wednesday, 7 July 2004 16:27:53 UTC