
Re: Help with tidy?

From: Cory Nelson <phrosty@gmail.com>
Date: Wed, 7 Jul 2004 13:52:58 -0700
Message-ID: <9b1d061404070713522f9247bf@mail.gmail.com>
To: Paul Reger <paulr@olivetree.com>
Cc: html-tidy@w3.org

libxslt over at xmlsoft.org is a good one (it is a C library, but it
can be called from C++).  I think it comes with a command-line tool,
xsltproc, that will transform your documents.

google turns up TagSoup at http://mercury.ccil.org/~cowan/XML/tagsoup/
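
The point in the quoted reply below, that <link ...> has to be written as
<link ... /> before an XML parser will accept it, can be checked with a
small sketch.  This is only an illustration: it uses Python's standard
xml.etree.ElementTree module rather than tidy itself, and the helper name
is_well_formed and the sample markup strings are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML style: <link> is never closed, so </head> arrives while the
# parser still considers <link> to be the open element.
html_style = '<head><link rel="stylesheet" href="style.css"></head>'

# XML style: the empty element is closed with a trailing slash.
xml_style = '<head><link rel="stylesheet" href="style.css" /></head>'

print(is_well_formed(html_style))  # False
print(is_well_formed(xml_style))   # True
```

The same check fails for a bare <hr> inside a <div> and passes for <hr />,
which matches the other three errors reported with the -xml switch.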

On Wed, 7 Jul 2004 13:37:02 -0700, Paul Reger <paulr@olivetree.com> wrote:
> Do you know of any free code bases for XSLT or 'tag soup' parsers
> written in C++?  Do you know where I could get them?
> 
> Thanks for the help.
> 
> Paul Reger
> Senior Software Engineer
> Olive Tree Bible Software
> paulr@olivetree.com
> Got a PDA?  Want a FREE Bible?  Goto:  www.olivetree.com
> 
> -----Original Message-----
> From: Cory Nelson [mailto:phrosty@gmail.com]
> Sent: Wednesday, July 07, 2004 1:28 PM
> To: Paul Reger
> Cc: html-tidy@w3.org
> Subject: Re: Help with tidy?
> 
> Tidy isn't meant as an HTML library; you should probably use a tag soup
> parser for that.  You could use tidy to convert your HTML to XML, then
> run that through XSLT to produce whatever you want.
> 
> -xml tells tidy to treat the input as XML.  To convert from HTML to
> XML (as XHTML), use -asxhtml.
> 
> The "unexpected </head> in <link>" error and the similar <hr> errors
> are due to <link ...> and <hr> not being well-formed XML; they need to
> be written as <link ... /> and <hr />.
> 
> ----- Original Message -----
> From: Paul Reger <paulr@olivetree.com>
> Date: Thu, 1 Jul 2004 11:37:22 -0700
> Subject: Help with tidy?
> To: html-tidy@w3.org
> 
> Hi,
> 
> I am a new user of Tidy.  I wish to use it as the basis for a parser
> of HTML documents.  The parser will be part of a conversion tool to
> convert from HTML to another markup language that is proprietary to
> our company.
> 
> I have some questions, and any help lent would be most appreciated.
> If you could point me at documents or other code, that would be most
> helpful.
> 
> Tidy is reporting errors in a sample file that I am feeding it.  When
> I use the -xml switch, tidy reports the document with 4 errors and w/o
> the -xml switch, tidy reports the document has 1,481 errors.
> 
> When I do not include the -xml switch, tidy reports this one error
> (several times):
> 
> line 1275 column 7 - Error: <o:p> is not recognized!
> 
> When I do include the -xml switch, tidy reports the following 4
> errors:
> 
> line 1268 column 1 - Error: unexpected </head> in <link>
> line 15717 column 1 - Error: unexpected </div> in <hr>
> line 15719 column 1 - Error: unexpected </body> in <hr>
> line 15721 column 1 - Error: unexpected </html> in <hr>
> 
> Thanks in advance for any help,
> 
> Paul Reger (paulr@olivetree.com)
> Senior Software Engineer
> Olivetree Bible Software
> Got a PDA?  Want a free Bible?  Goto:  www.olivetree.com
> 
>
Received on Wednesday, 7 July 2004 16:53:23 UTC
