- From: Cory Nelson <phrosty@gmail.com>
- Date: Wed, 7 Jul 2004 13:52:58 -0700
- To: Paul Reger <paulr@olivetree.com>
- Cc: html-tidy@w3.org
libxslt over at xmlsoft.org is a good one. i think it comes with an
example app that will translate your documents.

google turns up TagSoup at http://mercury.ccil.org/~cowan/XML/tagsoup/

On Wed, 7 Jul 2004 13:37:02 -0700, Paul Reger <paulr@olivetree.com> wrote:
> Do you know of any free source bases for xslt or 'tag soup parser's
> written in C++? Do you know where I could get it?
>
> Thanks for the help.
>
> Paul Reger
> Senior Software Engineer
> Olive Tree Bible Software
> paulr@olivetree.com
> Got a PDA? Want a FREE Bible? Goto: www.olivetree.com
>
> -----Original Message-----
> From: Cory Nelson [mailto:phrosty@gmail.com]
> Sent: Wednesday, July 07, 2004 1:28 PM
> To: Paul Reger
> Cc: html-tidy@w3.org
> Subject: Re: Help with tidy?
>
> Tidy isn't meant as an HTML library, you should probably use a tag soup
> parser for that. You could use tidy to convert your html to xml, then
> run that through xslt to whatever you want.
>
> -xml makes tidy think the input is xml. to convert from html to xml,
> use -asxhtml
>
> the "unexpected </head> in <link>" etc is due to <link ...> not being
> valid xml, it needs to be <link ... />
>
> ----- Original Message -----
> From: Paul Reger <paulr@olivetree.com>
> Date: Thu, 1 Jul 2004 11:37:22 -0700
> Subject: Help with tidy?
> To: html-tidy@w3.org
>
> Hi,
>
> I am a new user of Tidy. I wish to use it as the basis for a parser of
> HTML documents. The parser will be part of a conversion tool to convert
> from HTML to another markup language that is proprietary to our company..
>
> I have some questions, and any help lent would be most appreciated. If
> you could point me at documents or other code, that would be most helpful.
>
> Tidy is reporting errors in a sample file that I am feeding it. When I
> use the -xml switch, tidy reports the document with 4 errors and w/o the
> -xml switch, tidy reports the document has 1,481 errors.
>
> When I do not include the -xml switch, tidy reports this one error
> (several times):
>
> line 1275 column 7 - Error: <o:p> is not recognized!
>
> When I do include the -xml switch, tidy reports the following 4 errors:
>
> line 1268 column 1 - Error: unexpected </head> in <link>
> line 15717 column 1 - Error: unexpected </div> in <hr>
> line 15719 column 1 - Error: unexpected </body> in <hr>
> line 15721 column 1 - Error: unexpected </html> in <hr>
>
> Thanks in advance for any help,
>
> Paul Reger (paulr@olivetree.com)
> Senior Software Engineer
> Olivetree Bible Software
> Got a PDA? Want a free Bible? Goto: www.olivetree.com
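
To illustrate the point in the quoted reply about <link ...> not being valid xml: the sketch below feeds an HTML-style unclosed <link> and an XML-style self-closed <link ... /> to a strict XML parser. It is only an illustration, and it makes some assumptions: it uses Python's standard-library parser rather than tidy itself, and the helper name is_well_formed and the sample markup are made up for the example, not taken from the sample file in the thread.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML.

    Hypothetical helper for illustration; it uses Python's strict XML
    parser as a stand-in for what an XML-mode parser expects.
    """
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML style: <link> is never closed, so the parser meets </head>
# while it is still inside <link>.
html_style = '<head><link rel="stylesheet" href="style.css"></head>'

# XML/XHTML style: the empty element closes itself.
xml_style = '<head><link rel="stylesheet" href="style.css" /></head>'

print(is_well_formed(html_style))
print(is_well_formed(xml_style))
```

The same reasoning applies to the <hr> errors in the original report: in XML mode every empty element has to be closed explicitly, which is why the HTML input needs to be converted (for example with -asxhtml) before an XML tool such as an xslt processor sees it.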
Received on Wednesday, 7 July 2004 16:53:23 UTC