Re: error when clean html with tidy (fwd)

Hi, Gary

The java file I used to invoke tidy is almost the same one as the example
file offered by Tidy package. It is attached. I am not too sure now
whether i need to set XMLtags to be false explicitly. Because I think the
default for it is already false, is it?

the error message is shown in err output file from tidy. The result is
that the cleaned file as output will be zero length, which i don't like.

thanks for help.

Chunbo





On Thu, 23 Aug 2001, Gary L Peskin wrote:

> Please show how you're invoking Tidy and what the exact error messages
> are.
> 
> Thanks,
> Gary
> 
> 
> Chunbo Shao wrote:
> > 
> > thanks for reply.
> > 
> > Attached files are some html page on some university. I didn't make the
> > page. I just use Tidy to parse it.
> > 
> > The file "48-washington.edu" gave the error "Error: <meta> missing '>' for
> > end of tag".
> > 
> > The file "42-upenn.edu" gave me the error "Error: <a> missing '>' for end
> > of tag".
> > 
> > At the beginning of each file, you can see the url link address for this
> > url file. I already took out these extra lines before I use Tidy to clean
> > this url file.
> > 
> > I cannot see any clue to figure out why the error happans.
> > thanks for help.
> > 
> > chunbo
> > 
> > On Thu, 23 Aug 2001, Reitzel, Charlie wrote:
> > 
> > > Can you send a snippet of your HTML w/ the <meta> and <a> tags that Tidy is
> > > complaining about?  You may unbalanced quotes or some other problem that has
> > > confused it.
> > >
> > > -----Original Message-----
> > > From: Chunbo Shao [mailto:cxs0187@omega.uta.edu]
> > > Sent: Thursday, August 23, 2001 5:43 PM
> > > To: mrbannon@student.math.uwaterloo.ca
> > > Cc: html-tidy@w3.org
> > > Subject: error when clean html with tidy (fwd)
> > >
> > >
> > > Hi,
> > >
> > > almost same thing, error shows
> > > "<meta> missing '>' for end of tag".
> > >
> > > But, "meta" is already in TagTable.java.
> > >
> > > Can we do something (to make tidy) to solve this problem, then to give
> > > nice output other than zero-content file?
> > >
> > > thanks.
> > >
> > > Chunbo
> > >
> > >
> > > ---------- Forwarded message ----------
> > > Date: Thu, 23 Aug 2001 16:23:43 -0500 (CDT)
> > > From: Chunbo Shao <cxs0187@omega.uta.edu>
> > > To: Michael Ryan Bannon <mrbannon@student.math.uwaterloo.ca>
> > > Cc: html-tidy@w3.org
> > > Subject: error when clean html with tidy
> > >
> > > Hi, Michael
> > >
> > > Thanks for your help on "config.txt". It's good solution.
> > >
> > > When i run tidy to clean some html, i got one error indicating that
> > > "<a> missing '>' for end of tag ". But "<a>" is already included in
> > > TagTable.java.
> > > Because of this error, the output as clean result is a zero-length file.
> > > But i want the output file not to be a zero-content file.
> > >
> > > Is there any solution to avoid this? Tidy is supposed to overcome this
> > > case, is it?
> > >
> > > chunbo
> > >
> > >
> > >
> > 
> >   ------------------------------------------------------------------------
> >                         Name: 48-washington.edu
> >    48-washington.edu    Type: Plain Text (TEXT/PLAIN)
> >                     Encoding: BASE64
> > 
> >                    Name: 42-upenn.edu
> >    42-upenn.edu    Type: Plain Text (TEXT/PLAIN)
> >                Encoding: BASE64
> 

Received on Friday, 24 August 2001 15:47:30 UTC