Re: tidy - ? html tree number

Hi, Gary

Thanks a lot. 

But in my out0.html there is also a blank line between </script> and
<title>, and the child count of <head> is not affected. Why?
 
Why does the following in out0.html affect the child count of <body>? Is
there some method in Tidy to avoid this? (I mean a method I can call in my
TestDOM.java to process out0.html without changing out0.html itself.)

... NEW Home</a>
<p><font ...
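
One way to get the expected count without editing out0.html is to drop
whitespace-only text nodes from the DOM after parsing. The sketch below is
not a Tidy feature; it uses only the standard org.w3c.dom interfaces. The
class name WhitespaceStripper and the helper parse() are made up for
illustration, and the demo input is a simplified, well-formed stand-in for
the <body> of out0.html, read with the JDK XML parser rather than with
Tidy. It assumes the DOM in use supports removeChild(); if it does not,
the same test can be used to skip such nodes while counting.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class WhitespaceStripper {

    /** Recursively removes text nodes that contain only whitespace. */
    static void stripWhitespaceText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before removing
            if (child.getNodeType() == Node.TEXT_NODE) {
                String value = child.getNodeValue();
                if (value == null || value.trim().isEmpty()) {
                    node.removeChild(child);
                }
            } else {
                stripWhitespaceText(child);
            }
            child = next;
        }
    }

    /** Hypothetical demo helper: parses well-formed XML with the JDK parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Simplified stand-in for the <body> of out0.html (assumption).
        String xml = "<body>If you were not redirected<br/>Please go to: "
                + "<a href=\"http://www.cs.usc.edu/\">NEW Home</a>\n"
                + "<p>Thank You</p></body>";
        Node body = parse(xml).getDocumentElement();
        System.out.println("before: " + body.getChildNodes().getLength());
        stripWhitespaceText(body);
        System.out.println("after: " + body.getChildNodes().getLength());
    }
}
```

On the demo input this prints a count of 6 before stripping and 5 after.
Note that the method removes every whitespace-only text node, including
ones inside elements such as <pre> where whitespace may matter, so it
should be applied only where that is acceptable.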

Actually, out0.html is generated by running Test16.java, which is copied
almost verbatim from the example Java class Test16 in jtidy.html. I attach
this Test16.java and jtidy.html. I ran Test16 as
"java Test16 http://www.usc.edu/dept/cs out0.html". I don't know whether I
am also missing something in Test16.

I paste out0.html below:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<script type="text/javascript" language="JavaScript">
location = "http://www.cs.usc.edu/"
</script>

<title></title>
</head>
<body bgcolor="#000000" font="#FFFFFF">
If you were not automatically redirected to the new USC Computer
Science WebSite<br>
Please go to: <a href="http://www.cs.usc.edu/">USC Computer Science
NEW Home</a>
<p><font size="+1" color="#EE0000">Thank You</font></p>
</body>
</html>
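
To see exactly which children a node has, it can help to print each direct
child with its node type and make whitespace visible. The sketch below
does that with the standard org.w3c.dom interfaces only. The names
ChildLister, describeChildren() and parse() are hypothetical, and the demo
input is a simplified, well-formed stand-in for the <body> above, read
with the JDK XML parser rather than with Tidy, so the result for a
Tidy-built tree may differ.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class ChildLister {

    /** Returns one description per direct child, with newlines made visible. */
    static List<String> describeChildren(Node parent) {
        List<String> out = new ArrayList<String>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.TEXT_NODE) {
                String shown = c.getNodeValue().replace("\n", "\\n");
                out.add("TEXT[" + shown + "]");
            } else if (c.getNodeType() == Node.ELEMENT_NODE) {
                out.add("ELEMENT<" + c.getNodeName() + ">");
            } else {
                out.add("OTHER(type " + c.getNodeType() + ")");
            }
        }
        return out;
    }

    /** Hypothetical demo helper: parses well-formed XML with the JDK parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Simplified stand-in for the <body> of out0.html (assumption).
        String xml = "<body>If you were not redirected<br/>Please go to: "
                + "<a href=\"http://www.cs.usc.edu/\">NEW Home</a>\n"
                + "<p>Thank You</p></body>";
        Node body = parse(xml).getDocumentElement();
        for (String line : describeChildren(body)) {
            System.out.println(line);
        }
    }
}
```

On the demo input the listing has six entries, and the fifth is a text
node holding only the line break between </a> and <p>.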


Thanks.  


Major






On Tue, 9 Jan 2001, Gary L Peskin wrote:

> Chunbo Shao wrote:
> > 
> > Hello, Gary
> > 
> > Very sorry to trouble you again. I have had no food or drink for
> > almost the whole day so far. Would you please still help me?
> 
> Major --
> 
> No trouble at all.  But you should take care of yourself.  It will catch
> up with you!
> 
> > Following your advice, I commented out the setXmlTags() line. The
> > result in the attached treef.txt looks very good now. But please see
> > the attached out0.html, which is used as the input for TestDOM.java:
> > <body> should have 5 children, but in the output treef.txt it has 6.
> > Why? How do I fix it?
> 
> No, it's six, see below.
> 
> > I think <body> has 5 children:
> >  "If you were not...", <br>, "Please go to:", <a...>, <p....>.
> > Am I right?
> 
> There is a text node consisting of a single blank between the <a...> and
> the <p...>.
> If you want to eliminate it, change your html to:
> 
> from
> 
> ... NEW Home</a>
> <p><font ...
> 
> to 
> 
> ... NEW Home</a><p><font ...
> 
> HTH,
> Gary
> 
> 

Received on Tuesday, 9 January 2001 13:54:53 UTC