Re: tidy - ? html tree number

Chunbo Shao wrote:
> But, in my out0.html, there is also a blank line in between </script> and
> <title>, and the child number for <head> is not affected. Why?

Major --

<head> is not allowed to have text content, so any text content in the
<head> element is simply ignored.  That's why the blank line does not
create a text node child of <head>.

> Why the following in out0.html can cause the child number of <body> be
> affected? Can we have some method in Tidy to avoid this? (I mean I can use
> such method in my TestDOM.java to process out0.html without change to
> out0.html itself.)
> 
> ... NEW Home</a>
> <p><font ...

It affects the child count because, unlike <head>, <body> may contain
text, so the line break between </a> and <p> becomes an actual text node
child in the HTML parse tree.  If you want to filter out text nodes
containing only whitespace, you can modify your Java code to skip those
whitespace-only text nodes when counting.  I do not believe that Tidy
itself has this feature.
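
A minimal sketch of that filtering, using only the standard org.w3c.dom
interfaces.  The class and method names (ChildCounter, isBlankText,
countNonBlankChildren, parse) are hypothetical, and the sample markup is
a made-up stand-in for the fragment you quoted, parsed here with the JDK's
XML parser rather than Tidy so the example is self-contained.  With Tidy
you would pass the node you get from its DOM to the same helper.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ChildCounter {

    /** Returns true if the node is a text node holding only whitespace. */
    static boolean isBlankText(Node n) {
        if (n.getNodeType() != Node.TEXT_NODE) {
            return false;
        }
        String value = n.getNodeValue();
        return value == null || value.trim().length() == 0;
    }

    /** Counts the children of parent, skipping whitespace-only text nodes. */
    static int countNonBlankChildren(Node parent) {
        NodeList children = parent.getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (!isBlankText(children.item(i))) {
                count++;
            }
        }
        return count;
    }

    /** Assumption: well-formed markup parsed by the JDK XML parser, not Tidy. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample: a line break sits between </a> and <p>.
        Node body = parse("<body><a>NEW Home</a>\n<p>text</p></body>")
                .getDocumentElement();
        System.out.println(body.getChildNodes().getLength()); // raw count: 3
        System.out.println(countNonBlankChildren(body));      // filtered count: 2
    }
}
```

The same test can be used wherever you index children, not just when
counting them, so out0.html itself does not need to change.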

> Actually, the out0.html is generated by running a Test16.java which is
> almost copied from example java class Test16 in jtidy.html. I attach this
> Test16.java and jtidy.html. My running for this Test16 is
> "java Test16 http://www.usc.edu/dept/cs out0.html". I don't know whether i
> also miss something in Test16 also.

This looks okay to me at a casual glance.  Did you have a particular
question?

Gary

Received on Tuesday, 9 January 2001 14:23:38 UTC