
jtidy question

From: Daniel F Lim <limd@cs.ucdavis.edu>
Date: Tue, 29 May 2001 12:14:44 -0400 (EDT)
To: html-tidy@w3.org
Message-ID: <Pine.HPP.4.10.10105290911590.1200-100000@hp9.cs.ucdavis.edu>

I'm working with your JTidy package, and the parser itself seems to work fine.
I'm currently parsing an HTML file with Tidy.parseDOM. With the resulting DOM
tree I try to create new elements and rearrange the tree, but the rearranged
tree does not come out the way I expect.

import java.io.*;
import org.w3c.dom.*;
import org.w3c.tidy.*;

public class JtidyTest {
    public static void main(String args[]) throws Exception {
        // Build a small HTML document in memory.
        String str = "<html><head><title>test</title></head>\n";
        str = str.concat("<body bgcolor=#FFFFFF><font size=\"3\">\n");
        str = str.concat("this</font> is strange ?</body></html>\n");
        BufferedInputStream in = new BufferedInputStream(
                new StringBufferInputStream(str));

        // Parse it with JTidy; warnings go to the file "errors",
        // the tidied markup to "output.html".
        Tidy tidy = new Tidy();
        tidy.setXmlOut(true);
        tidy.setErrout(new PrintWriter(new FileWriter("errors"), true));
        Document root = tidy.parseDOM(in, new FileOutputStream("output.html"));

        // root -> <html> -> <head>, so target is the <head> element.
        org.w3c.dom.Node target = root.getFirstChild().getFirstChild();

        // Insert a new <a> before <head>, then try to move <head> into it.
        Element anchor = root.createElement("a");
        anchor.setAttribute("href", "http://www.google.com");
        target.getParentNode().insertBefore(anchor, target);
        anchor.appendChild(target);

        tidy.pprint(root, System.out);
    }
}



This produces:

<html>
<a href="http://www.google.com" />
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org" />
<title>test</title>
</head>
<body bgcolor="#FFFFFF">
<font size="3">this</font> is strange ?
</body>
</html>

But shouldn't the <a href="http://www.google.com"> element surround the
<head> element instead of appearing as a separate, empty element? Shouldn't
it produce the following:

<a href="http://www.google.com">
<head>
<meta ...>
<title>test</title>
</head>
</a>
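
For comparison, here is a minimal sketch of the same rearrangement run against
the JDK's built-in DOM implementation (javax.xml.parsers) rather than JTidy's.
The class name WrapNodeSketch and the helpers wrapInAnchor and buildSample are
made up for illustration, and the tree is built by hand instead of parsed. As
far as I understand the DOM Level 1 spec, appendChild first removes a node that
is already in the tree, which is why I expected <head> to end up inside <a>.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class WrapNodeSketch {

    // Hypothetical helper: wraps target in a new <a href="..."> element.
    public static Element wrapInAnchor(Document doc, Node target, String href) {
        Element anchor = doc.createElement("a");
        anchor.setAttribute("href", href);
        // Put the anchor where the target currently sits...
        target.getParentNode().insertBefore(anchor, target);
        // ...then move the target under it. A spec-conforming DOM detaches
        // the target from its old parent as part of appendChild.
        anchor.appendChild(target);
        return anchor;
    }

    // Hypothetical helper: builds <html><head><title>test</title></head></html>
    // by hand with the JDK's default DOM implementation (no JTidy involved).
    public static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element head = doc.createElement("head");
            Element title = doc.createElement("title");
            title.appendChild(doc.createTextNode("test"));
            head.appendChild(title);
            html.appendChild(head);
            doc.appendChild(html);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildSample();
        Node head = doc.getDocumentElement().getFirstChild();
        Element anchor = wrapInAnchor(doc, head, "http://www.google.com");

        // <head> should now be the anchor's only child, and <html> should
        // have a single child (the anchor).
        System.out.println(anchor.getFirstChild().getNodeName());
        System.out.println(head.getParentNode().getNodeName());
        System.out.println(doc.getDocumentElement().getChildNodes().getLength());
    }
}
```

With this DOM implementation, <head> is reparented under <a> and <html> keeps a
single child, which matches the output I was hoping to get from JTidy.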

Thanks
Received on Tuesday, 29 May 2001 12:51:50 GMT
