- From: Lee Passey <lee@novomail.net>
- Date: Thu, 07 May 2009 10:17:12 -0600
- To: Ivor O'Connor <ivor.oconnor@gmail.com>
- CC: html-tidy@w3.org
Ivor O'Connor wrote:

[snip]

> Hmmm. A tool for pretty printing html and no support for what is almost
> always contained in html?

I think this question demonstrates the fundamentally incorrect assumption underlying your request. Tidy is /not/ a tool for pretty printing HTML. Tidy is a tool for evaluating HTML and automatically correcting invalid HTML when possible.

As it evaluates the HTML, Tidy builds in memory a DOM tree representation of the HTML. Some corrections are performed during parsing, and other corrections are made to the in-memory DOM. At the end of processing, the corrected HTML exists only as an in-memory DOM. A pretty-print routine is required simply for outputting the in-memory representation.

But... Tidy's pretty print functionality is a consequence of its method of evaluation and correction, and not the purpose of the program. It is perhaps unfortunate that Tidy's pretty print function is so good that it has led people to believe that reformatting was the original design goal, when in fact it wasn't.

Following the usual Linux/Unix practice, Tidy can read from stdin and write to stdout. If you need a true pretty print function you should try to find one that can also read from stdin, and then use Tidy to validate/correct your HTML, piping the output from Tidy to the pretty print program. Because XHTML output from Tidy is guaranteed to be well-formed, just about any XML pretty print program should be able to give you what you need.
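
To make the pipeline idea concrete, here is a rough sketch in Python. It assumes a `tidy` executable is on your PATH, and it uses the standard library's `xml.dom.minidom` as a stand-in for "just about any XML pretty print program". The function names `tidy_to_xhtml` and `pretty_print_xml` are made up for this illustration, and discarding whitespace-only text nodes is a simplification that can change spacing between inline elements.

```python
import subprocess
from xml.dom import minidom


def tidy_to_xhtml(html: str) -> str:
    """Use tidy as a filter: HTML in on stdin, corrected XHTML out on stdout."""
    result = subprocess.run(
        # --numeric-entities avoids named entities (e.g. &nbsp;) that a plain
        # XML parser does not know about.
        ["tidy", "-quiet", "-asxhtml", "-utf8", "--numeric-entities", "yes"],
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # tidy exits with 0 for clean input, 1 if it issued warnings, 2 on errors.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.decode("utf-8", "replace"))
    return result.stdout.decode("utf-8")


def _drop_blank_text(node) -> None:
    """Remove whitespace-only text nodes so old indentation is not kept."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            _drop_blank_text(child)


def pretty_print_xml(xhtml: str, indent: str = "  ") -> str:
    """The separate 'pretty print program' stage: re-indent well-formed XML."""
    doc = minidom.parseString(xhtml.encode("utf-8"))
    _drop_blank_text(doc)
    return doc.toprettyxml(indent=indent)
```

Chaining the two stages, for example `pretty_print_xml(tidy_to_xhtml(sys.stdin.read()))`, mirrors piping Tidy's stdout into a separate formatter. The second stage works only because the first stage hands it well-formed XHTML.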
Received on Thursday, 7 May 2009 16:17:58 UTC