- From: William Bagby <williamb@adone.com>
- Date: Tue, 1 May 2001 15:18:34 -0400
- To: "Tidy Mailing List (E-mail)" <html-tidy@w3c.org>
Here's what I want to do: I have a block of text which has HTML markup in it. It may not be strictly valid HTML because of unescaped special characters such as <, >, &, etc. I would like to make it well-formed XML.

For example, I have the following:

    Looking for a 1976 Chevy convertible < $2000, with power windows & AC.<br>Please <a href="mailto:myaddress@mydomain.com">e-mail me</a>.

and would like it converted to:

    Looking for a 1976 Chevy convertible &lt; $2000, with power windows &amp; AC.<br />Please <a href="mailto:myaddress@mydomain.com">e-mail me</a>.

While I realize that Tidy can translate an HTML page into well-formed XML with the -asxml flag, it also adds all of the other HTML tags needed to make it a "complete" HTML page, such as <html>, <head>, <body>, etc. I do not want these tags there, because I am inserting the fragment into an XML page after processing.

The question is: is there a simple way, either from the command line or within a configuration file, to tell Tidy *not* to insert the extra tags? Or do I need to modify the source code to accomplish this?

BTW, I'm using JTidy.

Thanks,
William.
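One possible workaround, sketched here as an assumption rather than as a Tidy or JTidy feature: let Tidy produce the complete document with -asxml, then keep only what sits between the <body> tags. The class and method names below (BodyFragment, extractBody) are hypothetical, and the sample string merely stands in for Tidy's output; it was not produced by an actual run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tidy or JTidy.
class BodyFragment {
    // Matches <body ...> through the last </body>, across line breaks,
    // ignoring tag case. Group 1 is everything between the two tags.
    private static final Pattern BODY = Pattern.compile(
            "<body(?:\\s[^>]*)?>(.*)</body\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the contents of the body element with surrounding whitespace
    // trimmed, or the input unchanged if no body element is found.
    static String extractBody(String tidyOutput) {
        Matcher m = BODY.matcher(tidyOutput);
        return m.find() ? m.group(1).trim() : tidyOutput;
    }

    public static void main(String[] args) {
        // Stand-in for what a full-document conversion might look like.
        String tidyOutput =
                "<html>\n<head>\n<title></title>\n</head>\n<body>\n"
                + "power windows &amp; AC.<br />\n"
                + "</body>\n</html>";
        System.out.println(extractBody(tidyOutput));
    }
}
```

This is plain string post-processing: it does not parse the markup, so it relies on the text between the tags already being well-formed.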
Received on Tuesday, 1 May 2001 15:25:13 UTC