RE: XML Tidy?

I have the same sort of use in mind.  It is the kind of job you should
be able to accomplish with a library version of Tidy.

In the meantime, if you preserve the original input, you should be able to
simply subtract N from the _reported_ line numbers, where N is (I am
assuming) the number of header lines you put in front of the user's input.
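
A rough sketch of that subtraction, in Python. It assumes the report is
plain text containing "line <number> " (the pattern quoted further down in
this thread) and that you know how many wrapper lines were prepended. The
function name and the header_lines parameter are made up for illustration;
they are not part of Tidy.

```python
import re

# "line <number> " as it appears in the report text. The \b is an addition
# to the pattern from the thread so that words ending in "line" do not match.
LINE_RE = re.compile(r"\bline ([0-9]+) ")


def shift_line_numbers(report: str, header_lines: int) -> str:
    """Return report with header_lines subtracted from each 'line N '."""

    def adjust(match: "re.Match[str]") -> str:
        shifted = int(match.group(1)) - header_lines
        # Messages that point into the prepended header would go to zero or
        # below; clamp them to line 1 of the user's input (a design choice).
        return f"line {max(shifted, 1)} "

    return LINE_RE.sub(adjust, report)


if __name__ == "__main__":
    # Hypothetical report text, written to resemble a checker's messages.
    sample = "line 7 column 3 - Warning: example message"
    print(shift_line_numbers(sample, header_lines=5))
```

Clamping is only one option; you may prefer to drop messages that refer to
the wrapper lines, since the user never saw them.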

take it easy,

-----Original Message-----
From: Ignacio Vazquez-Abrams
Sent: Tuesday, June 19, 2001 10:02 AM
Subject: Re: XML Tidy?

On Mon, 18 Jun 2001, Klaus Johannes Rusch wrote:

> Ignacio Vazquez-Abrams writes:
> > I was wondering if there exists any version or variant 
> > or configuration of Tidy which could deal with an XML/HTML
> > hybrid? More specifically I need to just deal with the 
> > stuff that would appear inside the BODY tag, without adding
> > the HTML, HEAD, and TITLE tags. I have tried a lot of 
> > configuration options for HTML Tidy, but have had no 
> > success so far.
> You can either use the -xml option to process the 
> fragment as XML only; however, this will not do 
> any of the usual HTML cleanup.

The problem is that I need to do the HTML cleanup; I need to clean up a
pseudoHTML document entered by the user, and this document will only contain
a piece of an HTML page.

> Or, run the fragment through tidy using the -asxml 
> option, then extract everything between <body> and </body>.

While that works for the output stage (Oh no! select="html/body"! The
horror! :P ), I would also like to provide entry-time verification and
cleanup of code. Having to search for /line ([0-9]+) / and subtracting when
displaying errors to the user, while not difficult, is something I'd like to

Ignacio Vazquez-Abrams

Received on Tuesday, 19 June 2001 15:14:26 UTC