RE: XML Tidy? from Reitzel, Charlie on 2001-06-19 (html-tidy@w3.org from April to June 2001)

From: Reitzel, Charlie <CReitzel@arrakisplanet.com>
Date: Tue, 19 Jun 2001 15:14:40 -0400
To: "'Ignacio Vazquez-Abrams'" <ignacio@openservices.net>, html-tidy@w3.org
Message-ID: <B5C79DDBC655D311B6BD0008C7E64D76013C1629@exchange.arrakisplanet.com>

I have the same kind of thing in mind.  You should be able to accomplish
it with a library version of Tidy.

In the meantime, if you preserve the original input, you should be able to
simply subtract N from the _reported_ line numbers, where N is (I am
assuming) the number of header lines you put in front of the user's input.
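
A rough Python sketch of that subtraction, in case it helps.  It assumes
each Tidy message starts with "line <number>" (the pattern Ignacio is
already matching); the function name and the header_lines parameter are
made up for illustration and are not part of Tidy.

```python
import re

# Assumption: each Tidy message begins with "line <number>".
LINE_RE = re.compile(r"^line (\d+)\b", re.MULTILINE)


def shift_line_numbers(report: str, header_lines: int) -> str:
    """Rewrite reported line numbers so they refer to the user's input.

    header_lines is the number of lines prepended before the user's text
    (hypothetical parameter name, not a Tidy option).
    """
    def fix(match: re.Match) -> str:
        adjusted = int(match.group(1)) - header_lines
        # A message that points into the prepended header is clamped to 1.
        return f"line {max(adjusted, 1)}"

    return LINE_RE.sub(fix, report)
```

With three header lines, a message reported at "line 5 column 1" would be
shown to the user as "line 2 column 1".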

take it easy,
Charlie

-----Original Message-----
From: Ignacio Vazquez-Abrams [mailto:ignacio@openservices.net]
Sent: Tuesday, June 19, 2001 10:02 AM
To: html-tidy@w3.org
Subject: Re: XML Tidy?

On Mon, 18 Jun 2001, Klaus Johannes Rusch wrote:

> In <Pine.LNX.4.33.0106181025230.30759-100000@terbidium.openservices.net>, 
> Ignacio Vazquez-Abrams <ignacio@openservices.net> writes:
> > I was wondering if there exists any version or variant 
> > or configuration of Tidy which could deal with an XML/HTML
> > hybrid? More specifically I need to just deal with the 
> > stuff that would appear inside the BODY tag, without adding
> > the HTML, HEAD, and TITLE tags. I have tried a lot of 
> > configuration options for HTML Tidy, but have had no 
> > success so far.
>
> You can either use the -xml option to process the 
> fragment as XML, although this will not do 
> any of the usual HTML cleanup.

The problem is that I need to do the HTML cleanup; I need to clean up a
pseudo-HTML document entered by the user, and this document will contain
only a piece of an HTML page.

> Or, run the fragment through tidy using the -asxml 
> option, then extract everything between <body> and </body>.

While that works for the output stage (Oh no! select="html/body"! The
horror! :P ), I would also like to provide entry-time verification and
cleanup of code. Having to search for /line ([0-9]+) / and subtract an
offset when displaying errors to the user, while not difficult, is
something I'd like to avoid.
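
For the output stage, the extraction could be sketched in Python roughly
as follows.  This assumes the -asxml output is well-formed XML in the XHTML
namespace and contains no named entities the parser does not know; the
function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def body_fragment(xhtml: str) -> str:
    """Return the markup found between <body> and </body>."""
    root = ET.fromstring(xhtml)
    body = root.find(f"{{{XHTML_NS}}}body")
    if body is None:
        body = root.find("body")  # fall back to un-namespaced output
    if body is None:
        raise ValueError("no <body> element found")

    # Drop the namespace from tag names so the fragment serializes as
    # plain <p>...</p> rather than carrying an xmlns attribute.
    for element in body.iter():
        if isinstance(element.tag, str) and "}" in element.tag:
            element.tag = element.tag.split("}", 1)[1]

    parts = [body.text or ""]
    for child in body:
        # tostring() includes the text that follows each child element.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts).strip()
```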

-- 
Ignacio Vazquez-Abrams  <ignacio@openservices.net>

Received on Tuesday, 19 June 2001 15:14:26 UTC