RE: XML Tidy? from Ignacio Vazquez-Abrams on 2001-06-20 (html-tidy@w3.org from April to June 2001)

From: Ignacio Vazquez-Abrams <ignacio@openservices.net>
Date: Wed, 20 Jun 2001 10:13:07 -0400 (EDT)
To: <html-tidy@w3.org>
Message-ID: <Pine.LNX.4.33.0106200955350.6807-200000@terbidium.openservices.net>

On Tue, 19 Jun 2001, Reitzel, Charlie wrote:

> Great idea.  I'd try a somewhat different tack, tho.  Rather than mess w/
> the parser _or_ your input, I'd just feed the prospective <body> contents
> to Tidy as is.  Let the parser merrily add
> <head><title></title></head><body> your stuff! </body>.  Then, if the
> --body-only option is set, call a new PPrintContent() function in pprint.c.
> This function will call PPrintTree() for each member of the body->content
> node list.
>
> The important point here is you can safely add functionality without
> messing w/ the inner workings of the parser.  Even better, it will report
> the line numbers correctly for any errors/warnings it emits.
>
> take it as easy as you can stand it (which is qualified by noting that I am
> an uptight east coaster.  It's one of those "do as I say, not as I do" kind of
> things),
>
> Charlie

This one should do it then. No more segfaults, but I did have to muck with the
parser in order to suppress the "no title" error.

A nice side bonus is that it will gladly take a normal HTML page and chop out
everything but the body, which means I can get rid of the part in the
application that does it manually when importing a file.
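
For anyone following along, the body-only printing Charlie describes can be
sketched roughly as below. This is a toy model, not the attached patch: the
Node and Buf structs and the print_tree(), print_content() and find_element()
functions are hypothetical stand-ins for Tidy's own types and for
PPrintTree() and the proposed PPrintContent(). The real functions take more
arguments and do the actual pretty-printing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified node; NOT Tidy's real Node struct. */
typedef struct Node {
    const char *element;   /* tag name, or NULL for a text node */
    const char *text;      /* text, used only by text nodes */
    struct Node *content;  /* first child */
    struct Node *next;     /* next sibling */
} Node;

/* Small fixed-size output buffer so the result is easy to inspect. */
typedef struct Buf {
    char data[256];
    size_t len;
} Buf;

/* Append s to the buffer, truncating silently if it does not fit. */
void buf_append(Buf *b, const char *s)
{
    size_t n = strlen(s);
    if (b->len + n >= sizeof b->data)
        n = sizeof b->data - 1 - b->len;
    memcpy(b->data + b->len, s, n);
    b->len += n;
    b->data[b->len] = '\0';
}

/* Stand-in for PPrintTree(): serialise one node and its whole subtree. */
void print_tree(Buf *out, const Node *node)
{
    const Node *child;

    if (node->element == NULL) {        /* text node */
        buf_append(out, node->text);
        return;
    }
    buf_append(out, "<");
    buf_append(out, node->element);
    buf_append(out, ">");
    for (child = node->content; child != NULL; child = child->next)
        print_tree(out, child);
    buf_append(out, "</");
    buf_append(out, node->element);
    buf_append(out, ">");
}

/* Stand-in for the proposed PPrintContent(): print each member of the
   body's content list, but not the <body> element itself. */
void print_content(Buf *out, const Node *body)
{
    const Node *child;

    if (body == NULL)
        return;
    for (child = body->content; child != NULL; child = child->next)
        print_tree(out, child);
}

/* Depth-first search for the first element with the given name. */
const Node *find_element(const Node *node, const char *name)
{
    const Node *child, *hit;

    if (node == NULL)
        return NULL;
    if (node->element != NULL && strcmp(node->element, name) == 0)
        return node;
    for (child = node->content; child != NULL; child = child->next) {
        hit = find_element(child, name);
        if (hit != NULL)
            return hit;
    }
    return NULL;
}
```

The point of the sketch is that the parse tree stays complete (head, title and
all), so nothing in the parser has to change; only the output step starts at
the body's children instead of at the root. Calling print_tree() on the root
would still give the full document.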

-- 
Ignacio Vazquez-Abrams  <ignacio@openservices.net>

Attachments

TEXT/PLAIN attachment: tidy4aug00-bodyonly2.patch

Received on Wednesday, 20 June 2001 10:13:10 UTC