
RE: XML Tidy?

From: Ignacio Vazquez-Abrams <ignacio@openservices.net>
Date: Wed, 20 Jun 2001 10:13:07 -0400 (EDT)
To: <html-tidy@w3.org>
Message-ID: <Pine.LNX.4.33.0106200955350.6807-200000@terbidium.openservices.net>
On Tue, 19 Jun 2001, Reitzel, Charlie wrote:

> Great idea.  I'd try a somewhat different tack, tho.  Rather than mess w/
> the parser _or_ your input, I'd just feed the prospective <body> contents
> to Tidy as is.  Let the parser merrily add
> <head><title></title></head><body> your stuff! </body>.  Then, if the
> --body-only option is set, call  a new PPrintContent() function in pprint.c.
> This function will call PPrintTree() for each member of the body->content
> node list.
>
> The important point here is you can safely add functionality without
> messing w/ the inner workings of the parser.  Even better, it will report
> the line numbers correctly for any errors/warnings it emits.
>
> take it as easy as you can stand it (which is qualified by noting that I am
> an uptight east coaster.  It's one of those "do as I say, not as I do" kind of
> things),
>
> Charlie

This one should do it then. No more segfaults, but I did have to muck with the
parser in order to suppress the "no title" error.

A nice side bonus is that it will gladly take a normal HTML page and chop out
everything but the body, which means I can get rid of the part in the
application that does it manually when importing a file.
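
For anyone following along, the approach Charlie describes (print each child of
the body node and skip the wrapper itself) might look roughly like the sketch
below. This is not the patch itself, only an illustration: the Node struct is a
cut-down stand-in for Tidy's real one, and both functions take simplified,
assumed signatures (a FILE * and a node) rather than whatever pprint.c actually
uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Cut-down stand-in for Tidy's parse-tree node (assumption: the real
   struct carries many more fields; only the links matter here). */
typedef struct Node {
    const char *element;   /* tag name, or NULL for a text node */
    const char *text;      /* text payload when element is NULL */
    struct Node *next;     /* next sibling */
    struct Node *content;  /* first child */
} Node;

/* Stand-in for PPrintTree(): prints one node and all of its descendants. */
void PPrintTree(FILE *out, const Node *node)
{
    const Node *child;

    assert(out != NULL && node != NULL);
    if (node->element == NULL) {
        fputs(node->text ? node->text : "", out);
        return;
    }
    fprintf(out, "<%s>", node->element);
    for (child = node->content; child != NULL; child = child->next)
        PPrintTree(out, child);
    fprintf(out, "</%s>", node->element);
}

/* Hypothetical PPrintContent(): prints only the children of `parent`,
   so the wrapper element (e.g. <body>) is never emitted. */
void PPrintContent(FILE *out, const Node *parent)
{
    const Node *child;

    assert(out != NULL && parent != NULL);
    for (child = parent->content; child != NULL; child = child->next)
        PPrintTree(out, child);
}
```

Called with the body node when the body-only option is set, this would emit
just the body's children; the parser still builds the full tree, so line
numbers in warnings are unaffected.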

-- 
Ignacio Vazquez-Abrams  <ignacio@openservices.net>


Received on Wednesday, 20 June 2001 10:13:10 GMT
