Re: tidying an mbox file?

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Tue, 06 Feb 2007 19:56:33 +0100
To: Miles Fidelman <mfidelman@meetinghouse.net>
Cc: html-tidy@w3.org
Message-ID: <aljhs2h6c3p5fts0q8h534e763b9vplk6a@hive.bjoern.hoehrmann.de>

* Miles Fidelman wrote:
>Which leads me to following question:  Does anybody have any experience 
>and/or suggestions on how to process an mbox file to clean up HTML 
>that's embedded in mail messages?

You would need some tool that lets you apply Tidy to the text/html parts
of the mail bodies. There are various modules for MIME, mbox, and Tidy
handling on CPAN if you happen to be a Perl programmer. I am not sure
how the yahoo2mbox script works, but it might also make sense to adapt
it so that it calls Tidy appropriately.
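
As a rough sketch of that approach, here is the same idea using Python's
standard mailbox and email modules instead of the CPAN modules mentioned
above. It is an illustration under assumptions, not a tested recipe: it
assumes the tidy command-line program is installed and on the PATH, the
function names (tidy_html, clean_message, clean_mbox) are made up for this
example, and the cleaned parts are re-encoded as UTF-8.

```python
import mailbox
import subprocess


def tidy_html(html):
    """Pipe one HTML string through the tidy command-line tool (assumed on PATH)."""
    result = subprocess.run(
        ["tidy", "-q", "-utf8", "--show-warnings", "no"],
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    # Tidy exits with 0 (clean), 1 (warnings) or 2 (errors).
    # On errors, or if nothing came back, keep the original markup.
    if result.returncode >= 2 or not result.stdout:
        return html
    return result.stdout.decode("utf-8", errors="replace")


def clean_message(msg, clean=tidy_html):
    """Replace the body of every text/html part of msg with cleaned HTML."""
    for part in msg.walk():
        if part.get_content_type() != "text/html":
            continue
        raw = part.get_payload(decode=True)  # undo base64 / quoted-printable
        if raw is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            html = raw.decode(charset, errors="replace")
        except LookupError:  # unknown charset label in the mail
            html = raw.decode("utf-8", errors="replace")
        # Drop the old transfer encoding so set_payload picks a fresh one.
        del part["Content-Transfer-Encoding"]
        part.set_payload(clean(html), charset="utf-8")


def clean_mbox(src_path, dst_path, clean=tidy_html):
    """Copy src_path to a new mbox at dst_path, cleaning HTML parts on the way."""
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    try:
        for msg in src:
            clean_message(msg, clean)
            dst.add(msg)
    finally:
        dst.close()  # close() also flushes pending writes
        src.close()
```

Something like clean_mbox("archive.mbox", "archive-clean.mbox") would then
write a cleaned copy and leave the original file alone. Passing the cleaning
function as a parameter is only a convenience, so that Tidy options can be
changed or the function swapped out without touching the MIME handling.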
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Tuesday, 6 February 2007 18:56:39 GMT
