Cleaning up Outlook Emails -- is there hope?

From: Ray Van Dolson <rvandolson@esri.com>
Date: Mon, 25 Apr 2011 15:31:57 -0700
To: html-tidy@w3.org
Message-ID: <20110425223157.GA6670@esri.com>
I'm a mutt user in an Outlook world and have been trying to find a way
to clean up Outlook-generated HTML for viewing in mutt.

I've mostly been doing this via w3m -dump, but I'm really looking for a
way to get rid of all the double-spaced text.

MS encloses everything in <p></p> tags -- so a couple lines of text
might be intended to look like:

  IP Address: 2.2.2.2
  Gateway: 2.2.2.1

But ends up getting rendered as:

  IP Address: 2.2.2.2

  Gateway: 2.2.2.1

because each line gets its own <p></p> pair, instead of one pair around
the whole block with <br /> where the line breaks are.

MS of course includes a stylesheet to set margins to 0 so this looks
fine in Outlook, Thunderbird, etc...

The --word-2000 option for "tidy" helps clean up the cruft, but the <p>
tags are still there.  It doesn't appear that w3m honors margin: 0 in
the embedded CSS either...
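
For illustration only, here is a rough sketch of the kind of pre-filter
that could run ahead of w3m. It is not a tidy feature; it assumes Python
is available, and the function name merge_adjacent_paragraphs is made up
for this example. It rewrites every </p><p ...> boundary as <br />, so a
run of one-line paragraphs collapses into a single paragraph:

```python
import re
import sys

# Matches a paragraph close directly followed by a paragraph open,
# allowing whitespace in between and attributes on the opening tag.
# The pattern does not match other tags starting with "p", such as <pre>.
_P_BOUNDARY = re.compile(r"</p\s*>\s*<p(?:\s[^>]*)?>", re.IGNORECASE)


def merge_adjacent_paragraphs(html: str) -> str:
    """Replace each </p><p ...> boundary with <br /> (hypothetical helper)."""
    return _P_BOUNDARY.sub("<br />\n", html)


if __name__ == "__main__":
    # Read HTML on stdin, write the merged HTML to stdout.
    sys.stdout.write(merge_adjacent_paragraphs(sys.stdin.read()))
```

Saved under a made-up name such as merge_p.py, it could be used as
"python merge_p.py < message.html | w3m -dump -T text/html". The obvious
trade-off is that it merges every pair of adjacent paragraphs, so any
paragraph break the sender really intended is lost as well.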

Anyone out there know any magic to make this look better in plain text
via tidy or some other tool?

Thanks!
Ray
Received on Thursday, 28 April 2011 19:30:23 GMT