[whatwg] converting word (was <code> attributes

On Fri, 01 May 2009 12:22:32 +0100, Adrian Sutton  
<adrian.sutton at ephox.com> wrote:


> The biggest challenge in this is actually removing the huge amount of inline
> formatting and proprietary tags/attributes that Microsoft Word adds.  In the
> latest versions it's also a challenge to put lists back together as actual
> HTML lists since Word has started exporting them as paragraphs with a bullet
> from the symbol font and lots of nbsps.
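
For anyone following along, the list repair described above might look
roughly like the Python sketch below. It is only an illustration under
assumptions: the input is already reduced to one plain-text string per
paragraph, and a list item is assumed to start with a bullet glyph (U+00B7 or
U+2022) followed by no-break spaces. The name rebuild_lists is made up for
this example, and real Word output would need more cases (numbering, nesting).

```python
import re
from html import escape
from typing import List

# Assumed marker: one bullet glyph, then one or more spaces, tabs or
# no-break spaces. Real Word exports may use other glyphs.
BULLET = re.compile("^[\u00b7\u2022][\u00a0 \t]+")


def rebuild_lists(paragraphs: List[str]) -> str:
    """Group consecutive bullet-like paragraphs into a single <ul>."""
    out = []
    in_list = False
    for text in paragraphs:
        match = BULLET.match(text)
        if match:
            if not in_list:
                out.append("<ul>")
                in_list = True
            # Drop the fake bullet and its padding, keep the item text.
            out.append(f"<li>{escape(text[match.end():].strip())}</li>")
        else:
            if in_list:
                out.append("</ul>")
                in_list = False
            out.append(f"<p>{escape(text)}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)
```

Consecutive bullet paragraphs end up in one list, and any ordinary paragraph
closes the list that precedes it.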

Off topic, I know, but couldn't a VBA macro hook into Word and provide an
"export as semantic HTML" option that exports the heading levels as h1..h6,
honours bold, italics and links, turns bullets and numbering into ul and ol,
and simply ignores all colours, font changes etc., so that there is nothing
to clean up?
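
To make the idea concrete, here is a rough sketch of the mapping, in Python
rather than VBA. It assumes the document has already been read into a simple
model; Run, Para and export_semantic_html are hypothetical names for this
example, not part of Word's object model. Colour and font are simply not in
the model, so there is nothing to strip afterwards.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Optional


@dataclass
class Run:
    """A span of text. Colour and font are deliberately not modelled."""
    text: str
    bold: bool = False
    italic: bool = False
    href: Optional[str] = None


@dataclass
class Para:
    """A paragraph; kind is 'heading', 'bullet', 'number' or 'body'."""
    kind: str
    runs: List[Run]
    level: int = 1  # heading level, only used when kind == 'heading'


def render_runs(runs: List[Run]) -> str:
    parts = []
    for run in runs:
        s = escape(run.text)
        if run.italic:
            s = f"<i>{s}</i>"
        if run.bold:
            s = f"<b>{s}</b>"
        if run.href:
            s = f'<a href="{escape(run.href, quote=True)}">{s}</a>'
        parts.append(s)
    return "".join(parts)


def export_semantic_html(paras: List[Para]) -> str:
    out = []
    open_list = None  # 'ul', 'ol' or None
    for para in paras:
        list_tag = {"bullet": "ul", "number": "ol"}.get(para.kind)
        # Close the current list when the list type changes or ends.
        if open_list and open_list != list_tag:
            out.append(f"</{open_list}>")
            open_list = None
        if list_tag and open_list is None:
            out.append(f"<{list_tag}>")
            open_list = list_tag
        body = render_runs(para.runs)
        if list_tag:
            out.append(f"<li>{body}</li>")
        elif para.kind == "heading":
            level = min(max(para.level, 1), 6)  # clamp to h1..h6
            out.append(f"<h{level}>{body}</h{level}>")
        else:
            out.append(f"<p>{body}</p>")
    if open_list:
        out.append(f"</{open_list}>")
    return "\n".join(out)
```

A real macro would have to fill this model from Word's own heading styles and
list formatting; that part is not shown here.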

bruce

Received on Friday, 1 May 2009 04:27:24 UTC