Re: Microsoft Word from Office 2000 `HTML' fails to validate

> > Surely this is a FAQ, but I've just found that the `HTML' output of
> > Microsoft Word doesn't validate with either the W3C or WDG validators:
> 
> You are correct.  Microsoft Word does not output valid HTML, nor does
> any Microsoft product of which I am aware.
> 
> There used to be a program called the "demoronizer" which would clean up
> MSHTML to create something approximating valid HTML, but I don't know if
> it has kept up with recent versions of MS Office.  The best way to get
> valid HTML from MS Word files is to save as plain text (ASCII or
> Unicode) and add the markup by hand.

You could also try the superb HTML Tidy program, which has a 'word-2000'
option for stripping out all the extraneous rubbish put in by MS.

The program is described at:

	http://www.w3.org/People/Raggett/tidy/
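
As a rough sketch of how the 'word-2000' option might be driven from a
script: this assumes a reasonably recent Tidy build installed as `tidy` on
the PATH, one that accepts configuration options on the command line in the
form `--name value`. The function names and file names are made up for
illustration and are not part of Tidy itself.

```python
import shutil
import subprocess


def build_tidy_command(src, dest, tidy="tidy"):
    """Return the argument list for cleaning a Word-exported HTML file.

    Assumes a Tidy build that accepts `--word-2000 yes` and `-o FILE`
    on the command line.
    """
    return [
        tidy,
        "--word-2000", "yes",  # ask Tidy to strip Word 2000 specific markup
        "-o", dest,            # write the cleaned document to this file
        src,                   # the HTML file saved from Word
    ]


def clean_word_html(src, dest):
    """Run Tidy on src and return its exit status.

    Tidy conventionally exits with 0 for success, 1 when it reported
    warnings, and 2 when it reported errors, so a status of 1 is not
    treated as a failure here.
    """
    exe = shutil.which("tidy")
    if exe is None:
        raise FileNotFoundError("tidy was not found on the PATH")
    result = subprocess.run(
        build_tidy_command(src, dest, exe),
        capture_output=True,
        text=True,
    )
    return result.returncode


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(build_tidy_command("report.html", "report-clean.html"))
```

The cleaned file is still worth running through a validator afterwards,
since Tidy only removes what it recognises.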


TTFN,

   Philip Riebold                                /"\
   Media Resources                               \ /
   University College London                      X  ASCII Ribbon Campaign
   Windeyer Building, 46 Cleveland Street        / \ Against HTML Mail
   London, W1T 4JF
   +44 (0)20 7580 9872
   http://www.ucl.ac.uk/mediares/vconf

Received on Thursday, 23 May 2002 06:09:46 UTC