- From: Thanasis Kinias <tkinias@optimalco.com>
- Date: Thu, 23 May 2002 06:56:59 -0700
- To: Philip Riebold <philip@livenet.ac.uk>
- Cc: www-validator@w3.org
scripsit Philip Riebold:
> > > Surely this is a FAQ, but I've just found that the `HTML' output of
> > > Microsoft Word doesn't validate with either the W3C or WDG validators:
> >
> > You are correct. Microsoft Word does not output valid HTML, nor does
> > any Microsoft product of which I am aware.
> >
> > There used to be a program called the "demoronizer" which would clean up
> > MSHTML to create something approximating valid HTML, but I don't know if
> > it has kept up with recent versions of MS Office. The best way to get
> > valid HTML from MS Word files is to save as plain text (ASCII or
> > Unicode) and add the markup by hand.
>
> You could also try the superb HTML Tidy program which has a 'word-2000'
> option for stripping out all the extraneous rubbish put in by MS.
>
> The program is described at,
>
> http://www.w3.org/People/Raggett/tidy/

Tidy is apparently now a SourceForge project [1]. Raggett's page
directs you to SourceForge now (I believe this is quite new). Anyway,
the latest on Tidy will be found there.

References

1. <http://tidy.sourceforge.net>

--
Thanasis Kinias
Web Developer, Information Technology
Graduate Student, Department of History
Arizona State University
Tempe, Arizona, U.S.A.

Ash nazg durbatulūk, ash nazg gimbatul,
Ash nazg thrakatulūk agh burzum-ishi krimpatul
Received on Thursday, 23 May 2002 09:58:15 UTC