- From: Fred <fred@gloryofgod.com>
- Date: Fri, 10 Jan 2003 09:03:50 -0800 (PST)
- To: Erwin Rollauer <erwin.rollauer@mcgill.ca>
- cc: html-tidy@w3.org
Hi Erwin,

It turns out that Microsoft Word produces HTML that is substandard in
many ways. However, Tidy has some configuration options that may help
you out; take a look at these:

word-2000
http://tidy.sourceforge.net/docs/quickref.html#word-2000

force-output
http://tidy.sourceforge.net/docs/quickref.html#force-output

bare
http://tidy.sourceforge.net/docs/quickref.html#bare

Also look at Microsoft's own cleaning app:
http://office.microsoft.com/downloads/2000/Msohtmf2.aspx

I have been working on a custom app to convert Word output to XHTML,
and I am learning a lot about what it takes to clean up the junk and
leave behind the useful information.

Cheers,
Fred

On Fri, 10 Jan 2003, Erwin Rollauer wrote:

> I am currently evaluating UltraEdit and noticed the Tidy that came with
> it. I tried it against a simple Microsoft Word 2002 "save as html" file
> and got lots of errors. This is just a heads-up notice on the chance that
> you have not tried it against Microsoft-generated code.
>
> Erwin Rollauer
> Senior Systems Analyst
> Information Systems Resources
> McGill University
> 688 Sherbrooke St. West, Suite 500
> Montreal, QC H3A 3R1
> Tel: 514 398-5023 ex 00626
> Fax: 514 398-8252
> Email: erwin.rollauer@mcgill.ca
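For anyone who wants to try the three Tidy options named in the reply (word-2000, force-output, bare) together, here is a minimal sketch in Python. It assumes a `tidy` executable is on the PATH and that the installed build still accepts these option names. The function names `build_tidy_command` and `clean_word_html` and the file names are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess


def build_tidy_command(input_path, output_path, tidy_exe="tidy"):
    """Build a tidy command line that enables the three options from the reply.

    The helper name and the default executable name are illustrative
    assumptions, not part of Tidy itself.
    """
    return [
        tidy_exe,
        "--word-2000", "yes",     # strip Word 2000 specific markup
        "--bare", "yes",          # strip Microsoft specific HTML, plain spaces
        "--force-output", "yes",  # write output even when errors are reported
        "-o", output_path,        # write the cleaned document to this file
        input_path,
    ]


def clean_word_html(input_path, output_path):
    """Run tidy on a Word-exported HTML file; return (exit code, diagnostics)."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy executable not found on PATH")
    result = subprocess.run(
        build_tidy_command(input_path, output_path),
        capture_output=True,
        text=True,
    )
    # Tidy conventionally exits with 0 (clean), 1 (warnings) or 2 (errors),
    # so a non-zero code is not necessarily a failure to produce output.
    return result.returncode, result.stderr
```

A call such as `clean_word_html("word-export.html", "cleaned.html")` would then return Tidy's exit code and its diagnostic messages. Whether the result is clean enough still depends on the document, as the reply suggests.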
Received on Friday, 10 January 2003 12:04:22 UTC