
Re: Problem with Microsoft Word and Tidy

From: John Campbell <jdc.rpv@cox.net>
Date: Sat, 06 Aug 2005 03:22:31 -0700
Message-ID: <42F48F67.9080503@cox.net>
To: David Wilczynski <dwilczyn@usc.edu>
CC: html-tidy@w3.org

David Wilczynski wrote:

>
> Don't know if this is a bug, but it stops me from using Tidy. Microsoft 
> Word does many things to HTML documents, many of which Tidy does fix, 
> including changing "\" to "/" in URLs. However, in most of my tables, 
> Word insists on putting in the following:
>
> <![if !supportEmptyParas]>&nbsp;<![endif]><o:p></o:p>
>
> I don't know how it gets there. When I remove it, it puts it back. 
> Tidy calls these errors that must be fixed before it generates new 
> cleaned up output. That negates the value of Tidy.
>
> Suggestions?

I know what you mean.  What I ended up doing was to add the following 
line to my tidy.conf:

new-blocklevel-tags: st1:date, st1:city, st1:country-region, st1:place, 
st1:time, o:p, o:smarttagtype, st1:placename, st1:placetype, st1:street, 
st1:address, st1:state, st2:place, st2:placename, st2:placetype, 
st2:city, st2:street, st2:address, st2:time, st2:state, 
st2:country-region, quote, dt, dd

I then add new tags every time I hit another one...  I wish tidy would just 
strip the "st?:" and "o:" McTags when "word-2000: yes" is chosen...
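
Until something like that exists, one possible workaround is to pre-filter the 
Word output before handing it to Tidy. The following is only a rough sketch in 
Python, not anything Tidy itself provides: the function name strip_word_markup 
and both regular expressions are my own assumptions, and they only cover the 
"<![if ...]>" / "<![endif]>" markers and the "o:" and "st?:" tags discussed in 
this thread. Text inside the stripped tags is kept.

```python
import re

# Word's bare conditional markers, e.g. <![if !supportEmptyParas]> and <![endif]>.
# Comment-wrapped forms such as <!--[if gte mso 9]> ... <![endif]--> do not match.
CONDITIONAL = re.compile(r"<!\[(?:if\b[^\]]*|endif)\]>", re.IGNORECASE)

# Opening, closing, or self-closing tags with an o: or st<digits>: prefix,
# e.g. <o:p>, </o:p>, <st1:city w:st="on">.
WORD_TAG = re.compile(r"</?(?:o|st\d+):[\w-]+(?:\s[^<>]*)?/?>", re.IGNORECASE)


def strip_word_markup(html: str) -> str:
    """Remove Word conditional markers and o:/st?: tags, keeping their content."""
    html = CONDITIONAL.sub("", html)
    return WORD_TAG.sub("", html)


if __name__ == "__main__":
    sample = "<td><![if !supportEmptyParas]>&nbsp;<![endif]><o:p></o:p></td>"
    print(strip_word_markup(sample))
```

Run the saved Word HTML through this first, then pass the result to Tidy with 
the usual tidy.conf. Regular expressions are a blunt tool for HTML, so treat 
this as a stopgap for these specific constructs only.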

I also wish there was a "strip javascript" option... and maybe a "strip 
everything except the following..." option.

Received on Saturday, 6 August 2005 10:22:41 GMT
