- From: <patricka@mkdoc.com>
- Date: Fri, 07 Nov 2003 09:37:19 +0000
- To: html-tidy@w3.org
Cristian Balan writes:
> I been using Tidy to clean Word 2000 documents and get them ready for the
> Web.
> Tidy seems to be doing a great job, the only tags that are left that I still
> want to get rid of are the class attributes:
>
> <body class='c10'>
> <div class="Section1">
>
> <li class="c4">
>
> How can I do this either in the UI for Win32 or command line Tidy?

i don't think this is possible[1]. :(

try either:

- textism's word html cleaner[2], or
- roll your own perl solution with MKDoc::XML::Stripper[3]

warning: the perl solution requires xml input, so you'll need to run it
through tidy first with the output-xhtml option (if you're throwing it html).

hth,
- p

1. http://tidy.sourceforge.net/docs/quickref.html
2. http://www.textism.com/resources/cleanwordhtml/
3. http://search.cpan.org/~jhiver/MKDoc-XML/lib/MKDoc/XML/Stripper.pm
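for illustration only, here is a rough sketch of the "roll your own" idea. it does not use MKDoc::XML::Stripper; it swaps in python's standard xml.etree.ElementTree instead. like the perl route, it assumes the input is already well-formed xhtml (for example, tidy output produced with output-xhtml) and that it contains no named entities such as &nbsp;, which this parser would reject. the function name strip_class_attributes and the sample markup are made up for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def strip_class_attributes(xhtml: str) -> str:
    """Return the document with every class attribute removed.

    Hypothetical helper: assumes well-formed XHTML without named entities.
    """
    # Keep XHTML as the default namespace so the output has no ns0: prefixes.
    ET.register_namespace("", XHTML_NS)
    root = ET.fromstring(xhtml)
    # iter() walks the root element and all of its descendants.
    for element in root.iter():
        element.attrib.pop("class", None)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample markup modelled on the elements quoted in the question.
    sample = (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        '<body class="c10"><div class="Section1">'
        '<ul><li class="c4">item</li></ul></div></body></html>'
    )
    print(strip_class_attributes(sample))
```

the element tree is rebuilt on output, so a doctype or the exact whitespace of the original file is not guaranteed to survive; treat it as a starting point rather than a finished cleaner.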
Received on Friday, 7 November 2003 04:37:29 UTC