- From: Ben Noblet <ben@lateralsystems.com.au>
- Date: Sat, 8 Nov 2003 00:38:00 +1100
- To: <html-tidy@w3.org>
A roll-your-own solution using regular expressions could be something as simple as this (example in JavaScript):

    function stripClass(content) {
        // Capture the tag up to " class=", skip the class value (everything up to
        // the next space or ">"), then capture the rest of the tag.
        // Assumes the class value itself contains no spaces.
        var oReg = new RegExp("(<[^>]+) class=[^ >]*([^>]*>)", "ig");
        return content.replace(oReg, "$1$2");
    }

    var content = stripClass('This is some HTML code <p align="center" class="Rubbish">Text</p>');

Cheers
Ben

> -----Original Message-----
> From: html-tidy-request@w3.org
> [mailto:html-tidy-request@w3.org] On Behalf Of patricka@mkdoc.com
> Sent: Friday, 7 November 2003 8:37 PM
> To: html-tidy@w3.org
> Subject: Re: Stipping classes from HTML
>
>
> Cristian Balan writes:
>
> > I been using Tidy to clean Word 2000 documents and get them ready for the
> > Web.
> > Tidy seems to be doing a great job, the only tags that are left that I still
> > want to get rid of are the class attributes:
> >
> > <body class='c10'>
> > <div class="Section1">
> >
> > <li class="c4">
> >
> > How can I do this either in the UI for Win32 or command line Tidy?
>
> i don't think this is possible[1]. :(
>
> try either:
>
> - textism's word html cleaner[2], or
> - roll your own perl solution with MKDoc::XML::Stripper[3]
>
> warning: the perl solution requires xml input, so you'll need to run it
> through tidy first with the output-xhtml option (if you're throwing it
> html).
>
> hth,
>
> - p
>
> 1. http://tidy.sourceforge.net/docs/quickref.html
> 2. http://www.textism.com/resources/cleanwordhtml/
> 3. http://search.cpan.org/~jhiver/MKDoc-XML/lib/MKDoc/XML/Stripper.pm
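
The tags in the quoted message use both single and double quotes, and a class value may in general hold several space-separated names. A minimal sketch of a variant that copes with those cases, assuming regex-based stripping is acceptable for the input at hand. The function name stripClassAttr and the sample string are illustrative and not from the thread:

```javascript
// Hypothetical variant; stripClassAttr is an illustrative name.
function stripClassAttr(html) {
    // Group 1: start of the tag, up to the whitespace before "class".
    // The class value may be double-quoted, single-quoted, or unquoted.
    // Group 2: whatever remains of the tag, including the closing ">".
    var re = /(<[^>]+?)\s+class\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)([^>]*>)/gi;
    return html.replace(re, "$1$2");
}

// Made-up sample covering the three quoting styles.
var sample = "<body class='c10'><div class=\"Section1 Wide\">" +
             "<li class=c4 id=\"x\">Text</li></div></body>";
var cleaned = stripClassAttr(sample);
console.log(cleaned);
// prints: <body><div><li id="x">Text</li></div></body>
```

Like any regex approach, this can misfire when the text " class=" appears inside another attribute's value, so a real parser is the safer route for untrusted markup.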
Received on Friday, 7 November 2003 08:33:30 UTC