- From: Sebastian Lange <lange@cyperfection.de>
- Date: Wed, 23 Aug 2000 09:19:35 +0200
- To: <html-tidy@w3.org>
In Perl5 [http://www.cpan.org/ports/],

    $foo =~ s/<SCRIPT[^>]*>.*?<\/SCRIPT[^>]*>//gsi;

... would strip all script tags (including content),

    $foo =~ s/(<SCRIPT[^>]*>).*?(<\/SCRIPT[^>]*>)/$1$2/gsi;

... should remove the content of all script tags but leave the tags untouched.

cheers,
sebastian

At 00:50 23.08.2000 -0400, Jelks Cabaniss wrote:
>Vipul Veera wrote:
>
> > Where do i modify tidy so that the program strips everything between
> > <script> and </script>
>
>Tidy doesn't do that. You'll have to do some pre or post-processing
>yourself.
>
>It might be nice if Tidy had an option like ...
>
>    delete-elements: marquee, blink
>
>but it opens up a can of worms. For example, if you say ...
>
>    delete-elements: p
>
>does it delete the *contents* of all the paragraphs, or just "strip the
>tags"? (In the script example, you wanted the contents removed, but that
>wouldn't necessarily always be the case). And to make it useful, you'd also
>have to be able to select by attribute ...
>
>    delete-elements: div[class="foo"], span[class="bar"]
>
>You can see that adding this option wouldn't quite be a 15-minute hack. :)
>You can get what you want today by just piping Tidy's output to an XSLT
>stylesheet, a Perl script, etc.
>
>PS. A "transform-elements" option would be awful nice too:
>
>    transform-elements: p[class="foo"]/p[class="bar"], ...
>
>Would save writing an awful lot of XSLT stylesheets, scripts, etc. ...
>
>
>/Jelks

--
Sebastian Lange
http://www.sl-chat.de/
Maybe the first chat site that validates as HTML 4.0 even though user
input may contain HTML codes.
Courtesy to Dave Raggett's HTML Tidy: http://www.w3.org/People/Raggett/tidy/
Tidy your documents ONLINE: http://www.sl-chat.de/Tidy/
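
For anyone who wants to try the two substitutions from this message, here is a minimal sketch that wraps each in a small sub and runs it on a sample string. The sub names and the sample input are made up for illustration; only the two regexes come from the message. The /s modifier lets `.` match newlines, so multi-line scripts are covered, and /i makes the tag match case-insensitive.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper names; the regexes are the two quoted in the message.

# Remove every <script> element entirely: tags and content.
sub strip_script_elements {
    my ($html) = @_;
    $html =~ s/<SCRIPT[^>]*>.*?<\/SCRIPT[^>]*>//gsi;
    return $html;
}

# Keep the <script> start and end tags but drop what is between them.
sub empty_script_elements {
    my ($html) = @_;
    $html =~ s/(<SCRIPT[^>]*>).*?(<\/SCRIPT[^>]*>)/$1$2/gsi;
    return $html;
}

# Made-up sample input with a multi-line script element.
my $sample = qq{<p>Hi</p><script type="text/javascript">\nalert("x");\n</script><p>Bye</p>};

print strip_script_elements($sample), "\n";
print empty_script_elements($sample), "\n";
```

This is pattern matching, not HTML parsing: a `>` inside an attribute value of the script start tag, for example, would end the `[^>]*` match early. For input that Tidy has already cleaned up, that is probably an acceptable trade-off for a post-processing step.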
Received on Wednesday, 23 August 2000 03:22:43 UTC