RE: eliminate <script> tags

In Perl5 [http://www.cpan.org/ports/],

$foo =~ s/<SCRIPT[^>]*>.*?<\/SCRIPT[^>]*>//gsi;
... would strip all script elements (tags and content alike), while

$foo =~ s/(<SCRIPT[^>]*>).*?(<\/SCRIPT[^>]*>)/$1$2/gsi;
... should remove the content of all script tags but leave the tags untouched.
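
To try both substitutions side by side, here is a minimal sketch. The sub names strip_script_elements and empty_script_elements and the sample $html string are made up for illustration; they are not part of Tidy or of any module. Note that this is a regex approach, not a real HTML parser, so it will likely misbehave if a script body contains the literal text "</script>" inside a string, or if an attribute value contains ">".

```perl
use strict;
use warnings;

# Remove each script element entirely: opening tag, body, and closing tag.
# /g = all occurrences, /s = let "." match newlines, /i = ignore case.
sub strip_script_elements {
    my ($text) = @_;
    $text =~ s/<SCRIPT[^>]*>.*?<\/SCRIPT[^>]*>//gsi;
    return $text;
}

# Keep the opening and closing tags (captured as $1 and $2) but drop the body.
sub empty_script_elements {
    my ($text) = @_;
    $text =~ s/(<SCRIPT[^>]*>).*?(<\/SCRIPT[^>]*>)/$1$2/gsi;
    return $text;
}

# Hypothetical sample input, only for demonstration.
my $html = join "\n",
    '<p>before</p>',
    '<script type="text/javascript">',
    'alert("hi");',
    '</script>',
    '<p>after</p>';

print strip_script_elements($html), "\n";
print "----\n";
print empty_script_elements($html), "\n";
```

The non-greedy .*? matters here: with a greedy .* and two script elements in the same document, everything between the first opening tag and the last closing tag would be swallowed.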


cheers,

sebastian

At 00:50 23.08.2000 -0400, Jelks Cabaniss wrote:
>Vipul Veera wrote:
>
> > Where do I modify tidy so that the program strips everything between
> > <script> and </script>
>
>Tidy doesn't do that.  You'll have to do some pre- or post-processing
>yourself.
>
>It might be nice if Tidy had an option like ...
>
>         delete-elements: marquee, blink
>
>but it opens up a can of worms.  For example, if you say ...
>
>         delete-elements: p
>
>does it delete the *contents* of all the paragraphs, or just "strip the
>tags"?  (In the script example, you wanted the contents removed, but that
>wouldn't necessarily always be the case).  And to make it useful, you'd also
>have to be able to select by attribute ...
>
>         delete-elements: div[class="foo"], span[class="bar"]
>
>You can see that adding this option wouldn't quite be a 15-minute hack. :)
>You can get what you want today by just piping Tidy's output to an XSLT
>stylesheet, a Perl script, etc.
>
>PS.  A "transform-elements" option would be awfully nice too:
>
>         transform-elements: p[class="foo"]/p[class="bar"], ...
>
>Would save writing an awful lot of XSLT stylesheets, scripts, etc. ...
>
>
>/Jelks

--
Sebastian Lange
http://www.sl-chat.de/
Maybe the first chat site that validates as HTML
4.0 even though user input may contain HTML codes.

Courtesy to Dave Raggett's HTML Tidy:
http://www.w3.org/People/Raggett/tidy/

Tidy your documents ONLINE:
http://www.sl-chat.de/Tidy/

Received on Wednesday, 23 August 2000 03:22:43 UTC