
RE: eliminate <script> tags

From: Sebastian Lange <lange@cyperfection.de>
Date: Wed, 23 Aug 2000 09:19:35 +0200
To: <html-tidy@w3.org>
In Perl5 [http://www.cpan.org/ports/],

$foo =~ s/<SCRIPT[^>]*>.*?<\/SCRIPT[^>]*>//gsi;
... would strip all script tags (including content),

$foo =~ s/(<SCRIPT[^>]*>).*?(<\/SCRIPT[^>]*>)/$1$2/gsi;
... should remove the content of all script tags but leave the tags untouched.
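
To try both substitutions on a script element that spans several lines, here is a minimal sketch. The sub names strip_scripts and empty_scripts and the sample markup are made up for illustration; only the two regexes come from above. The /s modifier is what lets .*? run across line breaks, and /i makes the match case-insensitive. This is pattern matching, not parsing, so unusual markup (for instance a ">" inside an attribute value) may not be handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove script elements together with their content.
sub strip_scripts {
    my ($html) = @_;
    $html =~ s/<SCRIPT[^>]*>.*?<\/SCRIPT[^>]*>//gsi;
    return $html;
}

# Hypothetical helper: keep the script tags but drop what is between them.
sub empty_scripts {
    my ($html) = @_;
    $html =~ s/(<SCRIPT[^>]*>).*?(<\/SCRIPT[^>]*>)/$1$2/gsi;
    return $html;
}

# Made-up sample input; the script body spans lines, so /s matters here.
my $sample = <<'HTML';
<p>before</p>
<script type="text/javascript">
  alert("hi");
</script>
<p>after</p>
HTML

print strip_scripts($sample);
print "----\n";
print empty_scripts($sample);
```

The same substitution should also work as a filter on whatever Tidy writes out, since it only needs the whole document in one string.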



At 00:50 23.08.2000 -0400, Jelks Cabaniss wrote:
>Vipul Veera wrote:
> > Where do i modify tidy so that the program strips everything between
> > <script> and </script>
>Tidy doesn't do that.  You'll have to do some pre- or post-processing.
>It might be nice if Tidy had an option like ...
>         delete-elements: marquee, blink
>but it opens up a can of worms.  For example, if you say ...
>         delete-elements: p
>does it delete the *contents* of all the paragraphs, or just "strip the
>tags"?  (In the script example, you wanted the contents removed, but that
>wouldn't necessarily always be the case).  And to make it useful, you'd also
>have to be able to select by attribute ...
>         delete-elements: div[class="foo"], span[class="bar"]
>You can see that adding this option wouldn't quite be a 15-minute hack. :)
>You can get what you want today by just piping Tidy's output to an XSLT
>stylesheet, a Perl script, etc.
>PS.  A "transform-elements" option would be awful nice too:
>         transform-elements: p[class="foo"]/p[class="bar"], ...
>Would save writing an awful lot of XSLT stylesheets, scripts, etc. ...

Sebastian Lange
Maybe the first chat site that validates as HTML
4.0 even though user input may contain HTML codes.

Courtesy to Dave Raggett's HTML Tidy:

Tidy your documents ONLINE:
Received on Wednesday, 23 August 2000 03:22:43 UTC
