Re: HTML Strippers

rmesa@best.com (Robert A. Mesa) wrote:

> Is there a utility to strip away HTML tags? Yes, I know: WHY? I've been
> tasked to do such a thing at work. Any info would be greatly appreciated.

sgmls, piped into sgmlsasp with an empty replacement file,
will do the trick:

	sgmls html.decl YourFile.html | sgmlsasp /dev/null > YourFile.txt

This assumes that YourFile.html is valid HTML, of course...

The output will be the text portions of YourFile.html,
with entity references expanded and all other markup removed.

If you're on a DOS system, substitute any empty file for /dev/null;
I don't know about other systems.
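
If sgmls isn't available at all, a rough approximation of the same result
(text kept, references expanded, tags dropped) can be sketched with Python's
standard html.parser module. This is a different technique from the sgmls
pipeline above: it does not validate against a DTD, and it passes the contents
of script and style elements through as text. The names TextOnly and
strip_markup are made up for this illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data; tags, comments, and declarations are dropped."""

    def __init__(self):
        # convert_charrefs=True makes the parser expand references such as
        # &amp; or &#233; before the text reaches handle_data().
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between pieces of markup.
        self.chunks.append(data)


def strip_markup(html_text):
    """Return the text content of html_text with all markup removed."""
    parser = TextOnly()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered at end of input
    return "".join(parser.chunks)
```

Calling strip_markup on the contents of YourFile.html and writing the
returned string to YourFile.txt should give roughly what the pipeline
produces, although whitespace and error handling will differ.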


--Joe English

  jenglish@crl.com

Received on Wednesday, 26 April 1995 13:55:57 UTC