- From: Joe English <jenglish@crl.com>
- Date: Wed, 26 Apr 1995 10:53:56 -0700
- To: Multiple recipients of list <www-html@www10.w3.org>
rmesa@best.com (Robert A. Mesa) wrote:

> Is there a utility to strip away HTML tags? Yes I know, WHY? I've been tasked
> to do such a thing at work. Any info would be greatly appreciated.

sgmls and sgmlsasp with an empty replacement file will do the trick:

    sgmls html.decl YourFile.html | sgmlsasp /dev/null > YourFile.txt

This assumes that YourFile.html is valid HTML, of course...

The output will be the text portions of YourFile.html, with
references expanded and all other markup removed.

If you're on a DOS system, substitute any empty file for /dev/null;
I don't know about other systems.

--Joe English

  jenglish@crl.com
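
For anyone without sgmls at hand, the same idea (keep the text, expand
references, drop all other markup) can be roughly sketched in Python using
the standard library's html.parser module. This is a swapped-in technique,
not the sgmls pipeline: it does not validate against html.decl or any DTD,
and the names TextOnly and strip_tags are made up for illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data; ignore tags, comments, and declarations."""

    def __init__(self):
        # convert_charrefs=True makes the parser expand references such as
        # &amp; or &#169; before the text reaches handle_data().
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between pieces of markup.
        self.chunks.append(data)


def strip_tags(html):
    """Return the text portions of an HTML string (hypothetical helper)."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)
```

Calling strip_tags() on the contents of YourFile.html and writing the result
to YourFile.txt would approximate the command line above. Unlike sgmls, this
sketch accepts invalid HTML without complaint.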
Received on Wednesday, 26 April 1995 13:55:57 UTC