- From: Bill Moseley <moseley@hank.org>
- Date: Mon, 30 Jul 2001 21:48:27 -0700
- To: www-lib@w3.org
Hello,

I'm trying to replace the HTML parser that's coded into the swish-e search engine. I've already replaced swish's built-in XML parser with James Clark's Expat library, which was perfect for our needs. Now I'm looking for something similar to Expat for simple HTML parsing.

For swish, we need to extract the text in the title, in the body, and in meta tags, and also know which text is inside <b> or <em>. I'm looking for something that is under the GPL, is quite portable, builds without much work, and is easy to embed in an application (as Expat was).

Will the HTML parser in www-lib work for me? If so, can anyone point to any examples using the code? I'll be parsing in-memory documents for the most part.

Thanks very much,

Bill Moseley
mailto:moseley@hank.org
Received on Tuesday, 31 July 2001 00:48:33 UTC