Hello,

I'm trying to replace the HTML parser that's coded into the swish-e search engine. I've replaced swish's built-in XML parser with James Clark's Expat library -- it was perfect for our needs. So, now I'm looking for something similar to Expat for simple HTML parsing.

For swish, we need to extract text in a title, in the body, and in meta tags -- and also know what text is <b> or <em>. Something that is under GPL, quite portable and builds without much work, and easy to embed in an application (as Expat was).

Will the HTML parser in www-lib work for me? If so, can anyone point to any examples using the code? I'll be parsing in-memory documents for the most part.

Thanks very much,

Bill Moseley
mailto:moseley@hank.org

Received on Tuesday, 31 July 2001 00:48:33 GMT