Content type of a response

I have written a program that crawls links.  I am only interested in
certain types of pages (namely HTML).  I have configured the libwww browser
to route other content types to the HTBlackHoleConverter.

Everything works, except that I still download the full body of many files
I don't want (such as .exe and .zip files).  Is there a way to configure
libwww so that it aborts the request as soon as it knows the content type
coming back is not one I want?

It seems the MIME parser and content mapper already know this, but I cannot
figure out how to intercept it and stop this waste of bandwidth.
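
To make the check I have in mind concrete, here is a rough sketch in plain
C.  It does not call into libwww at all; the function name
is_wanted_content_type and the hard-coded "text/html" rule are placeholders
of my own, and where such a test could be hooked into the library is
exactly the part I am missing.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of libwww.
 * Returns 1 if the Content-Type header value names text/html, else 0.
 * Leading blanks, letter case, and trailing parameters such as
 * "; charset=..." are ignored.
 */
int is_wanted_content_type(const char *value)
{
    static const char wanted[] = "text/html";
    const size_t n = sizeof wanted - 1;
    size_t i;

    if (value == NULL)
        return 0;

    /* Skip leading spaces and tabs. */
    while (*value == ' ' || *value == '\t')
        value++;

    /* Case-insensitive comparison; stops safely at a short value's NUL. */
    for (i = 0; i < n; i++) {
        if (tolower((unsigned char)value[i]) != wanted[i])
            return 0;
    }

    /* The media type must end here or be followed by parameters. */
    return value[n] == '\0' || value[n] == ';' ||
           value[n] == ' '  || value[n] == '\t';
}
```

Ideally this test would run as soon as the response headers have been
parsed, and the request would be aborted before any of the body is read
whenever it returns 0.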

Any suggestions would be greatly appreciated.

Oliver

Received on Friday, 28 September 2001 03:03:56 UTC