Content type of a response from Oliver King-Smith on 2001-09-28 (www-lib@w3.org from July to September 2001)

From: Oliver King-Smith <oliverks@tescina.com>
Date: Fri, 28 Sep 2001 00:10:19 -0700
To: www-lib@w3.org
Message-Id: <20010928070350.OQMJ24406.femail42.sdc1.sfba.home.com@c940959-b>

I have written a program that crawls links.  I am only interested in
certain types of pages (namely HTML).  I have configured the libwww browser
to route other content types to the HTBlackHoleConverter.

Everything works fine, except that I still download lots of stuff I
don't want (like .exe and .zip files).  Is there a way to configure libwww
so it stops as soon as it knows the content type coming back is not one I
want?

It seems the MIME parser and content mapper know this, but I cannot figure
out how to intercept it and stop this waste of bandwidth.
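
To make the question concrete, here is a rough sketch of the check I would
like to run as soon as the response headers have been parsed.  It is plain
C and does not use any libwww API; the function name is_html_content_type
is made up for illustration, and where such a check could be hooked into
libwww so that the transfer is aborted is exactly the part I am missing.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of libwww.
 * Returns 1 if a Content-Type header value names text/html, else 0.
 * Parameters such as "; charset=..." are ignored and the comparison
 * is case-insensitive. */
static int is_html_content_type(const char *value)
{
    static const char wanted[] = "text/html";
    size_t len = 0, i;

    if (value == NULL)
        return 0;

    /* Skip leading whitespace. */
    while (*value == ' ' || *value == '\t')
        value++;

    /* The media type ends at ';', whitespace, or end of string. */
    while (value[len] != '\0' && value[len] != ';' &&
           value[len] != ' ' && value[len] != '\t')
        len++;

    if (len != strlen(wanted))
        return 0;

    for (i = 0; i < len; i++) {
        if (tolower((unsigned char) value[i]) != wanted[i])
            return 0;
    }
    return 1;
}
```

With something like this, a response of type "text/html; charset=iso-8859-1"
would be kept, while "application/zip" or "application/octet-stream" would
be a signal to stop reading the body.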

Any suggestions would be greatly appreciated.

Oliver

Received on Friday, 28 September 2001 03:03:56 UTC