Only fetch certain formats from Paul Law on 1996-05-22 (www-lib@w3.org from April to June 1996)

From: Paul Law <paul-law@ix.netcom.com>
Date: Wed, 22 May 1996 11:46:29 +0000
To: www-lib@w3.org
Cc: paul-law@ix.netcom.com
Message-Id: <31A2FE95.304D@ix.netcom.com>

Hello all,

I'm writing a client that only needs to fetch text or HTML data,
but I'm not sure of the best way to screen out all the other
formats I don't need. Here's my first cut at it:

1. For every URL the client gets, examine the file extension.
   If the extension is recognized as a format the client
   doesn't want, ignore the URL.
2. If the URL passes the first test, go ahead and fetch the HEAD.
3. Look at the Format info inside the anchor when the HEAD is
   loaded. If the format is not one the client wants, stop.
4. Otherwise, go ahead and GET the doc.
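
The four steps above can be sketched roughly as follows. This is only an illustration in Python, not libwww code: the names `UNWANTED_EXTENSIONS`, `WANTED_TYPES`, `passes_extension_check`, `passes_format_check` and `fetch_if_wanted` are hypothetical, the extension list is an arbitrary example, and the `head` and `get` callables are stand-ins for whatever the client uses to issue the real requests.

```python
import posixpath
from urllib.parse import urlparse

# Assumed example values; a real client would choose its own lists.
UNWANTED_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".zip", ".ps", ".exe"}
WANTED_TYPES = {"text/plain", "text/html"}


def passes_extension_check(url):
    """Step 1: reject only when the extension is a known unwanted one."""
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    # No extension (e.g. a directory URL) proves nothing, so let it through.
    return extension not in UNWANTED_EXTENSIONS


def passes_format_check(content_type):
    """Step 3: compare the reported media type, ignoring any parameters."""
    if not content_type:
        return False
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in WANTED_TYPES


def fetch_if_wanted(url, head, get):
    """Run steps 1-4; return the document body, or None if screened out.

    head(url) is assumed to return the Content-Type string (or None),
    and get(url) is assumed to return the document body.
    """
    if not passes_extension_check(url):      # step 1
        return None
    content_type = head(url)                 # step 2
    if not passes_format_check(content_type):  # step 3
        return None
    return get(url)                          # step 4
```

In this sketch a URL whose type cannot be determined from the HEAD is rejected; a client that must handle schemes without a header-only request could instead fall back on the extension check alone.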

I'm looking for an algorithm that will work smoothly on FTP
and directory listings as well.

Am I missing anything here? Is there a better way to do it?
Thanks for any suggestions.

Paul Law
paul-law@ix.netcom.com

Received on Wednesday, 22 May 1996 14:44:42 UTC