broken DNS lookups by many web browsers?

It looks like many web browsers cache the IP addresses of the servers they
contact indefinitely and pay no attention to the TTLs.  Shouldn't caching
be the job of a full-featured local caching DNS server?  This behavior is
currently compromising the load balancing and service replication abilities
of the "lbnamed" utility, which hands out short-TTL DNS records (we're
currently using 10 secs) from a pool of servers based on their availability
and load.  Docs on it can be found at

http://www-leland.stanford.edu/~schemers/docs/lbnamed/lbnamed.html

(which also happens to be the server that we're load balancing currently)

Clients are getting the IP for one server in our pool (of 2) and then just
keep using it exclusively.  If a machine crashes or its load goes too high,
we stop handing out its IP address, and within 5 minutes no clients should
be contacting it at all.  Instead, clients keep using their cached IP and
either wait on the loaded server or fail with "server not responding..."
even if the other one(s) in the pool are fine.

The libwww2 implementation caches the most recent mapping, Netscape appears 
to cache around 8 mappings (all versions, and unaffected by any of the 
reload/flush cache options), and the libwww4 implementation is somewhat 
better in that it'll at least issue the DNS query again if it fails to 
connect to its cached IP.  That makes the replication work but still 
compromises the load balancing.
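
To make the libwww4-style behavior described above concrete, here is a
minimal Python sketch.  It is not taken from libwww; the class name
StickyResolver and the resolve/connect callables are hypothetical stand-ins
for a real DNS lookup and a real TCP connect.  It shows why a dead server
is eventually abandoned while a merely overloaded one never is.

```python
class StickyResolver:
    """Cache one address per host indefinitely; re-resolve only on failure."""

    def __init__(self, resolve, connect):
        self.resolve = resolve  # hypothetical: host -> IP string
        self.connect = connect  # hypothetical: IP -> connection, raises OSError
        self.cache = {}         # host -> last IP handed back by resolve()

    def open(self, host):
        ip = self.cache.get(host)
        if ip is not None:
            try:
                # A slow but reachable server still succeeds here, so the
                # client stays pinned to it regardless of its load.
                return self.connect(ip)
            except OSError:
                # The cached address is dead: forget it and ask DNS again.
                del self.cache[host]
        ip = self.resolve(host)
        self.cache[host] = ip
        return self.connect(ip)
```

A cached address is only dropped when the connect fails outright, which
matches the observation that replication works but load balancing does not.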

Is this actually incorrect behavior by the clients?  If caching IPs is too
big a win for clients to give up, can we at least make them sensitive to
TTLs?  Or, at the very least, do what libwww4 does and query again on failure?
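
For what it's worth, a TTL-sensitive client cache does not have to be
complicated.  The following is only a sketch under assumptions: the class
name TTLCache is made up, and resolve is a hypothetical callable that
returns an (ip, ttl_seconds) pair, which a real client would have to get
from a resolver interface that exposes the record's TTL.

```python
import time


class TTLCache:
    """Cache host -> IP mappings, but only for as long as the TTL allows."""

    def __init__(self, resolve, clock=time.monotonic):
        self.resolve = resolve  # hypothetical: host -> (ip, ttl_seconds)
        self.clock = clock      # injectable so expiry can be exercised in tests
        self.entries = {}       # host -> (ip, expiry time on self.clock)

    def lookup(self, host):
        now = self.clock()
        entry = self.entries.get(host)
        if entry is not None and now < entry[1]:
            return entry[0]     # still within the TTL: reuse the cached IP
        # Missing or expired: ask again and remember the new expiry time.
        ip, ttl = self.resolve(host)
        self.entries[host] = (ip, now + ttl)
        return ip
```

With a 10 second TTL, a client built this way would go back to the name
server at most about once every 10 seconds per host, and would pick up a
change in the pool at the next expiry.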

thanks,
Jeff Lewis
Distributed Computing Group
Stanford University

Received on Thursday, 7 March 1996 18:15:36 UTC