libwww: pending requests reference freed DNS entries

> Once the host name has been resolved into an IP-address, it
> is stored in the cache.  The entry stays in the cache until
> either an error occurs when connecting to the remote host or
> it is removed during garbage collection.


The WWW library's HTDNS.c module can free a DNS cache entry
while pending requests still point to it.


In the HTGetHostByName() function in HTDNS.c, the DNS cache is
searched for the hostname.  If the name is found but the entry is
older than the timeout period, the entry is deleted and a new
gethostbyname() call is issued later:


    /* Search the cache */
    {
        HTList *cur = list;
        while ((pres = (HTdns *) HTList_nextObject(cur))) {
            if (!strcmp(pres->hostname, host)) {
                if (time(NULL) > pres->ntime + DNSTimeout) {
                    if (PROT_TRACE)
                        HTTrace("HostByName.. Refreshing cache\n");
                    delete_object(list, pres);
                    pres = NULL;
                }
                break;
            }
        }
    }



However, the DNS entry may still be in use.  The delete_object()
call frees it, and the heap storage may subsequently be reused
by some other portion of the application.  Later, when a deferred
request (whose net->dns points to the freed entry) is serviced,
the net->dns information is garbage.


In real use this bug will probably show up as a very intermittent
core dump or strange app behavior after the program has been
running for more than 12 hours (the default DNS timeout value).
Every 12 hours a DNS entry may be flushed, and if the app is busy
servicing requests right then, the problem *may* occur.  We only
tracked this down because we had set the DNS timeout to zero while
looking at a completely different problem, and then noticed we
were getting core dumps in the web code on complex VRML models
(we are working on a VRML viewer).


The problem can be observed if you set DNSTimeout to zero or less
and then try to load a URL with lots of anchors.  The pending
requests get queued up holding references to the DNS entry, and
those references dangle once the entry is deleted.  If the freed
storage is then reused, the DNS layer gets very confused.



Two possible solutions occur to me.

1) Use a reference counting scheme, where the DNS entry is
   only deleted after the timeout *AND* when the reference
   count goes to zero.

2) Don't free the storage for the DNS entry, in case there are
   any outstanding references to it.


We will probably hack in fix #2 and accept an undesirable
memory leak in preference to a core dump.
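
A minimal sketch of what fix #2 might look like, written against a
hypothetical singly linked cache rather than the real HTList/HTdns
structures.  The names cache_node and cache_lookup and the stale
flag are assumptions made for illustration.  An expired entry is
unlinked from the cache so it is never handed out again, but its
storage is deliberately left allocated, so a pointer held by a
pending request stays valid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical cache node; the real library keeps HTdns objects
 * in an HTList. */
typedef struct cache_node {
    const char        *hostname;
    time_t             ntime;   /* time the entry was created */
    int                stale;   /* set once the entry has timed out */
    struct cache_node *next;
} cache_node;

/* Look up host in the cache.  An expired entry is unlinked and
 * marked stale but NOT freed (the intentional leak of fix #2).
 * Returns the live entry, or NULL if the caller must resolve the
 * name again. */
cache_node *cache_lookup(cache_node **head, const char *host,
                         time_t now, time_t timeout)
{
    cache_node **link = head;

    assert(head != NULL && host != NULL);
    while (*link) {
        cache_node *cur = *link;
        if (strcmp(cur->hostname, host) == 0) {
            if (now > cur->ntime + timeout) {
                *link = cur->next;   /* unlink from the cache */
                cur->next = NULL;
                cur->stale = 1;      /* storage is left allocated */
                return NULL;
            }
            return cur;
        }
        link = &cur->next;
    }
    return NULL;
}
```

The cost is one leaked entry per expiry, which with the default
12 hour timeout should stay small.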

The proper solution is likely to be solution #1.  It is not clear
to me just how to implement it, however.   Should the DNS layer
search the net layer for references before deleting the DNS entry?
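
One way to avoid searching the net layer might be to let each
request take a count on the entry when it stores the pointer and
drop it when it is done.  The sketch below assumes a hypothetical
dns_entry type and dns_* helpers; none of these names exist in
libwww.  The cache itself holds one reference, so expiring an
entry only gives up the cache's reference, and the storage is
freed by whoever releases last.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the cached DNS entry (not the real HTdns). */
typedef struct dns_entry {
    char  *hostname;
    time_t ntime;      /* time the entry was created */
    int    refcount;   /* one for the cache plus one per pending request */
    int    in_cache;   /* nonzero while the cache still lists the entry */
} dns_entry;

/* Create an entry owned by the cache (refcount starts at 1). */
dns_entry *dns_new(const char *host, time_t now)
{
    dns_entry *e = calloc(1, sizeof *e);
    if (!e) return NULL;
    e->hostname = malloc(strlen(host) + 1);
    if (!e->hostname) { free(e); return NULL; }
    strcpy(e->hostname, host);
    e->ntime = now;
    e->refcount = 1;
    e->in_cache = 1;
    return e;
}

/* A request takes a reference before storing the pointer. */
dns_entry *dns_acquire(dns_entry *e)
{
    e->refcount++;
    return e;
}

/* Drop one reference.  Returns 1 if the storage was freed. */
int dns_release(dns_entry *e)
{
    assert(e->refcount > 0);
    if (--e->refcount > 0) return 0;
    free(e->hostname);
    free(e);
    return 1;
}

/* Called when the entry has timed out: the cache gives up its own
 * reference.  The entry is freed here only if no request holds it. */
int dns_expire(dns_entry *e)
{
    if (!e->in_cache) return 0;
    e->in_cache = 0;
    return dns_release(e);
}
```

In HTGetHostByName() terms, the delete_object() call would become
something like dns_expire(), and the code that services or cancels
a request would call dns_release() on net->dns.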



Thanks for taking a look at this corner case problem!

Jim Andreas
Hewlett-Packard
andreas@cv.hp.com

Received on Tuesday, 2 July 1996 18:21:13 UTC