Anchor deletion problems and a possible solution

From: Christophe Pierret <christophe.pierret@businessobjects.com>
Date: Wed, 10 Jul 2002 12:46:41 +0200
Message-ID: <DDF8255AC466D311A14B0008C7EBBCCB04BE977C@exch-fra-lv03.businessobjects.com>
To: "'www-lib@w3.org'" <www-lib@w3.org>

Issue:
========
With the current implementation, some objects in libwww are never released:
HTHost, HTParentAnchor, and HTChildAnchor.
This can be a problem for a server that uses libwww to retrieve streams
over HTTP. Since the memory for those objects is never reclaimed, memory
usage may grow without limit, regardless of the current server load.

About anchors:
==============
Calling the current implementation of HTAnchor_delete() is not an option,
because it does not handle multiple references to the same anchor added
through HTLink_add() (the multiple-deletion problem).
For example, a link is created on an HTTP redirection.

I would like your comments on the following solution (i.e., did I miss
something?).

I have written an implementation of HTAnchor_delete() that uses a garbage
collection mechanism.
The principle is quite simple:
- HTAnchor_delete() marks an anchor as obsolete
- once every X calls to HTAnchor_delete() a garbage collection mechanism is
used to reclaim unused anchors.
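
As an illustration only, the principle above could be sketched as follows.
This is not the actual patch: MarkedAnchor, GC_INTERVAL, marked_anchor_delete()
and run_collector() are hypothetical names, and the real libwww anchor
structures are not used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a libwww anchor; not the real HTAnchor layout. */
typedef struct MarkedAnchor {
    bool obsolete;              /* set by the delete call, read by the collector */
} MarkedAnchor;

#define GC_INTERVAL 4           /* the "X" from the text; the value is arbitrary */

static unsigned delete_calls = 0;   /* delete calls since the last collection */
static unsigned collector_runs = 0; /* how many times the collector has run */

/* Placeholder for the real collection pass. */
static void run_collector(void) {
    collector_runs++;
}

/* Marks the anchor as obsolete instead of freeing it.  Every GC_INTERVAL
 * calls, a collection pass is triggered.  Returns true when this call
 * triggered a collection. */
bool marked_anchor_delete(MarkedAnchor *a) {
    assert(a != NULL);
    a->obsolete = true;
    if (++delete_calls < GC_INTERVAL)
        return false;
    delete_calls = 0;
    run_collector();
    return true;
}
```

With GC_INTERVAL set to 4, the first three calls only set the flag and the
fourth call also runs the collector.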

The garbage collection algorithm is as follows:
 - mark each anchor as not being used
 - for each parent anchor that is not obsolete, mark all links and children
as used, recursively
 - assemble a list of non-used anchors
 - delete anchors in non-used list 
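
A minimal sketch of these four steps, under simplifying assumptions: the
Anchor struct, MAX_EDGES, anchor_new(), anchor_link() and anchor_gc() are
hypothetical and do not match the real HTAnchor.c data structures. Links and
children are merged into a single fixed-size edge array, and all anchors are
assumed to be registered in one flat array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_EDGES 4                     /* arbitrary limit for the sketch */

/* Hypothetical simplified anchor; not the real libwww layout. */
typedef struct Anchor {
    bool is_parent;                     /* parent anchors are the GC roots */
    bool obsolete;                      /* set by the delete call */
    bool used;                          /* mark bit, reset on every pass */
    struct Anchor *edges[MAX_EDGES];    /* links and children together */
    size_t n_edges;
} Anchor;

Anchor *anchor_new(bool is_parent) {
    Anchor *a = calloc(1, sizeof *a);
    if (a != NULL)
        a->is_parent = is_parent;
    return a;
}

bool anchor_link(Anchor *from, Anchor *to) {
    assert(from != NULL && to != NULL);
    if (from->n_edges == MAX_EDGES)
        return false;
    from->edges[from->n_edges++] = to;
    return true;
}

/* Marks an anchor and everything reachable from it.  The early return on
 * an already-marked anchor keeps the recursion finite on cyclic links. */
static void mark_used(Anchor *a) {
    if (a == NULL || a->used)
        return;
    a->used = true;
    for (size_t i = 0; i < a->n_edges; i++)
        mark_used(a->edges[i]);
}

/* Runs one collection over all[0..n-1].  Survivors are compacted to the
 * front of the array; the number of survivors is returned. */
size_t anchor_gc(Anchor **all, size_t n) {
    Anchor **dead = malloc((n ? n : 1) * sizeof *dead);
    if (dead == NULL)
        return n;                       /* out of memory: skip this pass */

    for (size_t i = 0; i < n; i++)      /* step 1: clear all marks */
        all[i]->used = false;

    for (size_t i = 0; i < n; i++)      /* step 2: mark from live parents */
        if (all[i]->is_parent && !all[i]->obsolete)
            mark_used(all[i]);

    size_t live = 0, n_dead = 0;
    for (size_t i = 0; i < n; i++) {    /* step 3: assemble the unused list */
        if (all[i]->used)
            all[live++] = all[i];
        else
            dead[n_dead++] = all[i];
    }

    for (size_t i = 0; i < n_dead; i++) /* step 4: delete unused anchors */
        free(dead[i]);
    free(dead);
    return live;
}
```

In this sketch an obsolete anchor survives for as long as a non-obsolete
parent still reaches it. A surviving anchor never points at a freed one,
because everything reachable from a marked anchor is itself marked.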

Using this, I have added a call to HTAnchor_delete() in HTRedirectFilter()
to mark any redirected anchor as obsolete. The redirected anchor will be
garbage-collected only when its source anchor is garbage-collected.
In the final request termination handler, I call HTAnchor_delete() on the
source anchors once I know that no launched request still points to them (and
therefore the redirected anchors will also be deleted).

This mechanism guarantees the elimination of unused anchors, at the cost of
tracking the references to anchors held by the HTRequest objects you launch.
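
One possible way to do that bookkeeping is a per-anchor count of launched
requests. This is only a sketch of the idea: SourceAnchor, request_launched()
and request_terminated() are hypothetical names, and setting the obsolete
flag stands in for the real HTAnchor_delete() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping attached to a source anchor. */
typedef struct SourceAnchor {
    unsigned pending_requests;  /* launched requests still pointing here */
    bool obsolete;              /* stands in for the effect of HTAnchor_delete() */
} SourceAnchor;

/* Call when a request on this anchor is launched. */
void request_launched(SourceAnchor *a) {
    assert(a != NULL);
    a->pending_requests++;
}

/* Call from the final termination handler of a request.  Marks the
 * anchor obsolete once the last pending request has finished.
 * Returns true when the anchor was marked by this call. */
bool request_terminated(SourceAnchor *a) {
    assert(a != NULL && a->pending_requests > 0);
    if (--a->pending_requests > 0)
        return false;
    a->obsolete = true;
    return true;
}
```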

About hosts:
============
I still do not have a solution for limiting the number of HTHost objects.
And since I use a patch that creates a new host for each request (in order
to allow multiple simultaneous requests through the same HTTP proxy server), I
have to find one...

Has anyone already found a solution for the HTHost pseudo-memory-leak?

Christophe Pierret

Note: I can provide the HTAnchor_delete() code for inclusion in libwww.
      I have updates for HTAnchor.c, HTAncMan.h, and HTFilter.c. All of the
new code can be disabled by defining a macro, falling back to the old
behaviour.
Received on Wednesday, 10 July 2002 06:46:47 GMT
