
RE: Cache Question



Hi!

>     I have recently started to incorporate the caching mechanism 
> provided by W3lib and am facing problems. In my code, I created new 
> request for each in-line image but these don't seem to terminate 
> (even though I've verified the images are completely downloaded) when
> I enable caching.
> 
>     After tracing through the w3lib code, I realise that the status 
> code returned by the MIME stream to the Tee stream could be the 
> reason. The Tee stream would only return a HT_OK if both of its 
> streams return HT_OK or a HT_ERROR if both of its streams return 
> HT_ERROR. HT_WOULDBLOCK is returned otherwise.
> 
>     In my case, the MIME stream returns a HT_LOADED while the CACHE 
> stream returns a HT_OK. This causes the Tee stream to return a 
> HT_WOULDBLOCK. I'm not clear why the MIME should return a HT_LOADED 
> at times instead of a generic HT_OK. Any pointers on how I can remedy 
> the situation?

I had a similar problem just a couple of days ago. Indeed, the code in HTTee.c is
what causes the trouble. The rule it uses to compute a return value from the results of
its two streams seems very strange to me. The same rule is used in almost all HTTee_
functions; here is HTTee_put_character as an example:

	PRIVATE int HTTee_put_character (HTStream * me, char c)
	{
	    int ret1 = (*me->s1->isa->put_character)(me->s1, c);
	    int ret2 = (*me->s2->isa->put_character)(me->s2, c);
	    return (!(ret1+ret2) ? HT_OK :
	            (ret1==HT_ERROR || ret2==HT_ERROR) ? HT_ERROR :
	            HT_WOULD_BLOCK);
	}

If we use the cache, the HTTee stream passes the data to both the MIME and
Cache streams. The MIME stream returns HT_LOADED once all the data has been
loaded, and HTUtils.h defines HT_LOADED as a success code. The Cache stream
returns HT_OK, which obviously also signifies success. Under the rule given
above, the sum of the two codes is nonzero and neither is HT_ERROR, so these two
success codes merge into HT_WOULD_BLOCK, which is defined as a failure code!
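
To make this concrete, here is a small self-contained sketch of the rule, pulled out
of the stream code so it can be tried on its own. The numeric values of the codes are
placeholders I chose for illustration, not copied from HTUtils.h; the sketch only
assumes that HT_OK is 0 and that the codes are distinct. The name tee_rule is
hypothetical as well.

```c
/* Placeholder values for illustration only. The real definitions live in
   HTUtils.h and may differ; this sketch assumes only that HT_OK is 0 and
   that the four codes are distinct. */
#define HT_OK            0
#define HT_LOADED        200
#define HT_ERROR        (-1)
#define HT_WOULD_BLOCK  (-2)

/* Hypothetical wrapper around the same expression that appears in the
   return statement of HTTee_put_character. */
int tee_rule(int ret1, int ret2)
{
    return (!(ret1 + ret2) ? HT_OK :
            (ret1 == HT_ERROR || ret2 == HT_ERROR) ? HT_ERROR :
            HT_WOULD_BLOCK);
}
```

With these placeholder values, tee_rule(HT_OK, HT_OK) gives HT_OK and
tee_rule(HT_ERROR, HT_OK) gives HT_ERROR, but tee_rule(HT_LOADED, HT_OK) gives
HT_WOULD_BLOCK, which is exactly the MIME/Cache case.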

HT_WOULD_BLOCK is not a "real" failure, but it is not a success either. On
receiving this code, the functions that feed data into the stream stack enter a
waiting state. They wait for some event to take them out of that state, but none ever
comes, since all the data has already been transmitted.

The problem doesn't occur if we don't use the cache, because in that case HTTee is
not used.

The quick (but dirty) remedy is to change the rule in HTTee_ functions. I've introduced a 
helper function:

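	int merge_results (int ret1, int ret2)
	{
	    /* If one stream reports plain HT_OK, the other's result decides. */
	    if (ret1 == HT_OK)
	        return ret2;
	    else if (ret2 == HT_OK)
	        return ret1;
	    /* Neither is HT_OK: an error in either stream wins. */
	    else if (ret1 == HT_ERROR || ret2 == HT_ERROR)
	        return HT_ERROR;
	    /* Otherwise prefer the first stream's result. */
	    else
	        return ret1;
	}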

I call this helper function instead of the awkward expression that computes the return
value in the HTTee_ functions:

	PRIVATE int HTTee_put_character (HTStream * me, char c)
	{
	    int ret1 = (*me->s1->isa->put_character)(me->s1, c);
	    int ret2 = (*me->s2->isa->put_character)(me->s2, c);
	    return merge_results(ret1,ret2);
	}

The merge_results function treats the streams asymmetrically. It assumes that
ret1 is more important than ret2 (see the last line: return ret1). This is indeed the case
for the MIME/Cache pair, where writing to the cache is obviously less important than
presenting the document to the user.
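
The asymmetry is easy to see by swapping the arguments. The sketch below repeats
the helper so it compiles on its own. As before, the numeric values of the codes are
placeholders chosen for illustration and are not taken from HTUtils.h.

```c
/* Placeholder values for illustration only; the real definitions in
   HTUtils.h may differ. Only distinctness of the codes is relied on. */
#define HT_OK            0
#define HT_LOADED        200
#define HT_ERROR        (-1)
#define HT_WOULD_BLOCK  (-2)

/* The helper from this message, repeated so the sketch is self-contained. */
int merge_results(int ret1, int ret2)
{
    if (ret1 == HT_OK)            /* first stream neutral: second decides */
        return ret2;
    else if (ret2 == HT_OK)       /* second stream neutral: first decides */
        return ret1;
    else if (ret1 == HT_ERROR || ret2 == HT_ERROR)
        return HT_ERROR;          /* an error in either stream wins */
    else
        return ret1;              /* otherwise the first stream wins */
}
```

Here merge_results(HT_LOADED, HT_OK) returns HT_LOADED, so the MIME/Cache case
now ends in a success code. When neither result is HT_OK or HT_ERROR, the order
matters: merge_results(HT_LOADED, HT_WOULD_BLOCK) returns HT_LOADED, while
merge_results(HT_WOULD_BLOCK, HT_LOADED) returns HT_WOULD_BLOCK.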

In this particular situation my rule works perfectly; however, it's tricky and perhaps
not suitable in some cases. I think the problem deserves deeper study (and discussion).
Frankly speaking, I can't imagine any reasonable general-purpose rule for merging
results from two streams. Perhaps the whole mechanism of passing result codes
should be changed? To what? Or would simply rethinking and changing the set of
result codes solve the problem?

Does anybody have any idea?

Thanks

Maciej Puzio
puzio@laser.mimuw.edu.pl