From: Zoran V. <zv...@ar...> - 2006-06-14 10:12:30
On 07.06.2006 at 16:29, Stephen Deasey wrote:

> I don't have time to try and reproduce the problems from scratch, but
> I'd be happy to fix it if there's a couple of concise test cases.

But I was forced to do so... (customer support)...

Well, I have fixed those spurious "timeout waiting for update:.." problems I was experiencing. The problem was rather trivial and could be classed as an omission in the Ns_CacheSetValueExpires() code (see the diffs between 1.8 and 1.9 for the changes).

Another problem was that ns_cache_exists returned true for values that were internally expired. I corrected that as well (see also below).

Stephen, there is an *architectural* problem between the Ns_CacheFindEntry() call, which internally calls ExpireEntry(), and Ns_CacheWaitCreateEntry(), which checks the value for non-NULL and waits forever (or for some time, eventually aborting with a timeout).

The API sequence for that is simple:

  - call Ns_CacheFindEntry()
    This checks the entry and eventually calls ExpireEntry(), which unsets the entry value but does NOT delete the entry itself.

  - call Ns_CacheWaitCreateEntry()
    This finds the entry (as it is not deleted) but waits forever (or times out), because the entry value is empty and nobody is going to set it any more.

This is what really happened in my case even before the 1.8 version of the file, but it was just harder to trigger. In 1.8 it is trivial to trigger, and our app breaks very early there.

Now the question is: how can we fix that?

One solution would be to really delete the expired entry in Ns_CacheFindEntry() instead of just calling ExpireEntry(). This would preserve the existing logic in Ns_CacheWaitCreateEntry().

Another solution would be to add a new bit to the Entry structure marking the entry as expired, and then add new logic to Ns_CacheWaitCreateEntry() that checks that bit and acts accordingly.

Please tell me what you think. If possible, "as soon as possible", as I have some very angry customers chasing me.

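As an illustration only, the sequence and both proposed fixes can be modelled with a small single-threaded toy in C. None of the names below (ToyEntry, ToyFind, ToyWaitCreate, ExpirePolicy, ToyScenario) exist in the server code. They are assumptions made for the sketch, and the real functions take locks and wait on a condition variable, which the toy replaces with a WC_WOULD_WAIT return value.

```c
#include <stddef.h>

/* Hypothetical toy model, NOT the real cache structures. */
typedef enum { UNSET_ONLY, DELETE_ENTRY, MARK_EXPIRED } ExpirePolicy;
typedef enum { WC_NEW, WC_READY, WC_WOULD_WAIT } WaitCreateResult;

typedef struct {
    int         present;  /* entry exists in the (imaginary) hash table */
    const char *value;    /* NULL means "no value set" */
    long        expires;  /* absolute expiry time, 0 = never */
    int         expired;  /* proposed flag, used only by MARK_EXPIRED */
} ToyEntry;

/* Models a find that expires stale entries as a side effect. */
const char *
ToyFind(ToyEntry *e, long now, ExpirePolicy policy)
{
    if (!e->present) {
        return NULL;
    }
    if (e->expires != 0 && e->expires <= now) {
        e->value = NULL;            /* the value is always unset */
        e->expires = 0;
        if (policy == DELETE_ENTRY) {
            e->present = 0;         /* proposal 1: really delete the entry */
        } else if (policy == MARK_EXPIRED) {
            e->expired = 1;         /* proposal 2: flag the entry */
        }
        return NULL;
    }
    return e->value;
}

/* Models a wait-create; WC_WOULD_WAIT stands in for blocking on an update. */
WaitCreateResult
ToyWaitCreate(ToyEntry *e)
{
    if (!e->present) {
        e->present = 1;
        e->value = NULL;
        e->expires = 0;
        e->expired = 0;
        return WC_NEW;              /* caller is expected to fill the value */
    }
    if (e->expired) {
        e->expired = 0;
        return WC_NEW;              /* treat a flagged entry like a new one */
    }
    if (e->value == NULL) {
        return WC_WOULD_WAIT;       /* nobody will ever set it: the hang */
    }
    return WC_READY;
}

/* Runs the find followed by the wait-create on an entry that has expired. */
WaitCreateResult
ToyScenario(ExpirePolicy policy)
{
    ToyEntry e = { 1, "cached", 100, 0 };

    (void) ToyFind(&e, 200, policy);
    return ToyWaitCreate(&e);
}
```

In this toy, ToyScenario(UNSET_ONLY) returns WC_WOULD_WAIT, which corresponds to the hang described above, while ToyScenario(DELETE_ENTRY) and ToyScenario(MARK_EXPIRED) both return WC_NEW, so the caller would refill the entry instead of waiting.
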
I'm OK with either of the proposed changes, or with any other, as long as it fixes the problem.

What I COULD NOT verify were any memory-related problems like those Vlad is experiencing. I can take the code through Purify once more, but this will take me another two days of work.

Cheers,
Zoran