Re: [Noffle-devel] [ noffle-Bugs-624357 ] SEGV during concurrent
From: Jim H. <jim...@ac...> - 2002-10-24 13:28:38
On 21-Oct-2002 Mirko Liss wrote:
> no...@so... wrote:
>> Bugs item #624357, was opened at 2002-10-16 16:14
>>
>> Initial Comment:
>> My noffle 1.1.2 crashed during fetch. This happened
>> crashed. I could not find any core dump lying around.
>
> It should be at /var/spool/noffle/core. Maybe the creation
> of core files has been switched off by a ulimit command.
>
>> This is not the only problem I had with concurrent
>> access. I am using an automated script to archive
>> news.answers using NNTP STAT followed by a NEXT loop
>> until it gets a 223. Several times I read news while
>> the script was running and it bailed out because it
>> noticed the article ID had been reset to the first
>> article in the group instead of being increased by NEXT
>> to the next available article.
>
> I added some additional Log_inf() calls to src/server.c:doNext(),
> Server_flushCache(), changeToGrp() and src/lock.c:lockSignal().
> Then I used two server processes to loop through the articles
> of different groups in the aforementioned way.
> Here's what I got (sorry for the long lines):
> [...]
> I suppose the call of LoadGrpIfSelected() in src/server.c:doNext()
> resets server.artPtr via changeToGrp(), because src/lock.c:lockSignal()
> sets server.groupReady=FALSE via Server_flushCache().
>
> I'm not sure how to get rid of that bug. Noffle's locking mechanism
> still looks too weird for me to understand fully.

Thanks for looking at that, Mirko. Yeah, the locking stuff is a bit of
a bugger.

The SIGUSR1 is another Noffle process indicating that it wants access
to the data files but is currently locked out. The next time the
receiving process goes through a 'release lock', it actually releases
the lock. Usually it will hang on to the lock instead, so that a
subsequent 'get lock' is a no-op. Releasing and getting the lock is an
expensive thing to do.
While the lock is released, the other Noffle process can be doing all
kinds of things to the databases, so all in-memory structures must be
discarded. So here there's a lock release/lock regain happening where
the current group cursor is not being retained. That's a bug, no
question.

Now, thinking about it, it must be a problem whenever the lock is
released in the server code. Hmm. There must be something that
remembers and restores the current group, but not the current cursor.

Mirko, can you do me a favour? I'm flat out with work at the moment
(off to the States at the weekend to present two weeks of training
courses) and haven't got time to test this. But I think it might work.
The changeToGrp() routine was being used both to change group and to
reload the in-memory data, and it was resetting the article pointer in
both cases. This patch attempts to separate the 'reload the current
group' functionality from the 'change group' functionality, preserving
the cursor on a reload. Thanks.

(BTW, is it just me or are we back to a 3-day delay receiving mails
from the SourceForge mailing list? I just got yours now.)

--- server.c	5 Aug 2002 22:05:02 -0000
+++ server.c	24 Oct 2002 13:23:06 -0000
@@ -314,15 +314,34 @@
 }
 
 static void
-changeToGrp( const char *grp )
+loadGrpInfo( const char *grp )
 {
     checkNewArts( grp );
     Utl_cpyStr( server.grp, grp );
     readCont( grp );
-    server.artPtr = Cont_first();
+
+    /*
+     * This routine is used to change back to a group after releasing
+     * the lock. We need to preserve the group cursor if at all possible.
+     * So, if the article pointer points to an article before or after
+     * the current range, adjust it to the first/last article. Otherwise
+     * leave it well alone.
+     */
+    if ( server.artPtr < Cont_first() )
+        server.artPtr = Cont_first();
+    else if ( server.artPtr > Cont_last() )
+        server.artPtr = Cont_last();
+
     server.groupReady = TRUE;
 }
 
+static void
+changeToGrp( const char *grp )
+{
+    loadGrpInfo( grp );
+    server.artPtr = Cont_first();
+}
+
 static Bool
 doGrp( char *arg, const Cmd *cmd )
 {
@@ -370,7 +389,7 @@
     if ( ! server.groupReady )
     {
         Utl_cpyStr( group, server.grp );
-        changeToGrp( group );
+        loadGrpInfo( group );
     }
     return TRUE;
 }

-- 
Jim Hague - jim...@in... (Work), ji...@be... (Play)
Never trust a computer you can't lift.