From: Keven T. <byt...@sh...> - 2009-01-15 02:35:52
Found the problem.

On some BSDs, fgets can be interrupted during a read. This is different from sending the program a SIGINT (which is handled elsewhere and has nothing to do with this issue). When fgets is interrupted, it returns NULL and sets errno to EINTR (interrupted). The main while loop of sshguard assumes that fgets doesn't return NULL so long as it's working properly and reading from stdin, so the loop exits and sshguard shuts down. If sshguard is launched via syslogd, it gets relaunched until fgets bombs out again, and sshguard shuts down once more. This seems to repeat indefinitely, and there *are* reports of this happening that turn up on Google (if you check the syslog time stamps, you can see sshguard exiting at random times).

I've included a patch that resolves this issue. It should be cross-platform and basically acts as a drop-in replacement fgets wrapper (aptly called "safe_fgets"). It handles EOF and error conditions properly and ignores any spurious EINTR, since those are non-critical and do NOT impact the operation of the program (not anymore, at least). Since program signaling is handled elsewhere, sshguard will still properly exit and relaunch when logs are rotated via syslog/syslog-ng/whatever. This simply fixes the issue of sshguard exiting prematurely due to some weird behavior with fgets on other platforms (primarily BSD, it appears).

I've confirmed sshguard has been running for more than 4 hours on my OpenBSD boxes. It is blocking and releasing IPs properly without exiting in between. It has exited and relaunched /once/, exactly on the hour when authlog was rotated, but other than that it's 100% stable now.
-KT

<------SNIP SNIP------>
--- ./sshguard.c  Wed Jan 14 18:16:09 2009
+++ /root/sshguard.c  Wed Jan 14 18:15:35 2009
@@ -89,6 +89,8 @@ static inline void attackerinit(attacker_t *restrict i
 static void usage(void);
 /* comparison operator for sorting offenders list */
 static int lastAttackComparator(const void *a, const void *b);
+/* safe fgets function */
+char *safe_fgets(char *s, int size, FILE *stream);
 /* handler for termination-related signals */
 void sigfin_handler(int signo);
 /* handler for suspension/resume signals */
@@ -213,7 +215,7 @@ int main(int argc, char *argv[]) {
             opts.abuse_threshold,
             (unsigned int)opts.pardon_threshold,
             (unsigned int)opts.stale_threshold);
-    while (fgets(buf, MAX_LOGLINE_LEN, stdin) != NULL) {
+    while (safe_fgets(buf, MAX_LOGLINE_LEN, stdin) != NULL) {
         if (suspended) continue;
         retv = parse_line(buf);
@@ -233,6 +235,23 @@ int main(int argc, char *argv[]) {
     exit(0);
 }
+char *safe_fgets(char *s, int size, FILE *stream) {
+    for (;;) {
+        if (fgets(s, size, stream))
+            return s;
+        if (feof(stream))
+            return NULL;
+        if(!ferror(stream)) {
+            sshguard_log(LOG_ERR, "Error reading from stdin: unknown fgets error!");
+            exit(0);
+        }
+        if(errno != EINTR) {
+            sshguard_log(LOG_ERR, "Error reading from stdin: fgets returned %s", strerror(errno));
+            exit(0);
+        }
+        clearerr(stream);
+    }
+}
 void report_address(attack_t attack) {
     attacker_t *tmpent = NULL;
<------SNIP SNIP------>

On Jan 14, 2009, at 2:13 PM, Keven Tipping wrote:

> Nope, that's not it.
>
> Cron is set up to run newsyslog every hour, but SSHguard terminates
> randomly "whenever".
>
> As I've said, for whatever reason fgets is returning NULL and breaking
> out of the main loop. The signal handler for sigint/sigfin isn't being
> called here as it would be if syslogd or some other program was
> sending sshguard a sigint. I'm not sure if this is a bug with fgets,
> OpenBSD, or syslogd, but I'll try and do some debugging today and
> figure it out.
>
> -KT
>
> On Jan 14, 2009, at 12:01 PM, Mij wrote:
>
>> Hello Keven,
>>
>> Do you have log rotation? Log rotation causes processes to be
>> signaled for opening the new log files. This is often what causes
>> those messages in log files.
>>
>> On Jan 14, 2009, at 11:09 AM, Keven Tipping wrote:
>>
>>> Okay, I just did some quick debugging here.
>>>
>>> It appears that SSHguard is exiting the main loop in sshguard.c. For
>>> whatever reason, on OpenBSD 4.4, the line:
>>> while (fgets(buf, MAX_LOGLINE_LEN, stdin) != NULL)
>>>
>>> is indeed returning NULL. This causes the loop to break and exit(0)
>>> is called, resulting in the "Got exit signal" message. According to
>>> Google (and I'm not a programmer), this is caused either by fgets
>>> encountering EOF or some other error.
>>>
>>> Any ideas as to why this is occurring?
>>>
>>> -KT
>>>
>>> On Jan 14, 2009, at 2:47 AM, Keven Tipping wrote:
>>>
>>>> Greetings to all.
>>>>
>>>> I've been trying to get SSHguard running *reliably* on several
>>>> OpenBSD 4.4 boxes. They all exhibit the same problem.
>>>>
>>>> I've installed sshguard (both 1.4-rc2 and svn) and have it
>>>> currently running as root (though I doubt this has anything to do
>>>> with the problem) via syslog. The relevant syslog.conf line is:
>>>> auth.info;auth.priv |exec /usr/sbin/sshguard
>>>>
>>>> SSHguard launches as expected when there's authlog traffic, and
>>>> works just fine. I can hammer the box from the LAN and SSHguard
>>>> adds the IP addresses to the pf table. That's all fine and great.
>>>>
>>>> The problem is that SSHguard constantly "exits". I'm not sure if
>>>> this is an SSHguard problem or something OpenBSD related, because I
>>>> can't find anything in syslog's man page about this and there's
>>>> nothing in my crontabs that would otherwise interfere with
>>>> SSHguard.
>>>>
>>>> What happens is that every ~5-20 minutes (it seems completely
>>>> random?), SSHguard prints the following in authlog:
>>>> "Jan 14 02:33:23 gw sshguard[28260]: Releasing 10.0.1.140 after 488 seconds."
>>>> "Jan 14 02:33:23 gw sshguard[28260]: Got exit signal, flushing blocked addresses and exiting..."
>>>>
>>>> 10.0.1.140 is one of /several/ systems I used to test SSHguard.
>>>> There were about 10 IPs in the blocklist in this case; the latest
>>>> one was blocked/added at 02:33:07, only ~16 seconds before SSHguard
>>>> once again exited for no apparent reason. Obviously, when SSHguard
>>>> exited, the entire table was flushed. There's no way the last IP
>>>> that was blocked had exceeded 420 seconds prior to SSHguard
>>>> "getting an exit signal".
>>>>
>>>> I'm not sure why it does this. Once SSHguard cleanly exits (due to
>>>> the above "signal"), syslogd restarts it as soon as there's authlog
>>>> traffic again and SSHguard runs anywhere from 5-20 minutes before
>>>> exiting. Rinse, repeat. It will do this all day, basically.
>>>>
>>>> I have no idea if this is by design, or what is going on here. Any
>>>> ideas?
>>>>
>>>> Cheers,
>>>> -KT
>>>>
>>>> ------------------------------------------------------------------------------
>>>> This SF.net email is sponsored by:
>>>> SourceForge Community
>>>> SourceForge wants to tell your story.
>>>> http://p.sf.net/sfu/sf-spreadtheword
>>>> _______________________________________________
>>>> Sshguard-users mailing list
>>>> Ssh...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/sshguard-users