From: <da...@ha...> - 2014-07-17 10:59:39
On 2014-07-17T05:51:53 CEST, Alex wrote:
> I did some tests this evening by basically disconnecting the server
> with the master mysql database, and it caused all mail on the two
> remaining systems that were still running to bounce with the "4.3.5
> Server configuration problem".

If you made the configuration change on all your hosts, I don't know what you are experiencing, and your mail contains no new information, technical or otherwise, to go on. That, paired with the fact that I'm fairly certain how this works and can see in my tests that it is indeed working as expected, simply makes me unable to come up with guesses as to what's troubling your system.

What I CAN do is show you how to test better, to pinpoint where the issue may lie.

The way I tested this manually was by simply "telnetting" to the postgrey service and talking to it. That may be a bit cumbersome, so fortunately Michael Ludvig has included a test script in the tarball, simply called "tester.pl".

On my system, a normal run looks like this:

----
$ ./tester.pl --client-ip 10.0.0.1
action=451 Greylisted for 5 minutes (16)
----

By adding "time" to the beginning of the command, we can see how much time it took to complete. So here's a run where mysql-server has downed its interface just for 10 seconds:

----
$ time ./tester.pl --client-ip 10.0.0.1
action=dunno

real 0m3.062s
user 0m0.056s
sys 0m0.004s
----

"action=dunno" means sqlgrey passes no judgment, which in turn means "let it through". This "conclusion" is reached within 3 seconds (you can see that at the line "real 0m3.062s").

And this is an example of sqlgrey not running:

----
$ time ./tester.pl --client-ip 10.0.0.1
Connect failed: IO::Socket::INET: connect: Connection refused
----

Finding out how long postfix will wait is as simple as:

----
$ postconf smtpd_policy_service_timeout
smtpd_policy_service_timeout = 100s
----

In this case 100s.

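For anyone who would rather script the manual "telnet" test than type it, here is a rough Python sketch of the same idea: send one Postfix policy-delegation request and read back the "action=" line, with a socket timeout so a hung backend shows up as an error instead of an endless wait. This is not how tester.pl works internally (I have not reproduced it); the function names, the placeholder sender/recipient/HELO values and the 100-second default are my own assumptions, and the host and port are whatever your policy daemon actually listens on.

```python
import socket


def build_request(client_ip, sender="sender@example.org",
                  recipient="rcpt@example.net"):
    """Build a policy-delegation request: name=value lines, then a blank line.

    The sender, recipient and HELO values are placeholders for testing only.
    """
    attrs = [
        ("request", "smtpd_access_policy"),
        ("protocol_state", "RCPT"),
        ("protocol_name", "SMTP"),
        ("client_address", client_ip),
        ("client_name", "unknown"),
        ("helo_name", "test.example.org"),
        ("sender", sender),
        ("recipient", recipient),
    ]
    return "".join(f"{name}={value}\n" for name, value in attrs) + "\n"


def parse_action(reply):
    """Return the text after 'action=', or None if the reply has no such line."""
    for line in reply.splitlines():
        if line.startswith("action="):
            return line[len("action="):]
    return None


def query_policy(host, port, client_ip, timeout=100.0):
    """Send one request and return the action string (or None for garbage).

    The timeout applies to the connect and to each send/recv separately,
    not to the whole exchange. A timeout or a refused connection raises
    an OSError subclass instead of hanging.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(client_ip).encode("ascii"))
        data = b""
        # A reply is terminated by an empty line, like the request.
        while b"\n\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:  # peer closed the connection early
                break
            data += chunk
    return parse_action(data.decode("ascii", errors="replace"))
```

Calling query_policy() with your daemon's host and port and a client IP such as "10.0.0.1" should return something like "dunno" or a 451 greylisting text when things work. A None return would point at garbage on the socket, and a timeout exception would point at the hanging-backend case.
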
When I point my sqlgrey to a server behind a packet-dropping firewall and rerun the test

----
$ time ./tester.pl --client-ip 10.0.0.1
----

I literally had to Ctrl-C manually after ~6 minutes. Which is way more than 100s, of course. So THAT would result in "Server configuration problem".

Another thing that could give "Server configuration problem" would be if any garbage output (i.e. an internal error from sqlgrey) were printed to the socket. But even that would be visible by testing like this.

As the predominant theory (and the only theory with a positive test so far) is the timeout theory, I think you'd have to try running this command while you're experiencing the problem. This should help to either prove or disprove that it's a timeout problem, and may even catch any garbage output if that were the case.

- Dan