From: Mantis B. T. <no...@bu...> - 2010-03-10 10:01:46
A NOTE has been added to this issue.
======================================================================
http://bugs.bacula.org/view.php?id=1528
======================================================================
Reported By:                mnalis
Assigned To:
======================================================================
Project:                    bacula
Issue ID:                   1528
Category:                   Director
Reproducibility:            sometimes
Severity:                   major
Priority:                   normal
Status:                     new
======================================================================
Date Submitted:             2010-03-09 12:20 UTC
Last Modified:              2010-03-10 10:01 UTC
======================================================================
Summary:                    director sometimes hangs on "status dir"
Description:
bacula was running its nightly batch of jobs, and it looks like it locked up
somewhere in the middle of the task. The first sign of a problem was that a
slow job that was supposed to be canceled by "Max Run Sched Time" in the
morning was still running.

I started bconsole (on the same machine that runs the director/sd) and issued
"s dir". It got as far as:

# bconsole
Connecting to Director birdun.carnet.hr:9101
1000 OK: birdun-dir Version: 5.0.1 (24 February 2010)
Enter a period to cancel a command.
*s dir
birdun-dir Version: 5.0.1 (24 February 2010) x86_64-pc-linux-gnu debian 5.0.4
Daemon started 08-Mar-10 17:08, 136 Jobs run since started.
 Heap: heap=1,536,000 smbytes=1,709,430 max_bytes=671,704,832 bufs=3,696 max_bufs=5,850

Scheduled Jobs:
Level          Type     Pri  Scheduled          Name               Volume
===================================================================================

and then it hung there (I waited for an hour and it didn't change). I checked
mysql, and it wasn't running any queries at the time.

Next I tried to open a new bconsole, but it timed out connecting to the
director (after 5 minutes) with:

birdun# date ; bconsole ; date
Tue Mar 9 10:19:34 CET 2010
Connecting to Director birdun.carnet.hr:9101
Director authorization problem.
Most likely the passwords do not agree.
If you are using TLS, there may have been a certificate validation error during the TLS handshake.
Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_Asked_Questi.html#SECTION003760000000000000000 for help.
Tue Mar 9 10:24:39 CET 2010

I tried to start bconsole several more times over another half an hour, always
with the same error, at which point I gave up and killed the director with
ABRT. The bactrace and traceback files are attached.
======================================================================

----------------------------------------------------------------------
 (0005176) mnalis (reporter) - 2010-03-10 10:01
 http://bugs.bacula.org/view.php?id=1528#c5176
----------------------------------------------------------------------
Please use logs_20100309_fix.tgz; the original tgz was wrong.

Issue History
Date Modified    Username       Field                    Change
======================================================================
2010-03-09 12:20 mnalis         New Issue
2010-03-09 12:20 mnalis         File Added: logs_20100309.tgz
2010-03-09 12:35 mnalis         Issue Monitored: mnalis
2010-03-10 10:01 mnalis         File Added: logs_20100309_fix.tgz
2010-03-10 10:01 mnalis         Note Added: 0005176
======================================================================