From: Mantis B. T. <no...@bu...> - 2012-09-08 12:26:06
The following issue has been CLOSED
======================================================================
http://bugs.bacula.org/view.php?id=1872
======================================================================
Reported By:                RoyK
Assigned To:                kern
======================================================================
Project:                    bacula
Issue ID:                   1872
Category:                   Director
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     closed
Resolution:                 fixed
Fixed in Version:
======================================================================
Date Submitted:             2012-05-23 12:14 BST
Last Modified:              2012-09-08 13:25 BST
======================================================================
Summary:                    Connections to Director from bat/bconsole fails after a while

Description:
Starting Bacula 5.2.6, server running PostgreSQL and FD, but not SD in service locally, browsing with Bat works for a while, clicking on a few jobs in Jobs Run, and restarting them (they haven't run for a while during the upgrade), this works the first and perhaps second time, but around third or fourth time, Bat hangs after I click Run again. After this, bconsole can't connect to Director either. The jobs started seem to continue running, or so the debug log tells me.

Steps to Reproduce:
Start Bat
Click a job and choose run again
Click run
Repeat a few times
When it hangs, try to connect from bconsole to verify the hang

Additional Information:
See attached debug file. I believe the error occurs around line 967
======================================================================

----------------------------------------------------------------------
(0006324) RoyK (reporter) - 2012-05-23 19:06
http://bugs.bacula.org/view.php?id=1872#c6324
----------------------------------------------------------------------
This seems to happen at the third "run again" when I try a few more times. From the client side, this looks like this

[...]
bat: bsock.c:216-0 Current host[ipv4:127.0.0.1:9101] All host[ipv4:127.0.0.1:9101]
bat: bsock.c:152-0 who=Director daemon host=localhost port=9101
bat: cram-md5.c:131-0 cram-get received: auth cram-md5 <128...@ba...-dir> ssl=0
bat: cram-md5.c:150-0 sending resp to challenge: ngtu9/+UPVFD89/DIA/jGB
bat: cram-md5.c:79-0 send: auth cram-md5 <874579632.1337795902@bat> ssl=0
bat: cram-md5.c:98-0 Authenticate OK +XBPnH/9/+sUM6/O2VhywB
bat: dircomm_auth.cpp:147-0 >dird: 1000 OK auth
bat: dircomm_auth.cpp:157-0 <dird: 1000 OK: bacula.nilu.no-dir Version: 5.2.6 (21 February 2012)
bat: bsock.c:216-0 Current host[ipv4:127.0.0.1:9101] All host[ipv4:127.0.0.1:9101]
bat: bsock.c:152-0 who=Director daemon host=localhost port=9101

at which point it hangs until bat is killed. All other jobs, like the backups, run as usual, but neither bconsole nor bat can communicate with the director until the initial bat is killed.

----------------------------------------------------------------------
(0006329) kern (administrator) - 2012-05-24 08:02
http://bugs.bacula.org/view.php?id=1872#c6329
----------------------------------------------------------------------
Unfortunately the debug trace does not show anything abnormal, on the contrary everything looks OK. The most likely cause is that you have not configured your Director to permit sufficient console connections.

If you really think Bacula is hung, then provide some evidence that it is not running, and you will need to get a traceback. If you do not know how, please read the Kaboom chapter of the manual.

----------------------------------------------------------------------
(0006338) RoyK (reporter) - 2012-05-24 12:07
http://bugs.bacula.org/view.php?id=1872#c6338
----------------------------------------------------------------------
hm… ok… this worked well with 5.0.3, but has changed after the upgrade.
To clarify, Bacula Director is not hung, all other jobs/threads seem to run, but when this happens, Bat hangs and new bconsole sessions cannot connect until the initial Bat is killed.

A few questions:
1. How would a single Bat session start multiple console connections?
2. I looked in http://bacula.org/5.2.x-manuals/en/main/main/Console_Configuration.html, and couldn't find anything on maximum connections. Where can I set this?

----------------------------------------------------------------------
(0006344) kern (administrator) - 2012-05-24 17:42
http://bugs.bacula.org/view.php?id=1872#c6344
----------------------------------------------------------------------
A few questions:
1. How would a single Bat session start multiple console connections?

It is smarter and makes different connections for different tabs.

2. I looked in http://bacula.org/5.2.x-manuals/en/main/main/Console_Configuration.html, and couldn't find anything on maximum connections. Where can I set this?

It is trying to connect to the Director, so you need to increase the number of allowed Consoles in the Director. I don't remember the exact name, you can search the manual or ask on the users list (something like: Maximum Console Connections).

What you have reported indicates pretty clearly that it is a configuration issue, so I am closing this bug.

----------------------------------------------------------------------
(0006346) RoyK (reporter) - 2012-05-24 17:49
http://bugs.bacula.org/view.php?id=1872#c6346
----------------------------------------------------------------------
Neither the sample config nor the documentation at http://bacula.org/5.2.x-manuals/en/main/main/Console_Configuration.html include such a setting, and the configuration was taken from 5.0.x, where this worked well.

Before this bug is closed, it should at least be documented how to fix it!

----------------------------------------------------------------------
(0006360) kjetilho (reporter) - 2012-06-04 13:57
http://bugs.bacula.org/view.php?id=1872#c6360
----------------------------------------------------------------------
RoyK, this is in the director configuration, in the director resource. This URL will take you there directly: http://bacula.org/5.2.x-manuals/en/main/main/Configuring_Director.html#7261

----------------------------------------------------------------------
(0006373) kern (administrator) - 2012-06-09 10:52
http://bugs.bacula.org/view.php?id=1872#c6373
----------------------------------------------------------------------
Kjetil, Thanks for the exact reference.

----------------------------------------------------------------------
(0006393) RoyK (reporter) - 2012-06-12 09:31
http://bugs.bacula.org/view.php?id=1872#c6393
----------------------------------------------------------------------
Thank you for this. Just one more question: I got a note from Kjetil off this tracker that the default limit was 20 connections. Why would that stop the _third_ connection from a Bat client? Also, shouldn't Bat clean up its connections to the director once the job was started?

roy

----------------------------------------------------------------------
(0006480) kern (administrator) - 2012-09-08 13:25
http://bugs.bacula.org/view.php?id=1872#c6480
----------------------------------------------------------------------
I believe that this problem has been fixed already, so I am closing the bug report. Please read the release notes and you should be able to find out what version.
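
For reference, the setting discussed in the notes above belongs in the Director resource of bacula-dir.conf. The fragment below is only a sketch, assuming the directive is spelled "Maximum Console Connections" as suggested in note 0006344 (note 0006393 reports the default as 20). The resource name and the value 50 are illustrative placeholders, not taken from the reporter's configuration, and nothing in this report confirms that raising the limit resolves the hang.

```
# bacula-dir.conf (fragment, illustrative only)
Director {
  Name = example-dir                  # hypothetical name, not from this report
  # ... existing directives left unchanged ...
  Maximum Console Connections = 50    # assumed directive name; example value
}
```

The Director would need to be restarted or reloaded for a changed value to apply.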

Issue History
Date Modified    Username       Field                    Change
======================================================================
2012-05-23 12:14 RoyK           New Issue
2012-05-23 12:14 RoyK           File Added: bacula-dir.debug.gz
2012-05-23 19:06 RoyK           Note Added: 0006324
2012-05-24 08:02 kern           Note Added: 0006329
2012-05-24 08:02 kern           Status                   new => feedback
2012-05-24 12:07 RoyK           Note Added: 0006338
2012-05-24 12:07 RoyK           Status                   feedback => new
2012-05-24 17:42 kern           Note Added: 0006344
2012-05-24 17:42 kern           Assigned To               => kern
2012-05-24 17:42 kern           Status                   new => closed
2012-05-24 17:42 kern           Resolution               open => no change required
2012-05-24 17:49 RoyK           Note Added: 0006346
2012-05-24 17:49 RoyK           Status                   closed => feedback
2012-05-24 17:49 RoyK           Resolution               no change required => reopened
2012-06-04 13:57 kjetilho       Note Added: 0006360
2012-06-09 10:52 kern           Note Added: 0006373
2012-06-09 10:52 kern           Status                   feedback => closed
2012-06-09 10:52 kern           Resolution               reopened => no change required
2012-06-12 09:31 RoyK           Note Added: 0006393
2012-06-12 09:31 RoyK           Status                   closed => feedback
2012-06-12 09:31 RoyK           Resolution               no change required => reopened
2012-06-17 18:12 kern           Status                   feedback => confirmed
2012-09-08 13:25 kern           Note Added: 0006480
2012-09-08 13:25 kern           Status                   confirmed => closed
2012-09-08 13:25 kern           Resolution               reopened => fixed
======================================================================