From: James P. <jam...@ne...> - 2005-12-29 18:16:24
I have been updating to each new 1.38.3 beta as it comes out, but I am still having an issue with "waiting to reserve a device". The first 1.38.3 update allowed it to perform backups, but it now waits exactly 30 minutes before starting. When the job is scheduled to start, it immediately issues a "waiting to reserve a device" email... 30 minutes later it starts backing up.

It seems that if I restart Bacula, the first scheduled job runs normally (i.e. right when scheduled), but then all subsequent jobs have a 30 minute delay.

Any ideas? I will make a log with full debugging and see if that yields any clues.

james

Kern Sibbald wrote:
> On Friday 16 December 2005 10:02, Volker Dierks wrote:
>
>> Hello Kern,
>>
>> do you think that this problem also affects me? My plan was to test
>> the beta (released 10. Dec) and two drives today with a new tape set.
>>
> Quite possibly -- try the 14 Dec 05 version instead ...
>
>> To give you a little reminder:
>> A HP 2/20 Library with 10 tapes on the left side in pool DRIVE-1 and
>> 10 tapes on the right side in pool DRIVE-2. I'm always loading the
>> first tape from any pool and mount it. So there's no mtx stuff at the
>> beginning of the backup, but definitely later when tapes got full.
>>
>> Are you going to release the fixed version in the next few hours?
>>
>
> I am going to release the second BETA 1.38.3 now. It has a number of fixes.
> I've had so many things going on that I don't remember the context of your
> problems, but I would *strongly* suggest that anyone having reservation or
> job hanging problems with 1.38.2 or the first 1.38.3 BETA should try the
> second version (14 December 2005).
>
>
>> Thanks,
>> Volker
>>
>> Kern Sibbald wrote:
>>
>>> Hello Rick,
>>>
>>> Thanks for the debug output. I think I have now found the problem in the
>>> algorithm, at least the problem that is hitting you. This time, I'm 100%
>>> sure that I have found at least one major problem.
>>>
>>> I'm going to run code through all my tests here on two machines, then on
>>> Solaris and FreeBSD. Once I've done that I'll make the new code
>>> available -- probably this evening.
>>>
>>> On Thursday 15 December 2005 06:31, Rick Knight wrote:
>>>
>>>> Kern Sibbald wrote:
>>>>
>>>>> On Wednesday 14 December 2005 04:22, Rick Knight wrote:
>>>>>
>>>>>> Kern Sibbald wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> If you are able to reproduce this easily, could you turn on level 100
>>>>>>> by putting -d100 on the command line when you start it, then capture
>>>>>>> the output. This may help me understand what is going on.
>>>>>>>
>>>>>>> I've tried everything I can to duplicate this, but all my tests run
>>>>>>> fine.
>>>>>>>
>>>>>>> Hmmm. Normally, it wouldn't be the OS that is causing problems, but
>>>>>>> I'm open to almost any suggestion -- the goal being to fix it ...
>>>>>>>
>>>>>>> On Tuesday 13 December 2005 22:07, James Peverill wrote:
>>>>>>>
>>>>> I suspect that there are two problems here. 1. You probably don't have
>>>>> Maximum Concurrent Jobs set in your director's storage resource, and 2.
>>>>> it looks like there may be a problem with the way the SD in 1.38 is
>>>>> trying to open drives, which causes it to wait. I'm working on a
>>>>> solution to that now.
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Rick Knight
>>>>>>
>>>> Kern,
>>>>
>>>> Adding the Max Concurrent Jobs = 4 didn't make any difference. I added
>>>> OPTIONS='-d100' to the bacula startup script, started bacula and ran
>>>> thru all the jobs, capturing all of the output. Log file attached. The
>>>> only thing that looks obvious to me are the python errors or messages. I
>>>> may rebuild bacula without python support this weekend and see if it
>>>> helps. I did not have this problem with 1.38.1.
>>>>
>>>> Thanks again,
>>>> Rick Knight
>>>>
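
For readers following the thread: the "Maximum Concurrent Jobs" setting Kern mentions goes in the Storage resource of the Director's configuration file. The fragment below is only a rough sketch of where the directive sits. Every name, address, password and media type in it is a hypothetical placeholder, not taken from anyone's actual setup, and the value 4 is simply the one Rick reports trying.

```
# bacula-dir.conf -- Storage resource (illustrative sketch only)
Storage {
  # Hypothetical resource name
  Name = Autochanger
  # Hypothetical Storage daemon host and the default SD port
  Address = sd.example.org
  SDPort = 9103
  # Placeholder; must match the password configured in the Storage daemon
  Password = "changeme"
  # Placeholders; must match the Device definition in bacula-sd.conf
  Device = Drive-1
  Media Type = LTO
  # The directive discussed in the thread; 4 is the value Rick tried
  Maximum Concurrent Jobs = 4
}
```

As Rick notes above, setting this alone did not remove the wait in his case, so it should be read as one of the two suspected causes rather than a confirmed fix.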