From: Andrey C. <a...@52...> - 2014-01-23 12:14:48
Thank you!

> Hi,
>
> if you want to work around the job terminates after 6 days issue,
> there are two places in the code that you need to change. In the
> 5.2.13 code base, the first one is in bnet.c line 784 and then line 79
> in bsock.c - both of these source files are in the lib directory.
>
> hope this helps,
>
> --tom
>
>> Hi. Thank you, I'll try.
>> But why my job terminates exactly after 6 days?
>>
>>> Hello,
>>>
>>> Your problem is a comm line drop not a watch dog problem.
>>>
>>> Put HeartBeatInterval = 300 in your Dir, SD, and FDs.
>>>
>>> Best regards,
>>> Kern
>>>
>>> On 01/15/2014 09:28 AM, Andrey Chebotarev wrote:
>>>> I asked because in the latest version(5.2.13) modifying sources
>>>> doesn't work anymore.
>>>> I've changed this part:
>>>> /*
>>>>  * ****FIXME**** reduce this to a few hours once
>>>>  * heartbeats are implemented
>>>>  */
>>>> bsock->timeout = 60 * 60 * 30 * 24;
>>>>
>>>> but job still terminates after 6 days :(
>>>>
>>>> In 5.2.11 I didn't have such problem.
>>>> What has been changed in 5.2.13 ? In which part of code I can fix it?
>>>>
>>>>> Hi.
>>>>> I'm using bacula to backup huge stuff, about 100TB. Usually it takes
>>>>> about 15-16 days.
>>>>> I've faced with a problem. As I understood, in bacula there is
>>>>> mechanism which cares about jobs(watchdog timer). And with this
>>>>> mechanism I have trouble. My job terminates after 6 days with error
>>>>> message:
>>>>>
>>>>> 2013-12-29 16:42:56 baculasrv-dir JobId 8013: Error: Watchdog sending kill after 518427 secs to thread stalled reading File daemon.
>>>>> 2013-12-29 16:42:56 baculasrv-dir JobId 8013: Fatal error: Network error with FD during Backup: ERR=Interrupted system call
>>>>> 2013-12-29 16:42:57 baculasrv-sd JobId 8013: Elapsed time=143:47:09, Transfer rate=58.09 M Bytes/second
>>>>> 2013-12-29 16:42:57 baculasrv-dir JobId 8013: Error: Director's comm line to SD dropped.
>>>>> 2013-12-29 16:42:57 baculasrv-dir JobId 8013: Fatal error: No Job status returned from FD.
>>>>> 2013-12-29 16:42:57 baculasrv-dir JobId 8013: Error: Bacula baculasrv-dir 5.2.13 (19Jan13):
>>>>>
>>>>> But my job is still active. Where is the problem? FD isn't sending
>>>>> "keep-alive" packets or 6 days is hardcoded interval of maximum
>>>>> running time?
>>>>>
>>>>> In sources I see this(src/lib/bnet.c):
>>>>>
>>>>> /*
>>>>>  * ****FIXME**** reduce this to a few hours once
>>>>>  * heartbeats are implemented
>>>>>  */
>>>>> bsock->timeout = 60 * 60 * 6 * 24; /* 6 days timeout */
>>>>>
>>>>> Is it mean that heartbeat isn't implemented yet?
>>>>>
>>>>> Now I'm changing that interval to 30 days.
>>>>> Is there any more beautiful way?
>>>>>
>>>>> _______________________________________________
>>>>> Bacula-devel mailing list
>>>>> Bac...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/bacula-devel
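
For reference, the arithmetic behind the figures quoted in this thread can be checked with a short C sketch. The helper names below (timeout_secs, overshoot_secs) are hypothetical and are not part of the Bacula sources; only the expressions 60 * 60 * 6 * 24 and 60 * 60 * 30 * 24 and the 518427-second figure come from the messages above.

```c
#include <assert.h>

/* Seconds in one day: 60 s * 60 min * 24 h. */
#define SECS_PER_DAY (60L * 60L * 24L)

/*
 * Hypothetical helper (not a Bacula function): convert a timeout
 * expressed in whole days into seconds, the unit of the value
 * assigned to bsock->timeout in the quoted snippet.
 */
static long timeout_secs(long days)
{
    assert(days >= 0);
    return days * SECS_PER_DAY;
}

/*
 * Hypothetical helper: how many seconds an observed wait exceeds
 * a timeout of the given number of days. A negative result means
 * the timeout had not yet been reached.
 */
static long overshoot_secs(long observed_secs, long days)
{
    return observed_secs - timeout_secs(days);
}
```

With these helpers, timeout_secs(6) is 518400, so the 518427 secs reported in the watchdog message is 27 seconds past the 6-day value, and timeout_secs(30) is 2592000 for the 30-day edit.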