From: Mantis B. T. <no...@bu...> - 2011-01-28 10:23:30
The following issue has been SUBMITTED.
======================================================================
http://bugs.bacula.org/view.php?id=1690
======================================================================
Reported By:                mnalis
Assigned To:
======================================================================
Project:                    bacula
Issue ID:                   1690
Category:                   File Daemon
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     new
======================================================================
Date Submitted:             2011-01-28 10:23 GMT
Last Modified:              2011-01-28 10:23 GMT
======================================================================
Summary:                    memory leak in bacula-fd (with accurate=yes)

Description:
It seems that after a few days of running incremental backups with
accurate=yes, bacula-fd will not return memory to the system after the
backup completes (as it should, and as it does on the first run).

Interestingly, the first incremental run seems to release the memory back
to the system correctly, but the second and subsequent ones do not.

Steps to Reproduce:
Have bacula do at least two incremental backups with accurate=yes. Wait for
the backups to finish. Check the memory usage of the bacula-fd process with
ps(1); it will show extreme memory usage.

Here is an example of the problem we've had:

1) 2011-01-26 about 15:00 - bacula-fd is started

2) 2011-01-26 about 23:52 - bacula-fd incremental job completes

3) 2011-01-27 at 17:30 - status is checked - memory usage is normal

- client# ps auxf | grep bacula-
  root 6547 3.0 0.0 81928 2172 ? Ssl Jan26 52:41 /usr/sbin/bacula-fd -c /etc/bacula/bacula-fd.conf

- bconsole* status client=rigel-fd
  Connecting to Client rigel-fd at 161.53.160.6:9102

  rigel-fd Version: 5.0.3 (30 August 2010) x86_64-pc-linux-gnu debian 5.0.6
  Daemon started 26-Jan-11 13:00. Jobs: run=1 running=0.
   Heap: heap=249,856 smbytes=20,653 max_bytes=1,617,771,856 bufs=59 max_bufs=13,235
   Sizeof: boffset_t=8 size_t=8 debug=0 trace=0

  Running Jobs:
  Director connected at: 27-Jan-11 17:35
  No Jobs running.
  ====

  Terminated Jobs:
   JobId  Level    Files      Bytes   Status   Finished        Name
  ======================================================================
   81216  Incr    290,924    15.97 G  OK       17-Jan-11 23:31 rigel
   81400  Incr    100,344    16.06 G  OK       18-Jan-11 23:36 rigel
   81582  Incr     90,352    15.59 G  OK       19-Jan-11 23:32 rigel
   81764  Incr    105,314    15.25 G  OK       20-Jan-11 23:10 rigel
   81915  Diff    763,162    22.64 G  OK       21-Jan-11 23:04 rigel
   82129  Incr     41,646    13.32 G  OK       22-Jan-11 22:32 rigel
   82311  Incr     71,603    11.70 G  OK       23-Jan-11 22:27 rigel
   82494  Incr    376,851    16.33 G  OK       24-Jan-11 23:39 rigel
   82679  Incr    128,509    15.96 G  OK       26-Jan-11 00:08 rigel
   82858  Incr    235,962    16.74 G  OK       26-Jan-11 23:52 rigel
  ====

- Our munin graphs show that about 1.5+GB of extra apps memory was indeed
  allocated during the night and released about 6 hours later, which matches
  the time the bacula backup was running (so memory was released to the
  system at about the moment the backup ended).

4) 2011-01-27 about 23:55 - another bacula-fd incremental job completes

5) 2011-01-28 at 10:29 - status is checked - memory usage is extreme (1.5+GB)

- client# ps auxf | grep bacula-
  root 6547 3.5 37.2 1702220 1559836 ? Ssl Jan26 97:30 /usr/sbin/bacula-fd -c /etc/bacula/bacula-fd.conf

- bconsole* status client=rigel-fd
  Connecting to Client rigel-fd at 161.53.160.6:9102

  rigel-fd Version: 5.0.3 (30 August 2010) x86_64-pc-linux-gnu debian 5.0.6
  Daemon started 26-Jan-11 13:00. Jobs: run=2 running=0.
   Heap: heap=1,059,315,712 smbytes=95,399 max_bytes=1,624,308,350 bufs=81 max_bufs=25,952
   Sizeof: boffset_t=8 size_t=8 debug=0 trace=0

  Running Jobs:
  Director connected at: 28-Jan-11 10:29
  No Jobs running.
  ====

  Terminated Jobs:
   JobId  Level    Files      Bytes   Status   Finished        Name
  ======================================================================
   81400  Incr    100,344    16.06 G  OK       18-Jan-11 23:36 rigel
   81582  Incr     90,352    15.59 G  OK       19-Jan-11 23:32 rigel
   81764  Incr    105,314    15.25 G  OK       20-Jan-11 23:10 rigel
   81915  Diff    763,162    22.64 G  OK       21-Jan-11 23:04 rigel
   82129  Incr     41,646    13.32 G  OK       22-Jan-11 22:32 rigel
   82311  Incr     71,603    11.70 G  OK       23-Jan-11 22:27 rigel
   82494  Incr    376,851    16.33 G  OK       24-Jan-11 23:39 rigel
   82679  Incr    128,509    15.96 G  OK       26-Jan-11 00:08 rigel
   82858  Incr    235,962    16.74 G  OK       26-Jan-11 23:52 rigel
   83037  Incr    300,308    15.45 G  OK       27-Jan-11 23:55 rigel
  ====

- Our munin graphs show that apps memory increased by 1.5GB+ when the bacula
  backup started, but was not returned to the system even 10 hours after the
  backup finished (that is, 16+ hours after the backup started).

Our previous experience shows that the memory usage won't go down (but it
also won't go up, at least not by much) until bacula-fd is restarted.

Additional Information:
bacula-dir, bacula-sd, and bacula-fd are all the same version: 5.0.3 plus
updates from GIT Branch-5.0 as of 2010-11-22.

The same bug was observed when bacula-fd was replaced with pure 5.0.2 (from
debian backports, version 5.0.2-1~bpo50+1), with sd and dir staying at
5.0.3+git.

The client running bacula-fd has 4GB of RAM and backs up about 6.5 million
files in a full backup.

This bug might (or might not) be related to bug#1686 (which was rejected due
to an unsupported mix of versions and other deficiencies).

It might not be related to accurate=yes, but since the bacula-fd memory usage
is *much* lower without it, we wouldn't have noticed it as a problem...

Let me know if I can gather more data to help troubleshoot this or if you
have patches for me to try.

Thanks!
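
For anyone wanting to compare the two status snapshots above numerically, the Heap: lines can be parsed and diffed. The following is only a minimal Python sketch and is not part of the original report: parse_heap is a hypothetical helper, the regular expression assumes the Heap: line layout exactly as quoted above, and reading heap as the overall process heap size and smbytes as the bytes currently allocated is an assumption rather than something the report confirms.

```python
import re

# Field layout assumed from the "Heap:" lines quoted in this report.
HEAP_FIELDS = ("heap", "smbytes", "max_bytes", "bufs", "max_bufs")
HEAP_RE = re.compile(
    r"Heap:\s+" + r"\s+".join(rf"{name}=([\d,]+)" for name in HEAP_FIELDS)
)


def parse_heap(status_text):
    """Hypothetical helper: extract the Heap: counters as integers."""
    match = HEAP_RE.search(status_text)
    if match is None:
        raise ValueError("no Heap: line found in status output")
    # Strip the thousands separators before converting.
    return {
        name: int(value.replace(",", ""))
        for name, value in zip(HEAP_FIELDS, match.groups())
    }


# The two Heap: lines copied from the status output in the report.
first_run = parse_heap(
    "Heap: heap=249,856 smbytes=20,653 max_bytes=1,617,771,856 "
    "bufs=59 max_bufs=13,235"
)
second_run = parse_heap(
    "Heap: heap=1,059,315,712 smbytes=95,399 max_bytes=1,624,308,350 "
    "bufs=81 max_bufs=25,952"
)

heap_growth = second_run["heap"] - first_run["heap"]
print(f"heap grew by {heap_growth:,} bytes between the two checks")
print(f"smbytes after second run: {second_run['smbytes']:,} bytes")
```

If the assumption about the fields holds, a large heap value next to a small smbytes value after the second run would suggest the memory was freed inside the daemon but not handed back to the operating system, which is worth distinguishing from a true leak.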
======================================================================
Issue History
Date Modified    Username       Field                    Change
======================================================================
2011-01-28 10:23 mnalis         New Issue
======================================================================