From: Mantis B. T. <no...@bu...> - 2012-10-20 20:55:36
The following issue has been SUBMITTED.
======================================================================
http://bugs.bacula.org/view.php?id=1944
======================================================================
Reported By:                alan
Assigned To:
======================================================================
Project:                    bacula
Issue ID:                   1944
Category:                   scheduling
Reproducibility:            random
Severity:                   minor
Priority:                   normal
Status:                     new
======================================================================
Date Submitted:             2012-10-20 21:55 BST
Last Modified:              2012-10-20 21:55 BST
======================================================================
Summary:                    All jobs wait for one job to finish, despite separate storage/pool and a large Maximum Concurrent Jobs.
Description:
All jobs wait for one job (or 2-3 running jobs) to finish, even though each job has its own storage/pool and Maximum Concurrent Jobs is set high. The problem occurs on all our servers (Bacula 5.2.x on CentOS 5-6 and Ubuntu 12), but not always.

Output of "status dir"
----------------------------------
Running Jobs:
Console connected at 20-Oct-12 23:18
 JobId Level   Name                                      Status
======================================================================
    67 Virtual vnshw53.hostex.lt.2012-10-19_23.56.55_04  is running
    70         RestoreFiles.2012-10-20_18.45.42_53       is waiting for higher priority jobs to finish
    71 Increme vnshw33.hostex.lt.2012-10-20_20.01.00_54  is waiting execution
    72 Increme vnshw44.hostex.lt.2012-10-20_20.01.00_55  is waiting execution
    73 Increme vnshw53.hostex.lt.2012-10-20_21.11.00_56  is waiting execution
    74 Full    BackupCatalog.2012-10-20_23.10.00_03      is waiting execution
====
----------------------------------

Output of SD status
----------------------------------
Device status:
Device "bacula-dir1_FileStorage" (/mnt/cephfs/storage/bacula-dir1_FileStorage) is not open.
Device "vnshw21_FileStorage" (/mnt/cephfs/storage/vnshw21_FileStorage) is not open.
Device "vnshw33_FileStorage" (/mnt/cephfs/storage/vnshw33_FileStorage) is not open.
Device "vnshw44_FileStorage" (/mnt/cephfs/storage/vnshw44_FileStorage) is mounted with:
    Volume:      vnshw53_Label4090
    Pool:        vnshw53_Pool2
    Media type:  vnshw44_File
    Total Bytes=417,586,398 Blocks=6,473 Bytes/block=64,512
    Positioned at File=0 Block=417,586,397
Device "vnshw49_FileStorage" (/mnt/cephfs/storage/vnshw49_FileStorage) is not open.
Device "vnshw53_FileStorage" (/mnt/cephfs/storage/vnshw53_FileStorage) is mounted with:
    Volume:      vnshw53_Label2915
    Pool:        *unknown*
    Media type:  vnshw53_File
    Total Bytes Read=276,885,504 Blocks Read=4,292 Bytes/block=64,512
    Positioned at File=0 Block=276,821,227
Device "hosting1.teo.lt_FileStorage" (/mnt/cephfs/storage/hosting1.teo.lt_FileStorage) is not open.
====
----------------------------------

As you can see, every job has its own storage, pool and label: storage /mnt/cephfs/storage/SERVERNAME_FileStorage, pool SERVERNAME_Pool and label SERVERNAME_Label.

I am now testing a VirtualFull job that copies to Next Pool = vnshw53_Pool2, where Storage = vnshw44_Storage.

Why is the status of jobs 70 and 71 "waiting execution"?

Priority Restore        = 9
Priority Backup         = 10
Priority Backup Catalog = 11

All clients have Maximum Concurrent Jobs = 10, SD Maximum Concurrent Jobs = 50, Dir Maximum Concurrent Jobs = 50.

This Bacula server uses CephFS and a remote MySQL (on SSD). The other servers use local SAS/SATA disks and a local MySQL on SATA/SAS. We run MySQL with both InnoDB and MyISAM. The servers hold 10-50 billion files (500G-6T).

Additional Information:
standard client-dir config file: vnshw33.hostex.lt-dir.conf
----------------------------------
Client {
  Name = vnshw33.hostex.lt-fd
  Address = vnshw33.hostex.lt
  FDPort = 9102
  Catalog = MyCatalog
  Password = "PASS"
  File Retention = 20 days
  Job Retention = 20 days
  AutoPrune = yes
  Maximum Concurrent Jobs = 10
}

#FileSet {
# use def file set
#}

Storage {
  Name = vnshw33_Storage
  Address = bacula-dir1.hostex.lt ## Backup Host (FQDN)
  SDPort = 9103
  Password = "PASS"
  Device = vnshw33_FileStorage
  Media Type = vnshw33_File
  Maximum Concurrent Jobs = 10
}

Pool {
  Name = vnshw33_Pool
  Pool Type = Backup
  Storage = vnshw33_Storage
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 20 days
  Maximum Volume Bytes = 1G
  Maximum Volumes = 2000
  Label Format = "vnshw33_Label"
}

Schedule {
  Name = "vnshw33_WeeklyCycle2001"
  Run = Full mon at 20:01
  Run = Incremental tue-sun at 20:01
}

Job {
  Name = "vnshw33.hostex.lt"
  Client = vnshw33.hostex.lt-fd
  Level = Incremental
  Type = Backup
  Write Bootstrap = "/mnt/cephfs/working/vnshw33.hostex.lt.bsr"
  Pool = vnshw33_Pool
  FileSet = "Centos_VE_NODE_Def_FileSet"
  Schedule = "vnshw33_WeeklyCycle2001"
  Messages = Standard
}
----------------------------------

standard client-sd config file: vnshw33.hostex.lt-sd.conf
----------------------------------
Device {
  Name = vnshw33_FileStorage
  Media Type = vnshw33_File
  Archive Device = /mnt/cephfs/storage/vnshw33_FileStorage
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}
----------------------------------

cat bacula-dir.conf | grep -v '^#'
----------------------------------
Director { # define myself
  Name = bacula-dir1.hostex.lt-dir
  DIRport = 9101 # where we listen for UA connections
  QueryFile = "/etc/bacula/scripts/query.sql"
  WorkingDirectory = "/mnt/cephfs/working"
  PidDirectory = "/var/run/bacula"
  Maximum Concurrent Jobs = 50
  Password = "PASS" # Console password
  Messages = Daemon
}

Job {
  Name = "BackupCatalog"
  Client = bacula-dir1.hostex.lt-fd
  Level = Full
  Type = Backup
  Messages = Standard
  Pool = bacula-dir1_Pool
  FileSet="Catalog"
  Schedule = "WeeklyCycleAfterBackup"
  # This creates an ASCII copy of the catalog
  # Arguments to make_catalog_backup.pl are:
  #  make_catalog_backup.pl <catalog-name>
  RunBeforeJob = "/etc/bacula/hx_bin/bacula_dump_mysql.sh"
  # This deletes the copy of the catalog
  RunAfterJob = "/etc/bacula/hx_bin/bacula_mysql_cleanup.sh"
  Priority = 11 # run after main backup
}

Job {
  Name = "RestoreFiles"
  Type = Restore
  Client=bacula-dir1.hostex.lt-fd
  FileSet="Centos_HW_Def_FileSet"
  Pool = bacula-dir1_Pool
  Messages = Standard
  Where = /bacula-restores
  Allow Mixed Priority = yes
  Priority = 9
}

Schedule {
  Name = "WeeklyCycleAfterBackup"
  Run = Full sun-sat at 23:10
}

FileSet {
  Name = "Catalog"
  Include {
    Options {
      signature = MD5
    }
  }
}

Catalog {
  Name = MyCatalog
  dbname = "BASE"; DB Address = "bacula-m1.hostex.lt"; dbuser = "USER"; dbpassword = "PASS"
}

Messages {
  Name = Standard
  mailcommand = "/usr/lib/bacula/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula: %t %e of %c %l\" %r"
  operatorcommand = "/usr/lib/bacula/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula: Intervention needed for %j\" %r"
  mail = root@localhost = all, !skipped
  operator = root@localhost = mount
  console = all, !skipped, !saved
  append = "/mnt/cephfs/working/log" = all, !skipped
  catalog = all
}

Messages {
  Name = Daemon
  mailcommand = "/usr/lib/bacula/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula daemon message\" %r"
  mail = root@localhost = all, !skipped
  console = all, !skipped, !saved
  append = "/mnt/cephfs/working/log" = all, !skipped
}

@/etc/bacula/conf/tpl_def_file_set-dir.conf
@/etc/bacula/conf/bacula-dir1.hostex.lt-dir.conf
@/etc/bacula/conf/vnshw21.hostex.lt-dir.conf
@/etc/bacula/conf/vnshw33.hostex.lt-dir.conf
@/etc/bacula/conf/vnshw44.hostex.lt-dir.conf
@/etc/bacula/conf/vnshw49.hostex.lt-dir.conf
@/etc/bacula/conf/vnshw53.hostex.lt-dir.conf
@/etc/bacula/conf/hosting1.teo.lt-dir.conf
----------------------------------

cat bacula-sd.conf | grep -v '^#'
----------------------------------
Storage { # definition of myself
  Name = bacula-dir1.hostex.lt-sd
  SDPort = 9103 # Director's port
  WorkingDirectory = "/mnt/cephfs/working"
  Pid Directory = "/var/run/bacula"
  Maximum Concurrent Jobs = 50
}

Director {
  Name = bacula-dir1.hostex.lt-dir
  Password = "PASS"
}

Messages {
  Name = Standard
  director = bacula-dir1.hostex.lt-dir = all
}

@/etc/bacula/conf/bacula-dir1.hostex.lt-sd.conf
@/etc/bacula/conf/vnshw21.hostex.lt-sd.conf
@/etc/bacula/conf/vnshw33.hostex.lt-sd.conf
@/etc/bacula/conf/vnshw44.hostex.lt-sd.conf
@/etc/bacula/conf/vnshw49.hostex.lt-sd.conf
@/etc/bacula/conf/vnshw53.hostex.lt-sd.conf
@/etc/bacula/conf/hosting1.teo.lt-sd.conf
----------------------------------

Mysql show processlist
----------------------------------
mysql> show processlist;
+--------+-------------+-----------------------------+-------------+---------+------+-------+------------------+
| Id     | User        | Host                        | db          | Command | Time | State | Info             |
+--------+-------------+-----------------------------+-------------+---------+------+-------+------------------+
| 242470 | bacula_dir1 | bacula-dir1.hostex.lt:49611 | bacula_dir1 | Sleep   |   50 |       | NULL             |
| 242471 | bacula_dir1 | bacula-dir1.hostex.lt:49615 | bacula_dir1 | Sleep   |  752 |       | NULL             |
| 255548 | bacula_dir1 | bacula-dir1.hostex.lt:53899 | NULL        | Query   |    0 | NULL  | show processlist |
+--------+-------------+-----------------------------+-------------+---------+------+-------+------------------+
3 rows in set (0.01 sec)
----------------------------------

======================================================================
Issue History
Date Modified    Username       Field                    Change
======================================================================
2012-10-20 21:55 alan           New Issue
======================================================================
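
The priorities listed in the report (Restore = 9, Backup = 10, Backup Catalog = 11) and the "Allow Mixed Priority = yes" line on the RestoreFiles job suggest one possible reading of the queue state. The Python sketch below is a simplified model of priority gating as the Bacula documentation describes it for the Priority and Allow Mixed Priority directives: jobs of only one priority run at a time unless every running job allows mixing, and a blocked job at the head of the queue holds back the jobs behind it. It is not the Director's actual code. The names Job and startable are hypothetical, and job 67 is assumed to run at priority 10 without Allow Mixed Priority.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    priority: int              # lower number = more urgent, as in Bacula
    allow_mixed: bool = False  # stands in for "Allow Mixed Priority = yes"


def startable(running, waiting, max_concurrent):
    """Return the waiting jobs this simplified gate would start now."""
    queue = sorted(waiting, key=lambda job: job.priority)
    if not queue:
        return []
    if running:
        # The gate is the most urgent priority among the running jobs.
        gate = min(job.priority for job in running)
        all_mixed = all(job.allow_mixed for job in running)
    else:
        gate = queue[0].priority
        all_mixed = False
    started = []
    for job in queue:
        if len(running) + len(started) >= max_concurrent:
            break
        same_priority = job.priority == gate
        may_mix = job.priority < gate and job.allow_mixed and all_mixed
        if not (same_priority or may_mix):
            break  # the first blocked job also holds back everything behind it
        started.append(job)
    return started


# Queue from the "status dir" output; job 67 is assumed to have priority 10.
running = [Job(67, 10)]
waiting = [Job(70, 9, allow_mixed=True), Job(71, 10), Job(72, 10),
           Job(73, 10), Job(74, 11)]
print([job.job_id for job in startable(running, waiting, 50)])  # prints []
```

In this model the restore job (priority 9) cannot start because running job 67 does not itself allow mixed priorities, and the priority 10 incrementals are queued behind the restore, so nothing starts even though Maximum Concurrent Jobs is 50. Whether this is what happens on the reporter's server would need to be confirmed against the Director itself.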