From: Andreas H. <ahn...@el...> - 2007-05-31 21:14:19
Hello Tom,

since you aren't making the more obvious mistakes, it *might* be the case
that Bacula is making the mistake. So let me bring up a few more, subtler
points :-)

1. I had a problem with tapeinfo resetting my SCSI bus, so I would
   recommend commenting it out until everything else is running.
   You might also check whether
       tapeinfo -f /dev/scsi/changer/c2t0d0
   works flawlessly (correct output, no messages in syslog).
   BTW: I don't understand why the example tapealert line checks for
   errors of the *changer* instead of checking the tape drive.

2. Have you checked whether your problem is really a problem of the job?
   Are these jobs failing every time or only sometimes?
   Have you tried rescheduling them after they have failed?

3. There has recently been some discussion about a *possible* bug, which
   *might* be the cause of Bacula not finding the correct tape for a job
   when using multidrive autochangers.
   So, if you can risk using beta software in your configuration, I would
   suggest you give V2.1.10 a try.
   (Just a warning: this version still seems to have some bugs with
   multidrive autochangers (e.g. http://bugs.bacula.org/view.php?id=864),
   but it also seems that these bug(s) are less serious than the one(s)
   in V2.0.3.)
   And don't use the reload command with version 2.1.10; it has a bug.

Just for clarification: this is only my personal experience (using V2.1.10
for about 10 days), and for *me* it works better with a multidrive
autochanger than V2.0.3. If you consider using beta software in a
production environment, you do it at your own risk ;-)

If you, or anyone else, is able to find more hints on how to reproduce
this bug in a test setup of V2.1.10, please add a note to the bug report
to help fix the problem.

Andreas

tom...@pr... wrote:
> Andreas,
>
>> -----Original Message-----
>> From: bac...@li... [mailto:bacula-users-bo...@li...] On Behalf Of Andreas Helmcke
>> Sent: Wednesday, May 30, 2007 6:03 PM
>> To: bac...@li...
>> Subject: Re: [Bacula-users] Problems with autochanger and bacula 2.0.3
>>
>> Hello,
>>
>> tom...@pr... wrote:
>>> Hello all,
>>>
>>> I am currently running bacula v2.0.3 on a Solaris 9 system with a
>>> Qualstar RLS-8236 Tape Library. The Library has 2 LTO-2 tape drives.
>>> Bacula has worked okay, but recently (after upgrading to 2.0.3 from
>>> 1.38.11) I am getting backup errors on a couple of clients.
>>>
>>> [...]
>>>
>>> Here is my autochanger and drive configuration from bacula-sd.conf:
>>>
>>> # An autochanger device with two drives
>>> #
>>> Autochanger {
>>>   Name = Autochanger-0
>>>   Device = LTO-0
>>>   Device = LTO-1
>>>   Changer Command = "/usr/local/bacula/etc/mtx-changer %c %o %S %a %d"
>>>   Changer Device = /dev/scsi/changer/c2t0d0
>>> }
>>>
>>> Device {
>>>   Name = LTO-0
>>>   Drive Index = 0
>>>   Media Type = LTO-2
>>>   Archive Device = /dev/rmt/0cbn
>>>   AutomaticMount = yes;  # when device opened, read it
>>>   AlwaysOpen = yes;
>>>   RemovableMedia = yes;
>>>   RandomAccess = no;
>>>   AutoChanger = yes
>>>   Autoselect = yes  # Default is yes but not using both drives
>>>   Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
>>>   Spool Directory = /local0/BACKUP
>>>   Maximum Spool Size=16777216000
>>>   Maximum Job Spool Size=10485760000
>>>   Maximum Network Buffer Size = 65536
>>> }
>>>
>>> Device {
>>>   Name = LTO-1
>>>   Drive Index = 1
>>>   Media Type = LTO-2
>>>   Archive Device = /dev/rmt/1cbn
>>>   AutomaticMount = yes;  # when device opened, read it
>>>   AlwaysOpen = yes;
>>>   RemovableMedia = yes;
>>>   RandomAccess = no;
>>>   AutoChanger = yes
>>>   Autoselect = yes  # Default is yes but not using both drives
>>>   Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
>>>   Spool Directory = /local3/BACKUP
>>>   Maximum Spool Size=29360128000
>>>   Maximum Job Spool Size=19922944000
>>>   Maximum Network Buffer Size = 65536
>>> }
>>>
>>> Here are the Daily01 and Daily02 pool definitions:
>>>
>>> Pool {
>>>   Name = Daily01
>>>   Pool Type = Backup
>>>   Recycle = yes  # Bacula can automatically recycle Volumes
>>>   AutoPrune = yes  # Prune expired volumes
>>>   Volume Use Duration = 21 days
>>>   Volume Retention = 60 days  # 2 Months
>>>   #Accept Any Volume = yes  # write on any volume in the pool
>>>   Cleaning Prefix = Clean
>>> }
>>>
>>> Pool {
>>>   Name = Daily02
>>>   Pool Type = Backup
>>>   Recycle = yes  # Bacula can automatically recycle Volumes
>>>   AutoPrune = yes  # Prune expired volumes
>>>   Volume Use Duration = 21 days
>>>   Volume Retention = 60 days  # 2 Months
>>>   #Accept Any Volume = yes  # write on any volume in the pool
>>>   Cleaning Prefix = Clean
>>> }
>> This looks correct.
>>
>>> If you need more information please let me know.
>>>
>> Storage and Job definitions in bacula-dir.conf would be helpful.
>
> Here is the Storage, Job, JobDefs and Schedule for the systems that are
> failing. I also included the Job and JobDefs for one of the systems that
> *is* working with the *02 pools.
>
> # Definition of LTO tape storage device
> Storage {
>   Name = Autochanger-0
>   # Do not use "localhost" here
>   Address = 172.16.10.45  # N.B. Use a fully qualified name here
>   SDPort = 9103
>   Password = ..........
>   Device = Autochanger-0
>   Media Type = LTO-2
>   Autochanger = yes
>   Maximum Concurrent Jobs = 4
> }
>
> JobDefs {
>   Name = "Windows-02"
>   Type = Backup
>   Level = Incremental
>   Storage = Autochanger-0
>   Pool = Weekly02
>   Messages = Standard
>   Priority = 10
>   Prefer Mounted Volumes = No
> }
>
> Job {
>   Name = "Grumpy"
>   JobDefs = "Windows-02"
>   Client = Grumpy
>   FileSet = "Daou Standard"
>   Schedule = "DailyCycle02-3"
>   SpoolData = yes
>   Write Bootstrap = "/usr/local/bacula/var/bacula/working/Grumpy.bsr"
> }
>
> Schedule {
>   Name = "DailyCycle02-3"
>   Run = Level=Full Pool=Monthly02 4th sun at 2:40
>   Run = Level=Full Pool=Weekly02 1st sun at 3:40
>   Run = Level=Differential Pool=Weekly02 2nd-5th sat at 3:40
>   Run = Level=Incremental Pool=Daily02 FullPool=Weekly02 mon-fri at 3:40
> }
>
> Job {
>   Name = "Sleepy"
>   JobDefs = "Windows-02"
>   Client = Sleepy
>   FileSet = "Sleepy"
>   Schedule = "DailyCycle02-2"
>   SpoolData = yes
>   Write Bootstrap = "/usr/local/bacula/var/bacula/working/Sleepy.bsr"
> }
>
> Schedule {
>   Name = "DailyCycle02-2"
>   Run = Level=Full Pool=Monthly02 4th sun at 2:20
>   Run = Level=Full Pool=Weekly02 1st sun at 3:20
>   Run = Level=Differential Pool=Weekly02 2nd-5th sat at 3:20
>   Run = Level=Incremental Pool=Daily02 FullPool=Weekly02 mon-fri at 3:20
> }
>
> Here is the information for one of the systems that *does not* fail; it
> uses the same JobDefs as the failing jobs:
>
> Job {
>   Name = "Happy"
>   JobDefs = "Windows-02"
>   Client = Happy
>   FileSet = "Happy"
>   Schedule = "DailyCycle02-1"
>   SpoolData = yes
>   Write Bootstrap = "/usr/local/bacula/var/bacula/working/Happy.bsr"
> }
>
> Schedule {
>   Name = "DailyCycle02-1"
>   Run = Level=Full Pool=Monthly02 4th sun at 2:08
>   Run = Level=Full Pool=Weekly02 1st sun at 3:06
>   Run = Level=Differential Pool=Weekly02 2nd-5th sat at 3:06
>   Run = Level=Incremental Pool=Daily02 FullPool=Weekly02 mon-fri at 3:06
> }
>
>> Please note that for v2.0.3 to work correctly with autochangers you
>> shouldn't write to the drives directly but always use the
>> autochanger device.
>
> That was the change I made when I upgraded from 1.38.11 to 2.0.3: I now
> use Autochanger-0 for the Storage directive in the JobDefs.
>
> Is there any other debugging I could enable in the daemons to see what
> is going on during the backup?
>
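
Regarding point 1 of the reply: the tapeinfo check could be scripted
roughly as below. This is only a sketch under assumptions. The function
name report_tapealerts is made up for illustration, and the filter simply
reuses the "grep TapeAlert" from the Alert Command quoted above, so it
assumes tapeinfo marks alert lines with the string TapeAlert. Syslog still
has to be inspected by hand afterwards.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bacula or mtx): reads tapeinfo output
# on stdin and reports whether it contains any TapeAlert lines.
report_tapealerts() {
    # Same filter as in the quoted Alert Command; "|| true" keeps the
    # pipeline from failing when grep finds nothing.
    alerts=$(grep TapeAlert || true)
    if [ -n "$alerts" ]; then
        printf 'TapeAlert lines found:\n%s\n' "$alerts"
        return 1
    fi
    echo "no TapeAlert lines found"
}
```

It would be used as
    tapeinfo -f /dev/scsi/changer/c2t0d0 | report_tapealerts
with the device path taken from the configuration above. A non-zero exit
status only means that the output contained TapeAlert lines; it says
nothing about whether the SCSI bus was reset.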