From: Pasi K. <pa...@ik...> - 2009-02-09 16:27:47
On Mon, Feb 09, 2009 at 04:58:02PM +0100, Kern Sibbald wrote:
> These kinds of errors are not good as I said before. Something is going
> wrong.
>
> It seems to me that you are writing to network mounted Volumes. Please remind
> me if this is the case. If it is, I am not willing to spend any time on this
> unless you can reproduce it on a locally mounted Volumes.
>
> Bacula does a lot of seeking around (forwards and backwards) on Volumes, and
> network mounted filesystems in my experience (both Samba and NFS) do not
> properly implement standard Unix filesystem calls (Samba does opportunistic
> locking so data is inconsistent on the two sides, which is absurd IMO; and
> NFS at a minimum does not implement ftruncate; one or both are or were
> limited to 4GB addresses -- who knows what other horrors are in their code).
> Doing a simple copy or a simple sequential read will surely work, but I'm not
> sure they can handle Bacula.
>

I'm using ext3 on an iSCSI LUN, so all the filesystem operations work just
like with locally attached storage. I don't have any SCSI/iSCSI errors in
dmesg/syslog.

> See below ...
>
> On Monday 09 February 2009 15:52:52 Pasi Kärkkäinen wrote:
> > On Thu, Jan 29, 2009 at 06:01:52PM +0100, Kern Sibbald wrote:
> > > OK, thanks for the feedback. From what I read, the problem is now
> > > resolved. If it is, OK no need to respond. If it is not, please let me
> > > know.
> >
> > Actually it seems I'm unfortunately still getting these errors.. now it's
> > with disk volumes created and also being copied (to tape) with the same
> > Bacula 2.5.29 version.
> >
> > Log of the copy job:
> > http://pasik.reaktio.net/bacula/debug/bacula-copy-job-volume-data-errors.txt
> >
> > (Also attached to this mail).
> >
> > Original job:
> >
> > *list jobid=9126
> > +-------+---------+---------------------+------+-------+----------+----------------+-----------+
> > | JobId | Name    | StartTime           | Type | Level | JobFiles | JobBytes       | JobStatus |
> > +-------+---------+---------------------+------+-------+----------+----------------+-----------+
> > | 9,126 | server1 | 2009-02-07 08:03:07 | B    | F     | 209,761  | 35,221,140,175 | T         |
> > +-------+---------+---------------------+------+-------+----------+----------------+-----------+
> >
> >
> > Copied job:
> >
> > *list jobid=9178
> > +-------+---------+---------------------+------+-------+----------+----------------+-----------+
> > | JobId | Name    | StartTime           | Type | Level | JobFiles | JobBytes       | JobStatus |
> > +-------+---------+---------------------+------+-------+----------+----------------+-----------+
> > | 9,178 | server1 | 2009-02-07 08:03:07 | C    | F     | 209,761  | 35,259,190,600 | T         |
> > +-------+---------+---------------------+------+-------+----------+----------------+-----------+
> >
> >
> > So.. at least the amount of files is the same for the original job and the
> > new copied job.
> >
> > I'm also wondering about the "Termination: Copying OK" when there are
> > _errors_ during the copy?
>
> Bacula believes that it has recovered from the Error. If the SD Errors line
> does not show some count, please report it as a bug.
>

SD Errors:              0
SD termination status:  OK
Termination:            Copying OK

-- Pasi
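
For anyone who wants to sanity-check the directory that holds the disk volumes, here is a minimal sketch of the operations the quoted concern mentions: seeking backwards, overwriting in place, and ftruncate. The function name and the returned keys are made up for illustration, and this does not reproduce Bacula's actual I/O pattern; it only exercises the basic calls on a small temporary file.

```python
import os
import tempfile


def check_seek_and_truncate(directory):
    """Run a few seek/overwrite/truncate operations inside `directory`.

    Hypothetical helper for illustration only: it is not part of Bacula
    and does not mimic how the Storage Daemon accesses Volumes.
    Returns a dict mapping a check name to True or False.
    """
    block = 4096
    results = {}
    # mkstemp opens the file read/write and returns a raw file descriptor.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Write three blocks sequentially.
        for tag in (b"A", b"B", b"C"):
            os.write(fd, tag * block)

        # Seek backwards and overwrite the middle block in place.
        os.lseek(fd, block, os.SEEK_SET)
        os.write(fd, b"X" * block)
        os.fsync(fd)

        # Read the whole file back and compare with what we expect.
        os.lseek(fd, 0, os.SEEK_SET)
        data = b""
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            data += chunk
        expected = b"A" * block + b"X" * block + b"C" * block
        results["overwrite_after_backward_seek"] = data == expected

        # Shrink the file to two blocks and check the reported size.
        os.ftruncate(fd, 2 * block)
        results["ftruncate_shrinks_file"] = os.fstat(fd).st_size == 2 * block

        # Seeking to the end should land on the truncated size.
        end = os.lseek(fd, 0, os.SEEK_END)
        results["seek_end_matches_size"] = end == 2 * block
    finally:
        os.close(fd)
        os.unlink(path)
    return results
```

Calling `check_seek_and_truncate()` with the volume directory as its argument and printing the result shows which checks passed. A clean result only says these basic calls behave as expected there; it does not rule out other causes for the volume data errors.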