Re: [Bacula-users] Bacula doesn't properly finish its backups

On 15.12.2010 14:50, John Drescher wrote:
> On Wed, Dec 15, 2010 at 8:09 AM, Marc Richter <ric...@gm...> wrote:
>> Hi there,
>>
>> We have been running Bacula for several years now. We never had any
>> problems that we weren't able to understand and solve
>> ourselves. But now such a thing has happened, and I'd really appreciate
>> any help on this (even ideas!):
>>
>> We have moved all of our servers (~30) to another ISP. At that
>> time, we also changed the network structure from one big class C net
>> to several subnets.
>>
>> All of our servers are running the same bacula-fd version and are
>> configured identically, too. All but 2 of them are working perfectly.
>> The two that are not working have the problem that it "seems" as if the
>> backups are running, but they somehow do not finish correctly.
>>
>> First, let me show you the mail we get after each backup from a host
>> that is identical in hardware and network configuration to one of the
>> failing nodes:
>>
>> http://pastebin.com/cdKJ0jua
>>
>> This is the mail we get from the failing node:
>>
>> http://pastebin.com/qzT9tFXw
>>
>> As you might notice, this attempt runs for more than 2 hours and
>> uses several 4 GB media. So the backup seems to be done. But as you
>> can see in the mail linked above (which also has the subject "Bacula:
>> Full Backup Fatal Error fuer emyn-fd"), the job is failing.
>>
> 
> Did you manually cancel the failed job? The other backup is an
> incremental so you can not compare the time it should take versus the
> time to do a Full backup. To me it does not seem to be done at that
> point.
> 
> John

No, I didn't. It's the same every night and happens without any user
interaction.

I didn't mean to say that the failing job runs fine and to
completion. I just wanted to point out that during these 2
hours _something_ is transferred, and that therefore the cause can't be a
completely broken network config or the like. It is also not the case
that the node sends no data for over 2 hours, since several 4 GB
media are filled during this time.
It is certainly not a complete backup! When you add up the sizes of the
media, you arrive at roughly 32 GB.
The (uncompressed) file size on the node is around 371 GB, and
no GZIP could compress that much...
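
As a rough sanity check of that claim, here is a minimal sketch of the
arithmetic. The two sizes are the approximate figures from this mail; the
variable and function names are only illustrative and have nothing to do
with Bacula itself.

```python
# Approximate figures quoted in this mail (not exact measurements).
MEDIA_WRITTEN_GB = 32    # summed size of the media filled by the failing job
SOURCE_SIZE_GB = 371     # uncompressed file size on the node


def required_ratio(source_gb: float, written_gb: float) -> float:
    """Compression ratio needed for written_gb to hold all of source_gb."""
    return source_gb / written_gb


ratio = required_ratio(SOURCE_SIZE_GB, MEDIA_WRITTEN_GB)
fraction = MEDIA_WRITTEN_GB / SOURCE_SIZE_GB

print(f"required compression ratio: about {ratio:.1f}:1")
print(f"written data is about {fraction:.1%} of the source size")
```

A ratio of well over 11:1 across a whole mixed file system is far more
than I would expect from GZIP, so the job must stop long before all the
data has been sent.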

It is as if ... something just stops running :(

Best regards,
Marc