The list-backup command fails intermittently when it is executed while a
cron command is running in parallel, in a configuration with multiple databases. Sample log output can be found here.
I believe the issue is caused by IO buffering in the
cron method in cli.py. In this loop:

    with lockfile(filename) as locked:
        if not locked:
            yield "ERROR: Another cron is running"
            raise SystemExit, 1
        else:
            servers = [ Server(conf) for conf in barman.__config__.servers()]
            for server in servers:
                for lines in server.cron(verbose=True):
                    yield lines

server.cron() in turn calls backup_manager.cron(), which does:

    with self.server.xlogdb('a') as fxlogdb:
        ...
        # this is the only write to fxlogdb
        fxlogdb.write("%s\t%s\t%s\t%s\n" % (basename, size, time, self.config.compression))

The call to self.server.xlogdb('a') uses the context manager interface to manage the lock file around the open file. The file is opened with open(xlogdb, mode), which (by omitting the third parameter, buffering) means that IO on the file will be buffered.
The opened xlogdb is never explicitly closed or flushed. If I'm not mistaken, once the for server in servers loop in cron completes, there can still be buffered data waiting to be written to the xlogdb when the lock is relinquished and another loop begins.
I think this may be the cause of these errors. When I investigate these errors while no cron task is running, list-backup completes successfully and the xlog.db appears normal.
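
To illustrate the kind of behaviour I suspect, here is a minimal, self-contained sketch. It is written for Python 3 and is not Barman code: the helper name xlogdb_flushed, the demo function and the sample xlogdb line are all made up for the demonstration, and it does not take any lock. It only shows that a buffered write is invisible to a second reader until the file is flushed, and what a context manager that flushes and closes before returning (and so before any surrounding lock would be released) might look like.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def xlogdb_flushed(path, mode="a"):
    """Hypothetical helper: guarantee the data is on disk before the
    with-block exits, i.e. before an enclosing lock would be released."""
    f = open(path, mode)
    try:
        yield f
    finally:
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to write it to disk
        f.close()


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Buffered write with no flush, as in the suspected scenario.
        writer = open(path, "a")
        writer.write("000000010000000000000001\t16777216\t1400000000.0\tNone\n")

        # A second process (simulated here by a second open) reads the file.
        with open(path) as reader:
            before_flush = reader.read()

        writer.flush()
        with open(path) as reader:
            after_flush = reader.read()
        writer.close()

        # Same write through the flushing context manager.
        with xlogdb_flushed(path) as f:
            f.write("000000010000000000000002\t16777216\t1400000001.0\tNone\n")
        with open(path) as reader:
            after_context_manager = reader.read()

        return before_flush, after_flush, after_context_manager
    finally:
        os.remove(path)


if __name__ == "__main__":
    before, after, after_cm = demo()
    print("lines visible before flush:", before.count("\n"))
    print("lines visible after flush:", after.count("\n"))
    print("lines visible after flushing context manager:", after_cm.count("\n"))
```

If the real xlogdb context manager behaves like the unflushed writer in the first half of demo, a list-backup that acquires the lock right after cron releases it could read a truncated or incomplete xlog.db. Flushing and closing inside the context manager, while the lock is still held, would be one possible way to rule that out.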