#27 Fatal list-backup error: ERROR: Unhandled exception.

1.x
closed
2015-02-03
2013-08-23
No

I'm getting a fatal error when attempting to list backups for a given server. Here is the output:

barman list-backup db018
ERROR: Unhandled exception. See log file for more details.

And then in the log files I see:

2013-08-23 16:31:31,357 root ERROR: ERROR: Unhandled exception. See log file for more details.
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/barman-1.2.0-py2.6.egg/barman/cli.py", line 453, in main
    p.dispatch(pre_call=global_config, output_file=_output_stream)
  File "/usr/lib/python2.6/site-packages/argh-0.23.1-py2.6.egg/argh/helpers.py", line 47, in dispatch
    return dispatch(self, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/argh-0.23.1-py2.6.egg/argh/dispatching.py", line 121, in dispatch
    for line in lines:
  File "/usr/lib/python2.6/site-packages/argh-0.23.1-py2.6.egg/argh/dispatching.py", line 197, in _execute_command
    for line in result:
  File "/usr/lib/python2.6/site-packages/argh-0.23.1-py2.6.egg/argh/dispatching.py", line 185, in _call
    for line in result:
  File "/usr/lib/python2.6/site-packages/barman-1.2.0-py2.6.egg/barman/cli.py", line 128, in list_backup
    for line in server.list_backups():
  File "/usr/lib/python2.6/site-packages/barman-1.2.0-py2.6.egg/barman/server.py", line 398, in list_backups
    _, backup_wal_size, _, wal_until_next_size, _ = self.get_wal_info(backup)
  File "/usr/lib/python2.6/site-packages/barman-1.2.0-py2.6.egg/barman/server.py", line 508, in get_wal_info
    for name, size in self.get_wal_until_next_backup(backup):
  File "/usr/lib/python2.6/site-packages/barman-1.2.0-py2.6.egg/barman/server.py", line 484, in get_wal_until_next_backup
    name, size, _, _ = self.xlogdb_parse_line(line)
  File "/usr/lib/python2.6/site-packages/barman-1.2.0-py2.6.egg/barman/server.py", line 559, in xlogdb_parse_line
    raise ValueError("cannot parse line: %r" % (line,))
ValueError: cannot parse line: '000000010000146900000073\t5997656\t13770000000010000146D0000004C\t5359802\t1377027617.85\tgzip\n'

If I grep for part of this string in xlog.db, I see:

grep 1377027617.85 xlog.db
000000010000146900000073        5997656 13770000000010000146D0000004C   5359802 1377027617.85   gzip

This line looks different from the others. It looks as though one line has clobbered another, or has been joined with the previous line.
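
A minimal sketch for locating lines like this one. It assumes each xlog.db record has exactly four tab-separated fields (name, size in bytes, Unix timestamp, compression), a layout inferred only from the sample lines in this ticket. The helper names `is_valid_xlog_line` and `find_bad_lines` are hypothetical and are not part of Barman.

```python
def is_valid_xlog_line(line):
    """Return True if the line matches the assumed 4-field layout."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        return False
    _name, size, timestamp, _compression = fields
    # Size is assumed to be a plain non-negative integer.
    if not size.isdigit():
        return False
    # Timestamp is assumed to parse as a float (e.g. 1377027617.85).
    try:
        float(timestamp)
    except ValueError:
        return False
    return True


def find_bad_lines(lines):
    """Return (line_number, line) pairs for lines that fail the check."""
    return [
        (number, line)
        for number, line in enumerate(lines, 1)
        if not is_valid_xlog_line(line)
    ]
```

Calling `find_bad_lines(open("xlog.db"))` would list the offending records with their line numbers, without modifying the file. The line quoted in the traceback above fails the check because it splits into six fields instead of four.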

If I remove this line, I can proceed with listing backups, but I suspect doing so may invalidate or corrupt a backup.

Please advise.

Thanks,
Damon

Discussion

  • Damon  Snyder

    Damon Snyder - 2013-08-23

    Here is some more context from xlog.db:

    000000010000146900000071        5229494 1377023101.0    gzip
    000000010000146900000072        5007091 1377023114.91   gzip
    000000010000146900000073        5997656 13770000000010000146D0000004C   5359802 1377027617.85   gzip
    000000010000146D0000004D        5028603 1377027629.0    gzip
    000000010000146D0000004E        5303137 1377027648.0    gzip
    

    Damon

     
  • Damon  Snyder

    Damon Snyder - 2013-08-23

    I think there may be a problem with the locking mechanism. I tested barman.lockfile using this script. If you run it a few times you will see similar behavior to what I'm seeing in the logs.

    We have 15 servers configured with barman. Most of them are read heavy. Adding the 15th server and the additional write load seemed to push us over a threshold where we started seeing this issue. We run 'barman cron' every 10 minutes.

     
    Last edit: Damon Snyder 2013-10-28
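
For illustration, here is one way to serialize appends to a shared file so that records from concurrent writers cannot interleave. This is a sketch using the standard library's `fcntl.flock` on Unix. It is not the implementation of barman.lockfile and not the fix that shipped in Barman, and the function name `append_record` is hypothetical.

```python
import fcntl
import os


def append_record(path, record):
    """Append one record to path while holding an exclusive advisory lock."""
    with open(path, "a") as f:
        # Blocks until no other cooperating process holds the lock.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(record)
            # Push the buffered data to the file before the lock is released,
            # so another writer never sees a partially written record.
            f.flush()
            os.fsync(f.fileno())
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

The lock is advisory, so it only helps if every process that writes to the file goes through the same function.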
  • Rafael Martinez

    Rafael Martinez - 2013-10-17

    We are having a similar problem with intermittent "Unhandled exception" errors
    when we try, for example, to get a list of backups for a server under heavy load that is producing a lot of WAL files.

    The error message is not 100% the same as the one in this ticket, but the result is.

    We get this error:

    [barman@pg-backup ~]$ barman list-backup dbpg_hotel_utv
    ERROR: Unhandled exception. See log file for more details.

    Under /pg_backup/barman/dbpg_hotel_utv/incoming we have a few thousand
    WAL files waiting to get compressed and archived. Barman is running and compressing/moving WAL files.

    Under /pg_backup/barman/dbpg_hotel_utv/wals, we have over 40000 WAL
    files already archived for this server.

    We get this type of error in the log file:



    [.......]
    2013-10-15 16:06:24,142 barman.backup INFO: Processed file /pg_backup/barman/dbpg_hotel_utv/incoming/00000001000001C200000088

    2013-10-15 16:06:24,793 root ERROR: ERROR: Unhandled exception. See log file for more details.
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/barman/cli.py", line 453, in main
        p.dispatch(pre_call=global_config, output_file=_output_stream)
      File "/usr/lib/python2.6/site-packages/argh/helpers.py", line 53, in dispatch
        return dispatch(self, *args, **kwargs)
      File "/usr/lib/python2.6/site-packages/argh/dispatching.py", line 123, in dispatch
        for line in lines:
      File "/usr/lib/python2.6/site-packages/argh/dispatching.py", line 199, in _execute_command
        for line in result:
      File "/usr/lib/python2.6/site-packages/argh/dispatching.py", line 187, in _call
        for line in result:
      File "/usr/lib/python2.6/site-packages/barman/cli.py", line 128, in list_backup
        for line in server.list_backups():
      File "/usr/lib/python2.6/site-packages/barman/server.py", line 398, in list_backups
        _, backup_wal_size, _, wal_until_next_size, _ = self.get_wal_info(backup)
      File "/usr/lib/python2.6/site-packages/barman/server.py", line 508, in get_wal_info
        for name, size in self.get_wal_until_next_backup(backup):
      File "/usr/lib/python2.6/site-packages/barman/server.py", line 484, in get_wal_until_next_backup
        name, size, _, _ = self.xlogdb_parse_line(line)
      File "/usr/lib/python2.6/site-packages/barman/server.py", line 559, in xlogdb_parse_line
        raise ValueError("cannot parse line: %r" % (line,))
    ValueError: cannot parse line: '00000001'

    2013-10-15 16:06:24,933 barman.backup INFO: Processed file /pg_backup/barman/dbpg_hotel_utv/incoming/00000001000001C200000089

    Is this the same problem? How can we fix it?

    Thanks in advance

     
  • Damon  Snyder

    Damon Snyder - 2013-10-24

    That's the same problem I was having (note that the error comes from the same line in the Python script). You should be able to resolve the issue (going forward) by upgrading to 1.2.3.

    Unfortunately the xlog.db files may be corrupted. To "fix" them, you'll have to find the corrupted line and either fix it or remove it. That should allow the cron to finish.

    You could do something like grep -n "^00000001$" xlog.db and then delete the line that it finds.

    Hope this helps.

    Damon

     
  • Gabriele Bartolini

    • labels: --> WAL database
    • status: open --> accepted
    • assigned_to: Marco Nenciarini
     
  • Gabriele Bartolini

    The most sensible thing to do is for Barman to rebuild the WAL archive. We will work on something for the upcoming 1.2.4 release.

     
  • Gabriele Bartolini

    • status: accepted --> closed
     
