From: Kenneth P. <sh...@se...> - 2010-11-29 00:27:25
|
I've put a backtrace of a segfault in restore 0.4b43 here:

<http://pastebin.com/kBhdnMxb>

I'm verifying a backup of an entire partition from a file on a mounted USB drive. The file being compared exists, but it looks like one that's probably been logrotated since the backup.

Summary:

Core was generated by `/sbin/restore -C -l -L 10000 -b 64 -f /mnt/Backup/0/root/dump -a'.
#0  readxattr (buffer=0xbfd76bf8 "") at tape.c:1294
1294        if (curfile.dip->di_size > XATTR_MAXSIZE) {
(gdb) bt
#0  readxattr (buffer=0xbfd76bf8 "") at tape.c:1294
#1  0x080548bc in compareattr (name=0x8069ec7 "./var/log/named/queries") at tape.c:1731
#2  0x0805789d in comparefile (name=0x8069ec7 "./var/log/named/queries") at tape.c:1946
#3  0x0805025a in compare_entry (ep=0x18633ed0, do_compare=1) at restore.c:694
#4  0x0805049f in compareleaves () at restore.c:748
#5  0x0804ed98 in main (argc=Cannot access memory at address 0x0
) at main.c:475
|
From: Kenneth P. <sh...@se...> - 2010-12-03 04:58:27
|
--On Sunday, November 28, 2010 4:26 PM -0800 Kenneth Porter <sh...@se...> wrote:

> #0 readxattr (buffer=0xbfd76bf8 "") at tape.c:1294
> 1294 if (curfile.dip->di_size > XATTR_MAXSIZE) {

I got a chance to start to look at this and I see this:

(gdb) print curfile
$1 = {name = 0x805e22d "EA block", ino = 0, dip = 0x0, action = 3 '\003'}

Is dip supposed to have a value in this case, when comparing extended attributes?
|
From: Kenneth P. <sh...@se...> - 2010-12-06 06:00:30
|
Workaround patch added to ticket here: <https://sourceforge.net/tracker/?func=detail&aid=3129314&group_id=1306&atid=101306> It looks like the skipped test is trying to avoid a buffer overrun, so a more comprehensive fix that correctly acquires the buffer size is still needed. |
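The patch itself lives only in the tracker, so the following is not that patch. It is a minimal sketch of the kind of guard under discussion, assuming simplified stand-in types: `struct dinode_stub`, `struct context_stub`, `xattr_size_check()` and the `XATTR_MAXSIZE_SKETCH` value are all hypothetical names invented for illustration, not the dump/restore definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real dump/restore types. */
#define XATTR_MAXSIZE_SKETCH 65536L   /* assumed limit, chosen arbitrarily */

struct dinode_stub {
    long di_size;                     /* size recorded for the EA block */
};

struct context_stub {
    const char *name;
    unsigned long ino;
    struct dinode_stub *dip;          /* may be NULL, as in the gdb output */
    char action;
};

/*
 * Returns  1 if the size is known and within the limit,
 *          0 if the size is known and too large,
 *         -1 if dip is NULL, so the size cannot be checked at all.
 */
int xattr_size_check(const struct context_stub *cf)
{
    if (cf->dip == NULL)
        return -1;                    /* size unknown: caller must decide */
    if (cf->dip->di_size > XATTR_MAXSIZE_SKETCH)
        return 0;
    return 1;
}
```

Returning a distinct value for the NULL case keeps the "size unknown" condition visible to the caller instead of silently skipping the overrun check, which is the gap a more comprehensive fix would still have to close.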
From: Kenneth P. <sh...@se...> - 2010-12-06 23:07:01
|
--On Sunday, December 05, 2010 10:00 PM -0800 Kenneth Porter <sh...@se...> wrote:

> It looks like the skipped test is trying to avoid a buffer overrun, so a
> more comprehensive fix that correctly acquires the buffer size is still
> needed.

Stelian, I'm guessing you're swamped in other work. If you can coach me on what was intended and where to look for how that structure gets populated, I might be able to go back and fix the underlying issue.
|
From: Stelian P. <st...@po...> - 2010-12-06 15:57:10
|
Hi Kenneth,

I've finally found a few minutes to look at your reported bug, and I'm not sure what happens here:

> #0 readxattr (buffer=0xbfd76bf8 "") at tape.c:1294
> 1294 if (curfile.dip->di_size > XATTR_MAXSIZE) {

As you wrote in a follow-up, curfile is NULL here, exhibiting the bug.

> (gdb) bt
> #0 readxattr (buffer=0xbfd76bf8 "") at tape.c:1294
> #1 0x080548bc in compareattr (name=0x8069ec7 "./var/log/named/queries")
> at tape.c:1731
> #2 0x0805789d in comparefile (name=0x8069ec7 "./var/log/named/queries")
> at tape.c:1946

... but it wasn't NULL at the beginning of comparefile(), so we can assume that somehow during the file extraction (getfile(), called from tape.c:1909), restore finds an inode (in findinode()) which is NOT a TS_INODE (so curfile.dip and curfile.ino are not initialized) but which satisfies the test on line 1718: "spcl.c_flags & DR_EXTATTRIBUTES"...

Are you able to reproduce the problem? If so, you could use gdb, based on the analysis above, to see what happens...

> #3 0x0805025a in compare_entry (ep=0x18633ed0, do_compare=1) at
> restore.c:694
> #4 0x0805049f in compareleaves () at restore.c:748
> #5 0x0804ed98 in main (argc=Cannot access memory at address 0x0
> ) at main.c:475

Thanks,

Stelian.
-- 
Stelian Pop <st...@po...>
|
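The state hypothesized above (the extended-attributes flag test passes while the inode pointer was never filled in) can be written as a small predicate. This is only a sketch under stated assumptions: `struct header_stub`, `struct curfile_stub`, `ea_state_is_suspect()` and the two `_SKETCH` constants are hypothetical stand-ins with placeholder values, not the definitions from the dump/restore headers.

```c
#include <stddef.h>

/* Placeholder values for illustration only; the real TS_INODE and
 * DR_EXTATTRIBUTES constants live in the dump/restore headers and
 * may differ from these. */
#define TS_INODE_SKETCH          2
#define DR_EXTATTRIBUTES_SKETCH  0x8000

struct header_stub {
    int c_type;                /* record type; kept so a caller can log it */
    int c_flags;               /* header flags, including the EA flag */
};

struct curfile_stub {
    unsigned long ino;
    void *dip;                 /* NULL when no inode data was attached */
};

/*
 * Returns 1 when the EA flag test would pass although dip is NULL,
 * i.e. the combination that would lead to the NULL dereference;
 * returns 0 otherwise.
 */
int ea_state_is_suspect(const struct header_stub *h,
                        const struct curfile_stub *cf)
{
    int ea_flag_set = (h->c_flags & DR_EXTATTRIBUTES_SKETCH) != 0;
    return ea_flag_set && cf->dip == NULL;
}
```

A check of this shape, placed just before the size test, would be one way to confirm in a debugging build whether that combination really occurs, and `c_type` could be logged at that point to see which record type triggered it.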
From: Kenneth P. <sh...@se...> - 2010-12-06 23:43:17
|
--On Monday, December 06, 2010 4:57 PM +0100 Stelian Pop <st...@po...> wrote:

> As you wrote in a follow-up, curfile is NULL here, exhibiting
> the bug.

Minor correction: It's curfile.dip, not curfile, that's NULL.

With my workaround patch I was able to get my verify to complete, so now I'll see if a subsequent backup elicits the error and try your suggestions.
|
From: Stelian P. <st...@po...> - 2010-12-07 14:06:17
|
On Mon, Dec 06, 2010 at 03:42:54PM -0800, Kenneth Porter wrote:

> --On Monday, December 06, 2010 4:57 PM +0100 Stelian Pop
> <st...@po...> wrote:
>
> >As you wrote in a follow-up, curfile is NULL here, exhibiting
> >the bug.
>
> Minor correction: It's curfile.dip, not curfile, that's NULL.

Yes, it is a typo, the rest of the analysis is still valid.

> With my workaround patch I was able to get my verify to complete, so
> now I'll see if a subsequent backup elicits the error and try your
> suggestions.

Ok, thanks !

Stelian.
-- 
Stelian Pop <st...@po...>
|
From: Kenneth P. <sh...@se...> - 2010-12-23 13:30:36
|
I just wanted to follow up to say I haven't seen the problem recur, so I haven't been able to debug it further. My original backup had gotten far enough away from the disk contents that the restore was failing from too many miscompares so I figured I'd put the media back into rotation until I saw the problem again. |
From: Stelian P. <st...@po...> - 2010-12-27 13:36:34
|
Hi Kenneth,

On Thu, Dec 23, 2010 at 05:29:51AM -0800, Kenneth Porter wrote:

> I just wanted to follow up to say I haven't seen the problem recur, so I
> haven't been able to debug it further. My original backup had gotten far
> enough away from the disk contents that the restore was failing from too
> many miscompares so I figured I'd put the media back into rotation until I
> saw the problem again.

Thanks for the update !

Stelian.
-- 
Stelian Pop <st...@po...>
|