#22 kfs_fuse fails if server is in recovery

Status: open
Owner: nobody
Priority: 5
Updated: 2008-12-28
Created: 2008-12-28
Creator: Eric Holmberg
Private: No

Failure During Recovery
-----------------------
If a read is attempted through the FUSE mount immediately after the cluster has been started, while the metaserver is still in recovery, kfs_fuse aborts with a failed lease assertion:

kfs_fuse: ~/kfs/trunk/src/cc/libkfsClient/LeaseClerk.cc:81: bool KFS::LeaseClerk::ShouldRenewLease(KFS::kfsChunkId_t): Assertion `iter != mLeases.end()' failed.

Steps to Reproduce:
1. Place a large file (I used a 12GB text file) on the file system
2. Stop servers
3. Open several command shells and run the following commands in sequence, within several seconds of each other:
a. ./trunk/scripts/kfslaunch.py -f machines.cfg --start
b. ./trunk/scripts/gdb --args kfs_fuse kfs-fuse -f
c. ./trunk/scripts/kfs-fuse/md5sum large-file.txt

Version tested:
SVN Rev 230, Last Changed Date: 2008-12-26 16:56:40 -0700 (Fri, 26 Dec 2008)

System info:
Ubuntu 8.10, Kernel 2.6.27-9-generic, amd64

Build settings:
g++ 4.3.2
cmake -Wno-dev -DJAVA_INCLUDE_PATH=/usr/lib/jvm/java-6-sun-1.6.0.10/include -DJAVA_INCLUDE_PATH2=/usr/lib/jvm/java-6-sun-1.6.0.10/include/linux -DUSE_STATIC_LIB_LINKAGE=off -DCMAKE_BUILD_TYPE=Debug -DDEBUG=1 ../

Metaserver Log
--------------
12-28-2008 10:44:12.327 DEBUG - (ClientSM.cc:82) Command getlayout: fid = 14, Status: 0
12-28-2008 10:44:12.329 DEBUG - (ClientSM.cc:195) Got command: getalloc: fid = 14 offset = 0
12-28-2008 10:44:12.329 DEBUG - (ClientSM.cc:82) Command getalloc: fid = 14 offset = 0, Status: 0
12-28-2008 10:44:12.330 DEBUG - (ClientSM.cc:195) Got command: lease acquire: read lease chunkId = 574
12-28-2008 10:44:12.330 INFO - (LayoutManager.cc:941) GetChunkReadLease: inRecovery() => EBUSY
12-28-2008 10:44:12.330 DEBUG - (ClientSM.cc:82) Command lease acquire: read lease chunkId = 574, Status: -16
12-28-2008 10:45:12.330 DEBUG - (ClientSM.cc:195) Got command: lease acquire: read lease chunkId = 574
12-28-2008 10:45:12.330 INFO - (LayoutManager.cc:941) GetChunkReadLease: inRecovery() => EBUSY
12-28-2008 10:45:12.330 DEBUG - (ClientSM.cc:82) Command lease acquire: read lease chunkId = 574, Status: -16
12-28-2008 10:46:12.330 DEBUG - (ClientSM.cc:195) Got command: lease acquire: read lease chunkId = 574
12-28-2008 10:46:12.330 INFO - (LayoutManager.cc:941) GetChunkReadLease: inRecovery() => EBUSY
12-28-2008 10:46:12.331 DEBUG - (ClientSM.cc:82) Command lease acquire: read lease chunkId = 574, Status: -16
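
The log above shows three read-lease requests for chunk 574, one minute apart, each rejected with Status -16 (EBUSY on Linux) because the metaserver is in recovery. As a rough illustration of how a client could handle that status explicitly, here is a minimal sketch of a bounded retry loop. `AcquireReadLeaseWithRetry` and the `tryAcquire` callback are hypothetical names invented for this example; they are not part of libkfsClient, and the real client code may be structured quite differently.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not a libkfsClient API): keep asking for a read lease
// while the server answers -EBUSY, up to maxAttempts tries.
// tryAcquire stands in for whatever call sends the "lease acquire" request
// and returns the server status (0 on success, negative errno on failure).
int AcquireReadLeaseWithRetry(const std::function<int()>& tryAcquire,
                              int maxAttempts,
                              std::chrono::milliseconds delay)
{
    int status = -EBUSY;
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        status = tryAcquire();
        if (status != -EBUSY) {
            return status;  // success (0) or some other error: stop retrying
        }
        if (attempt + 1 < maxAttempts) {
            std::this_thread::sleep_for(delay);  // wait before the next try
        }
    }
    // Still busy after all attempts: the caller should fail the read with an
    // error instead of continuing as if a lease had been granted.
    return status;
}
```

The point of the sketch is the final return: if the server never leaves recovery within the retry budget, the caller sees -EBUSY and can report an I/O error to FUSE rather than proceeding to code that assumes a lease exists.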

Chunkserver Log
---------------
12-28-2008 10:44:12.329 DEBUG - (ClientSM.cc:86) Command size: chunkId = 761 chunkversion = 1: Response status: 0
12-28-2008 10:44:12.329 DEBUG - (NetConnection.cc:66) Read 0 bytes...connection dropped
12-28-2008 10:44:12.330 DEBUG - (ClientSM.cc:295) Got command: size: chunkId = 574 chunkversion = 1
12-28-2008 10:44:12.330 DEBUG - (ClientSM.cc:86) Command size: chunkId = 574 chunkversion = 1: Response status: 0

Starting program: ~/kfs/trunk/build/bin/kfs_fuse kfs-fuse -f
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffd8ed6c6f0 (LWP 19118)]
[New Thread 0x4117b950 (LWP 19124)]
[New Thread 0x427f7950 (LWP 19125)]
[New Thread 0x40938950 (LWP 19129)]
[New Thread 0x41b2b950 (LWP 19130)]
kfs_fuse: ~/kfs/trunk/src/cc/libkfsClient/LeaseClerk.cc:81: bool KFS::LeaseClerk::ShouldRenewLease(KFS::kfsChunkId_t): Assertion `iter != mLeases.end()' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x41b2b950 (LWP 19130)]
0x00007ffd8d5eafd5 in raise () from /lib/libc.so.6
(gdb) backtrace
#0 0x00007ffd8d5eafd5 in raise () from /lib/libc.so.6
#1 0x00007ffd8d5ecb43 in abort () from /lib/libc.so.6
#2 0x00007ffd8d5e3d49 in __assert_fail () from /lib/libc.so.6
#3 0x0000000000474a9c in KFS::LeaseClerk::ShouldRenewLease (this=0x95d070, chunkId=574) at ~/kfs/trunk/src/cc/libkfsClient/LeaseClerk.cc:81
#4 0x0000000000471988 in KFS::KfsClientImpl::IsChunkLeaseGood (this=0x95d030, chunkId=574) at ~/kfs/trunk/src/cc/libkfsClient/KfsRead.cc:195
#5 0x0000000000472e5c in KFS::KfsClientImpl::IsChunkReadable (this=0x95d030, fd=5) at ~/kfs/trunk/src/cc/libkfsClient/KfsRead.cc:183
#6 0x0000000000472fcf in KFS::KfsClientImpl::Read (this=0x95d030, fd=5, buf=0x7ffd88000950 "\200", numBytes=16384) at ~/kfs/trunk/src/cc/libkfsClient/KfsRead.cc:112
#7 0x0000000000453f9c in KFS::KfsClient::Read (this=0x925610, fd=5, buf=0x7ffd88000950 "\200", numBytes=16384) at ~/kfs/trunk/src/cc/libkfsClient/KfsClient.cc:318
#8 0x0000000000447705 in fuse_read (path=0x7ffd88004960 "/dtn_daily2.tsv", buf=0x7ffd88000950 "\200", nread=16384, off=0, finfo=0x41b2b0a0)
at ~/kfs/trunk/src/cc/fuse/kfs_fuse_main.cc:132
#9 0x00007ffd8e95e435 in ?? () from /lib/libfuse.so.2
#10 0x00007ffd8e962ea9 in ?? () from /lib/libfuse.so.2
#11 0x00007ffd8e96083f in ?? () from /lib/libfuse.so.2
#12 0x00007ffd8e4fc3ea in start_thread () from /lib/libpthread.so.0
#13 0x00007ffd8d69ec6d in clone () from /lib/libc.so.6
#14 0x0000000000000000 in ?? ()
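
The backtrace shows the read path (Read, IsChunkReadable, IsChunkLeaseGood) reaching ShouldRenewLease for chunk 574 even though no lease was ever granted for it, so the lookup in mLeases finds nothing and the assertion fires. One possible way to make that lookup tolerant of a missing entry is sketched below. This is an assumption-laden illustration, not the actual LeaseClerk code: the class `LeaseTable`, its methods, and the use of plain `long` and `std::time_t` are all invented for the example; only the member name `mLeases` and the `iter != mLeases.end()` check come from the assertion message.

```cpp
#include <cassert>
#include <ctime>
#include <map>

// Hypothetical stand-in for the client's lease bookkeeping.
class LeaseTable {
public:
    // Record a lease for chunkId that expires at the given time.
    void Register(long chunkId, std::time_t expires) {
        mLeases[chunkId] = expires;
    }

    // True only if a lease exists for chunkId and has not yet expired.
    bool IsLeaseValid(long chunkId, std::time_t now) const {
        std::map<long, std::time_t>::const_iterator iter = mLeases.find(chunkId);
        if (iter == mLeases.end()) {
            return false;  // no lease was ever granted (e.g. server said EBUSY)
        }
        return now < iter->second;
    }

    // True if a lease exists and will expire within renewWindow seconds.
    // A missing lease is reported as "nothing to renew" instead of asserting.
    bool ShouldRenewLease(long chunkId, std::time_t now,
                          std::time_t renewWindow) const {
        std::map<long, std::time_t>::const_iterator iter = mLeases.find(chunkId);
        if (iter == mLeases.end()) {
            return false;  // caller must acquire a lease first
        }
        return iter->second - now <= renewWindow;
    }

private:
    std::map<long, std::time_t> mLeases;  // chunk id -> lease expiry time
};
```

With this shape, a caller would check IsLeaseValid first and attempt a fresh acquisition (or return an error) when it is false, so a chunk whose lease request was rejected during recovery never reaches a code path that requires an existing entry.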

Discussion