Hello,

 

I'm using kosmos FS as a hypertable DFS backend. There's a problem which occurs after a 10 minutes of idle time in Hypertable->KFS connection. After the idle time Hypertable's KFS client is not able to reach KFS for 60 or 120 seconds. (exact times). After that time everything works normally.

 

Here's the logs from Hypertable part:

 

06-14-2009 05:09:09.402 INFO - (KfsWrite.cc:472) Write failed...chunk = 9, version =

5, offset = 9724, error = -1001

06-14-2009 05:09:09.402 INFO - (KfsWrite.cc:304) Daisy-chain: 127.0.0.1 30000 ; Will

retry allocation/write on chunk 9 due to error code: -1001

06-14-2009 05:10:09.473 INFO - (KfsWrite.cc:372) Forced allocation: chunk=9, version=6

06-14-2009 05:10:09.478 INFO - (KfsWrite.cc:472) Write failed...chunk = 7, version =

8, offset = 2514, error = -1001

06-14-2009 05:10:09.478 INFO - (KfsWrite.cc:304) Daisy-chain: 127.0.0.1 30000 ; Will

retry allocation/write on chunk 7 due to error code: -1001

06-14-2009 05:11:09.532 INFO - (KfsWrite.cc:372) Forced allocation: chunk=7, version=9

 

And related KFS chunk logs:

 

06-14-2009 05:09:09.394 DEBUG - (ClientSM.cc:301) Got command: write-id-alloc: seq =
955 chunkId = 9 chunkversion = 5 servers =
 
06-14-2009 05:09:09.394 INFO - (KfsOps.cc:1257) Write-id alloc: write-id-alloc: seq =
955 chunkId = 9 chunkversion = 5 servers =
06-14-2009 05:09:09.401 INFO - (KfsOps.cc:1343) Sending write-id alloc back (status =
0): write-id-alloc: seq = 955 chunkId = 9 chunkversion = 5 servers =
06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-id-alloc: seq = 955 chunkId = 9 chunkversion = 5 servers = : Response status: 0
 
06-14-2009 05:09:09.402 INFO - (KfsOps.cc:1392) Write prepare failed...as lease
expired for 9
06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-prepare: seq = 956 chunkId = 9 chunkversion = 5 offset: 9724 numBytes: 81:
Response status: -1001
 
06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:301) Got command: write-sync: seq = 957
chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 30000 191
 
06-14-2009 05:09:09.402 INFO - (ClientSM.cc:308) Received write sync from 127.0.0.1:
write-sync: seq = 957 chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 30000 191
06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:331) Write-sync is being pushed down; no
writes left... (0 ops left in q)
06-14-2009 05:09:09.402 INFO - (KfsOps.cc:1527) Executing write sync: write-sync: seq
= 957 chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 30000 191
06-14-2009 05:09:09.402 INFO - (KfsOps.cc:1574) Write sync failed...lease expired for 9
06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-sync: seq = 957 chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 30000
191: Response status: -1001
 
06-14-2009 05:09:09.402 INFO - (ClientSM.cc:95) Ack'ing write sync to 127.0.0.1:
write-sync: seq = 957 chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 30000 191
06-14-2009 05:09:19.404 INFO - (LeaseClerk.cc:250) sending lease renew for: chunk=9,
lease=12, lease valid=-167 secs
06-14-2009 05:09:19.405 DEBUG - (LeaseClerk.cc:89) Lease for chunk = 9 unregistered
06-14-2009 05:09:29.406 DEBUG - (ChunkManager.cc:1785) Dir: /KFS/chunk/bin/kfschunk1
has space 16835831292
06-14-2009 05:09:29.406 DEBUG - (ChunkManager.cc:1785) Dir: /KFS/chunk/bin/kfschunk2
has space 16835824245
06-14-2009 05:10:09.467 INFO - (ChunkManager.cc:444) Chunk
/KFS/chunk/bin/kfschunk1/58.9.5 already exists; changing version # to 6; Checksums
are loaded
06-14-2009 05:10:09.467 DEBUG - (LeaseClerk.cc:78) Registered lease: chunk=9, lease=15
06-14-2009 05:10:09.472 DEBUG - (NetConnection.cc:56) Read 0 bytes...connection dropped
06-14-2009 05:10:09.472 INFO - (ClientSM.cc:176) Closing connection from client 127.0.0.1
06-14-2009 05:10:09.473 DEBUG - (ClientSM.cc:301) Got command: size: chunkId = 9
chunkversion = 6
 
06-14-2009 05:10:09.473 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command size:
chunkId = 9 chunkversion = 6: Response status: 0
 
06-14-2009 05:10:09.473 DEBUG - (ClientSM.cc:301) Got command: write-id-alloc: seq =
960 chunkId = 9 chunkversion = 6 servers =
06-14-2009 05:10:09.474 INFO - (KfsOps.cc:1457) Writing to chunk=9, @offset=9724,
nbytes=81, checksum=3524664815
06-14-2009 05:10:09.474 DEBUG - (ChunkManager.cc:823) Write triggered a read for
offset = 9724
06-14-2009 05:10:09.474 DEBUG - (ClientSM.cc:301) Got command: write-sync: seq = 962
chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 192
 
06-14-2009 05:10:09.474 INFO - (ClientSM.cc:308) Received write sync from 127.0.0.1:
write-sync: seq = 962 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 192
06-14-2009 05:10:09.474 DEBUG - (ClientSM.cc:326) Keeping write-sync (962) pending
and depends on 961
06-14-2009 05:10:09.475 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-prepare: seq = 961 chunkId = 9 chunkversion = 6 offset: 9724 numBytes: 81:
Response status: 0
 
06-14-2009 05:10:09.475 DEBUG - (ClientSM.cc:363) Submitting write-sync (962) since
961 finished
06-14-2009 05:10:09.475 INFO - (KfsOps.cc:1527) Executing write sync: write-sync: seq
= 962 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 192
06-14-2009 05:10:09.475 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-sync: seq = 962 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000
192: Response status: 0
 
06-14-2009 05:10:09.475 INFO - (ClientSM.cc:95) Ack'ing write sync to 127.0.0.1:
write-sync: seq = 962 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 192
06-14-2009 05:10:09.477 DEBUG - (ClientSM.cc:301) Got command: write-id-alloc: seq =
963 chunkId = 7 chunkversion = 8 servers =
 
06-14-2009 05:10:09.477 INFO - (KfsOps.cc:1257) Write-id alloc: write-id-alloc: seq =
963 chunkId = 7 chunkversion = 8 servers =
06-14-2009 05:10:09.478 INFO - (KfsOps.cc:1343) Sending write-id alloc back (status =
0): write-id-alloc: seq = 963 chunkId = 7 chunkversion = 8 servers =
06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-id-alloc: seq = 963 chunkId = 7 chunkversion = 8 servers = : Response status: 0
 
06-14-2009 05:10:09.478 INFO - (KfsOps.cc:1392) Write prepare failed...as lease
expired for 7
06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-prepare: seq = 964 chunkId = 7 chunkversion = 8 offset: 2514 numBytes: 30:
Response status: -1001
 
06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:301) Got command: write-sync: seq = 965
chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 30000 193
 
06-14-2009 05:10:09.478 INFO - (ClientSM.cc:308) Received write sync from 127.0.0.1:
write-sync: seq = 965 chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 30000 193
06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:331) Write-sync is being pushed down; no
writes left... (0 ops left in q)
06-14-2009 05:10:09.478 INFO - (KfsOps.cc:1527) Executing write sync: write-sync: seq
= 965 chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 30000 193
06-14-2009 05:10:09.478 INFO - (KfsOps.cc:1574) Write sync failed...lease expired for 7
06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-sync: seq = 965 chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 30000
193: Response status: -1001
 
06-14-2009 05:10:09.478 INFO - (ClientSM.cc:95) Ack'ing write sync to 127.0.0.1:
write-sync: seq = 965 chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 30000 193
06-14-2009 05:10:19.472 INFO - (LeaseClerk.cc:250) sending lease renew for: chunk=7,
lease=14, lease valid=-13 secs
06-14-2009 05:10:19.473 DEBUG - (LeaseClerk.cc:89) Lease for chunk = 7 unregistered
06-14-2009 05:10:29.475 DEBUG - (ChunkManager.cc:1785) Dir: /KFS/chunk/bin/kfschunk1
has space 16835835469
06-14-2009 05:10:29.475 DEBUG - (ChunkManager.cc:1785) Dir: /KFS/chunk/bin/kfschunk2
has space 16835828341
06-14-2009 05:11:09.530 INFO - (ChunkManager.cc:444) Chunk
/KFS/chunk/bin/kfschunk2/43.7.8 already exists; changing version # to 9; Checksums
are loaded
06-14-2009 05:11:09.530 DEBUG - (LeaseClerk.cc:78) Registered lease: chunk=7, lease=16
06-14-2009 05:11:09.532 DEBUG - (NetConnection.cc:56) Read 0 bytes...connection dropped
06-14-2009 05:11:09.532 INFO - (ClientSM.cc:176) Closing connection from client 127.0.0.1
06-14-2009 05:11:09.532 DEBUG - (ClientSM.cc:301) Got command: size: chunkId = 7
chunkversion = 9
 
06-14-2009 05:11:09.532 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command size:
chunkId = 7 chunkversion = 9: Response status: 0
06-14-2009 05:11:09.533 DEBUG - (ClientSM.cc:301) Got command: write-id-alloc: seq =
968 chunkId = 7 chunkversion = 9 servers =
 
06-14-2009 05:11:09.533 INFO - (KfsOps.cc:1257) Write-id alloc: write-id-alloc: seq =
968 chunkId = 7 chunkversion = 9 servers =
06-14-2009 05:11:09.533 INFO - (KfsOps.cc:1343) Sending write-id alloc back (status =
0): write-id-alloc: seq = 968 chunkId = 7 chunkversion = 9 servers =
06-14-2009 05:11:09.533 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-id-alloc: seq = 968 chunkId = 7 chunkversion = 9 servers = : Response status: 0
 
06-14-2009 05:11:09.533 INFO - (KfsOps.cc:1457) Writing to chunk=7, @offset=2514,
nbytes=30, checksum=2165245449
06-14-2009 05:11:09.533 DEBUG - (ChunkManager.cc:823) Write triggered a read for
offset = 2514
06-14-2009 05:11:09.533 DEBUG - (ClientSM.cc:301) Got command: write-sync: seq = 970
chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 30000 194
 
06-14-2009 05:11:09.533 INFO - (ClientSM.cc:308) Received write sync from 127.0.0.1:
write-sync: seq = 970 chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 30000 194
06-14-2009 05:11:09.533 DEBUG - (ClientSM.cc:326) Keeping write-sync (970) pending
and depends on 969
06-14-2009 05:11:09.534 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-prepare: seq = 969 chunkId = 7 chunkversion = 9 offset: 2514 numBytes: 30:
Response status: 0
 
06-14-2009 05:11:09.534 DEBUG - (ClientSM.cc:363) Submitting write-sync (970) since
969 finished
06-14-2009 05:11:09.534 INFO - (KfsOps.cc:1527) Executing write sync: write-sync: seq
= 970 chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 30000 194
06-14-2009 05:11:09.534 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-sync: seq = 970 chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 30000
194: Response status: 0
 
06-14-2009 05:11:09.534 INFO - (ClientSM.cc:95) Ack'ing write sync to 127.0.0.1:
write-sync: seq = 970 chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 30000 194
06-14-2009 05:11:09.537 DEBUG - (ClientSM.cc:301) Got command: write-id-alloc: seq =
971 chunkId = 9 chunkversion = 6 servers =
 
06-14-2009 05:11:09.537 INFO - (KfsOps.cc:1257) Write-id alloc: write-id-alloc: seq =
971 chunkId = 9 chunkversion = 6 servers =
06-14-2009 05:11:09.537 INFO - (KfsOps.cc:1343) Sending write-id alloc back (status =
0): write-id-alloc: seq = 971 chunkId = 9 chunkversion = 6 servers =
06-14-2009 05:11:09.537 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-id-alloc: seq = 971 chunkId = 9 chunkversion = 6 servers = : Response status: 0
 
06-14-2009 05:11:09.537 INFO - (KfsOps.cc:1457) Writing to chunk=9, @offset=9805,
nbytes=80, checksum=1670256541
06-14-2009 05:11:09.537 DEBUG - (ChunkManager.cc:823) Write triggered a read for
offset = 9805
06-14-2009 05:11:09.548 DEBUG - (ClientSM.cc:301) Got command: write-sync: seq = 973
chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 195
 
06-14-2009 05:11:09.548 INFO - (ClientSM.cc:308) Received write sync from 127.0.0.1:
write-sync: seq = 973 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 195
06-14-2009 05:11:09.548 DEBUG - (ClientSM.cc:326) Keeping write-sync (973) pending
and depends on 972
06-14-2009 05:11:09.551 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-prepare: seq = 972 chunkId = 9 chunkversion = 6 offset: 9805 numBytes: 80:
Response status: 0
 
06-14-2009 05:11:09.551 DEBUG - (ClientSM.cc:363) Submitting write-sync (973) since
972 finished
06-14-2009 05:11:09.551 INFO - (KfsOps.cc:1527) Executing write sync: write-sync: seq
= 973 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 195
06-14-2009 05:11:09.551 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command
write-sync: seq = 973 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000
195: Response status: 0
 
06-14-2009 05:11:09.551 INFO - (ClientSM.cc:95) Ack'ing write sync to 127.0.0.1:
write-sync: seq = 973 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 195

 

(sorry for language mistakes.)

 

Ufuk.



__________ Information from ESET NOD32 Antivirus, version of virus signature database 4168 (20090618) __________

The message was checked by ESET NOD32 Antivirus.

http://www.eset.com