Re: [Kosmosfs-devel] [Kosmosfs-users] Hypertable to KFS connection problem after 10 minutes of idle
Status: Alpha
Brought to you by:
sriramsrao
From: Sriram R. <sri...@gm...> - 2009-06-18 21:08:03
|
HI Mateusz, Looking thru the debug messages one more time, is the sequence: <write X bytes to KFS> <sleep 11 mins say> < write 1 byte> : KFS chunkserver responds with lease expired (-1001) < client sleeps for ~60 secs> < retries the write>; retry works? Sriram On Thu, Jun 18, 2009 at 1:18 PM, Mateusz Berezecki<mat...@gm...> wrote: > Hi Sriram, > > I've tried 360 and it did not help. > > Mateusz > > On Thu, Jun 18, 2009 at 10:04 PM, Sriram Rao<sri...@gm...> wrote: >> Can you try KFS trunk? There was a fix that went in for revision 359 >> to address this sort of problem. >> >> Sriram >> >> On Thu, Jun 18, 2009 at 11:44 AM, Ufuk Altinok<wwa...@ya...> wrote: >>> Hello, >>> >>> >>> >>> I'm using kosmos FS as a hypertable DFS backend. There's a problem which >>> occurs after a 10 minutes of idle time in Hypertable->KFS connection. After >>> the idle time Hypertable's KFS client is not able to reach KFS for 60 or 120 >>> seconds. (exact times). After that time everything works normally. >>> >>> >>> >>> Here's the logs from Hypertable part: >>> >>> >>> >>> 06-14-2009 05:09:09.402 INFO - (KfsWrite.cc:472) Write failed...chunk = 9, >>> version = >>> >>> 5, offset = 9724, error = -1001 >>> >>> 06-14-2009 05:09:09.402 INFO - (KfsWrite.cc:304) Daisy-chain: 127.0.0.1 >>> 30000 ; Will >>> >>> retry allocation/write on chunk 9 due to error code: -1001 >>> >>> 06-14-2009 05:10:09.473 INFO - (KfsWrite.cc:372) Forced allocation: chunk=9, >>> version=6 >>> >>> 06-14-2009 05:10:09.478 INFO - (KfsWrite.cc:472) Write failed...chunk = 7, >>> version = >>> >>> 8, offset = 2514, error = -1001 >>> >>> 06-14-2009 05:10:09.478 INFO - (KfsWrite.cc:304) Daisy-chain: 127.0.0.1 >>> 30000 ; Will >>> >>> retry allocation/write on chunk 7 due to error code: -1001 >>> >>> 06-14-2009 05:11:09.532 INFO - (KfsWrite.cc:372) Forced allocation: chunk=7, >>> version=9 >>> >>> >>> >>> And related KFS chunk logs: >>> >>> >>> >>> 06-14-2009 05:09:09.394 DEBUG - (ClientSM.cc:301) Got command: >>> write-id-alloc: seq = >>> >>> 955 chunkId = 9 chunkversion = 5 servers = >>> >>> >>> >>> 06-14-2009 05:09:09.394 INFO - (KfsOps.cc:1257) Write-id alloc: >>> write-id-alloc: seq = >>> >>> 955 chunkId = 9 chunkversion = 5 servers = >>> >>> 06-14-2009 05:09:09.401 INFO - (KfsOps.cc:1343) Sending write-id alloc back >>> (status = >>> >>> 0): write-id-alloc: seq = 955 chunkId = 9 chunkversion = 5 servers = >>> >>> 06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-id-alloc: seq = 955 chunkId = 9 chunkversion = 5 servers = : Response >>> status: 0 >>> >>> >>> >>> 06-14-2009 05:09:09.402 INFO - (KfsOps.cc:1392) Write prepare failed...as >>> lease >>> >>> expired for 9 >>> >>> 06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-prepare: seq = 956 chunkId = 9 chunkversion = 5 offset: 9724 numBytes: >>> 81: >>> >>> Response status: -1001 >>> >>> >>> >>> 06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:301) Got command: write-sync: >>> seq = 957 >>> >>> chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 30000 191 >>> >>> >>> >>> 06-14-2009 05:09:09.402 INFO - (ClientSM.cc:308) Received write sync from >>> 127.0.0.1: >>> >>> write-sync: seq = 957 chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 >>> 30000 191 >>> >>> 06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:331) Write-sync is being pushed >>> down; no >>> >>> writes left... (0 ops left in q) >>> >>> 06-14-2009 05:09:09.402 INFO - (KfsOps.cc:1527) Executing write sync: >>> write-sync: seq >>> >>> = 957 chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 30000 191 >>> >>> 06-14-2009 05:09:09.402 INFO - (KfsOps.cc:1574) Write sync failed...lease >>> expired for 9 >>> >>> 06-14-2009 05:09:09.402 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-sync: seq = 957 chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 >>> 30000 >>> >>> 191: Response status: -1001 >>> >>> >>> >>> 06-14-2009 05:09:09.402 INFO - (ClientSM.cc:95) Ack'ing write sync to >>> 127.0.0.1: >>> >>> write-sync: seq = 957 chunkId = 9 chunkversion = 5 write-id info: 127.0.0.1 >>> 30000 191 >>> >>> 06-14-2009 05:09:19.404 INFO - (LeaseClerk.cc:250) sending lease renew for: >>> chunk=9, >>> >>> lease=12, lease valid=-167 secs >>> >>> 06-14-2009 05:09:19.405 DEBUG - (LeaseClerk.cc:89) Lease for chunk = 9 >>> unregistered >>> >>> 06-14-2009 05:09:29.406 DEBUG - (ChunkManager.cc:1785) Dir: >>> /KFS/chunk/bin/kfschunk1 >>> >>> has space 16835831292 >>> >>> 06-14-2009 05:09:29.406 DEBUG - (ChunkManager.cc:1785) Dir: >>> /KFS/chunk/bin/kfschunk2 >>> >>> has space 16835824245 >>> >>> 06-14-2009 05:10:09.467 INFO - (ChunkManager.cc:444) Chunk >>> >>> /KFS/chunk/bin/kfschunk1/58.9.5 already exists; changing version # to 6; >>> Checksums >>> >>> are loaded >>> >>> 06-14-2009 05:10:09.467 DEBUG - (LeaseClerk.cc:78) Registered lease: >>> chunk=9, lease=15 >>> >>> 06-14-2009 05:10:09.472 DEBUG - (NetConnection.cc:56) Read 0 >>> bytes...connection dropped >>> >>> 06-14-2009 05:10:09.472 INFO - (ClientSM.cc:176) Closing connection from >>> client 127.0.0.1 >>> >>> 06-14-2009 05:10:09.473 DEBUG - (ClientSM.cc:301) Got command: size: chunkId >>> = 9 >>> >>> chunkversion = 6 >>> >>> >>> >>> 06-14-2009 05:10:09.473 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> size: >>> >>> chunkId = 9 chunkversion = 6: Response status: 0 >>> >>> >>> >>> 06-14-2009 05:10:09.473 DEBUG - (ClientSM.cc:301) Got command: >>> write-id-alloc: seq = >>> >>> 960 chunkId = 9 chunkversion = 6 servers = >>> >>> 06-14-2009 05:10:09.474 INFO - (KfsOps.cc:1457) Writing to chunk=9, >>> @offset=9724, >>> >>> nbytes=81, checksum=3524664815 >>> >>> 06-14-2009 05:10:09.474 DEBUG - (ChunkManager.cc:823) Write triggered a read >>> for >>> >>> offset = 9724 >>> >>> 06-14-2009 05:10:09.474 DEBUG - (ClientSM.cc:301) Got command: write-sync: >>> seq = 962 >>> >>> chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 192 >>> >>> >>> >>> 06-14-2009 05:10:09.474 INFO - (ClientSM.cc:308) Received write sync from >>> 127.0.0.1: >>> >>> write-sync: seq = 962 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 >>> 30000 192 >>> >>> 06-14-2009 05:10:09.474 DEBUG - (ClientSM.cc:326) Keeping write-sync (962) >>> pending >>> >>> and depends on 961 >>> >>> 06-14-2009 05:10:09.475 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-prepare: seq = 961 chunkId = 9 chunkversion = 6 offset: 9724 numBytes: >>> 81: >>> >>> Response status: 0 >>> >>> >>> >>> 06-14-2009 05:10:09.475 DEBUG - (ClientSM.cc:363) Submitting write-sync >>> (962) since >>> >>> 961 finished >>> >>> 06-14-2009 05:10:09.475 INFO - (KfsOps.cc:1527) Executing write sync: >>> write-sync: seq >>> >>> = 962 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 192 >>> >>> 06-14-2009 05:10:09.475 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-sync: seq = 962 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 >>> 30000 >>> >>> 192: Response status: 0 >>> >>> >>> >>> 06-14-2009 05:10:09.475 INFO - (ClientSM.cc:95) Ack'ing write sync to >>> 127.0.0.1: >>> >>> write-sync: seq = 962 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 >>> 30000 192 >>> >>> 06-14-2009 05:10:09.477 DEBUG - (ClientSM.cc:301) Got command: >>> write-id-alloc: seq = >>> >>> 963 chunkId = 7 chunkversion = 8 servers = >>> >>> >>> >>> 06-14-2009 05:10:09.477 INFO - (KfsOps.cc:1257) Write-id alloc: >>> write-id-alloc: seq = >>> >>> 963 chunkId = 7 chunkversion = 8 servers = >>> >>> 06-14-2009 05:10:09.478 INFO - (KfsOps.cc:1343) Sending write-id alloc back >>> (status = >>> >>> 0): write-id-alloc: seq = 963 chunkId = 7 chunkversion = 8 servers = >>> >>> 06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-id-alloc: seq = 963 chunkId = 7 chunkversion = 8 servers = : Response >>> status: 0 >>> >>> >>> >>> 06-14-2009 05:10:09.478 INFO - (KfsOps.cc:1392) Write prepare failed...as >>> lease >>> >>> expired for 7 >>> >>> 06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-prepare: seq = 964 chunkId = 7 chunkversion = 8 offset: 2514 numBytes: >>> 30: >>> >>> Response status: -1001 >>> >>> >>> >>> 06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:301) Got command: write-sync: >>> seq = 965 >>> >>> chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 30000 193 >>> >>> >>> >>> 06-14-2009 05:10:09.478 INFO - (ClientSM.cc:308) Received write sync from >>> 127.0.0.1: >>> >>> write-sync: seq = 965 chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 >>> 30000 193 >>> >>> 06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:331) Write-sync is being pushed >>> down; no >>> >>> writes left... (0 ops left in q) >>> >>> 06-14-2009 05:10:09.478 INFO - (KfsOps.cc:1527) Executing write sync: >>> write-sync: seq >>> >>> = 965 chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 30000 193 >>> >>> 06-14-2009 05:10:09.478 INFO - (KfsOps.cc:1574) Write sync failed...lease >>> expired for 7 >>> >>> 06-14-2009 05:10:09.478 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-sync: seq = 965 chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 >>> 30000 >>> >>> 193: Response status: -1001 >>> >>> >>> >>> 06-14-2009 05:10:09.478 INFO - (ClientSM.cc:95) Ack'ing write sync to >>> 127.0.0.1: >>> >>> write-sync: seq = 965 chunkId = 7 chunkversion = 8 write-id info: 127.0.0.1 >>> 30000 193 >>> >>> 06-14-2009 05:10:19.472 INFO - (LeaseClerk.cc:250) sending lease renew for: >>> chunk=7, >>> >>> lease=14, lease valid=-13 secs >>> >>> 06-14-2009 05:10:19.473 DEBUG - (LeaseClerk.cc:89) Lease for chunk = 7 >>> unregistered >>> >>> 06-14-2009 05:10:29.475 DEBUG - (ChunkManager.cc:1785) Dir: >>> /KFS/chunk/bin/kfschunk1 >>> >>> has space 16835835469 >>> >>> 06-14-2009 05:10:29.475 DEBUG - (ChunkManager.cc:1785) Dir: >>> /KFS/chunk/bin/kfschunk2 >>> >>> has space 16835828341 >>> >>> 06-14-2009 05:11:09.530 INFO - (ChunkManager.cc:444) Chunk >>> >>> /KFS/chunk/bin/kfschunk2/43.7.8 already exists; changing version # to 9; >>> Checksums >>> >>> are loaded >>> >>> 06-14-2009 05:11:09.530 DEBUG - (LeaseClerk.cc:78) Registered lease: >>> chunk=7, lease=16 >>> >>> 06-14-2009 05:11:09.532 DEBUG - (NetConnection.cc:56) Read 0 >>> bytes...connection dropped >>> >>> 06-14-2009 05:11:09.532 INFO - (ClientSM.cc:176) Closing connection from >>> client 127.0.0.1 >>> >>> 06-14-2009 05:11:09.532 DEBUG - (ClientSM.cc:301) Got command: size: chunkId >>> = 7 >>> >>> chunkversion = 9 >>> >>> >>> >>> 06-14-2009 05:11:09.532 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> size: >>> >>> chunkId = 7 chunkversion = 9: Response status: 0 >>> >>> 06-14-2009 05:11:09.533 DEBUG - (ClientSM.cc:301) Got command: >>> write-id-alloc: seq = >>> >>> 968 chunkId = 7 chunkversion = 9 servers = >>> >>> >>> >>> 06-14-2009 05:11:09.533 INFO - (KfsOps.cc:1257) Write-id alloc: >>> write-id-alloc: seq = >>> >>> 968 chunkId = 7 chunkversion = 9 servers = >>> >>> 06-14-2009 05:11:09.533 INFO - (KfsOps.cc:1343) Sending write-id alloc back >>> (status = >>> >>> 0): write-id-alloc: seq = 968 chunkId = 7 chunkversion = 9 servers = >>> >>> 06-14-2009 05:11:09.533 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-id-alloc: seq = 968 chunkId = 7 chunkversion = 9 servers = : Response >>> status: 0 >>> >>> >>> >>> 06-14-2009 05:11:09.533 INFO - (KfsOps.cc:1457) Writing to chunk=7, >>> @offset=2514, >>> >>> nbytes=30, checksum=2165245449 >>> >>> 06-14-2009 05:11:09.533 DEBUG - (ChunkManager.cc:823) Write triggered a read >>> for >>> >>> offset = 2514 >>> >>> 06-14-2009 05:11:09.533 DEBUG - (ClientSM.cc:301) Got command: write-sync: >>> seq = 970 >>> >>> chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 30000 194 >>> >>> >>> >>> 06-14-2009 05:11:09.533 INFO - (ClientSM.cc:308) Received write sync from >>> 127.0.0.1: >>> >>> write-sync: seq = 970 chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 >>> 30000 194 >>> >>> 06-14-2009 05:11:09.533 DEBUG - (ClientSM.cc:326) Keeping write-sync (970) >>> pending >>> >>> and depends on 969 >>> >>> 06-14-2009 05:11:09.534 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-prepare: seq = 969 chunkId = 7 chunkversion = 9 offset: 2514 numBytes: >>> 30: >>> >>> Response status: 0 >>> >>> >>> >>> 06-14-2009 05:11:09.534 DEBUG - (ClientSM.cc:363) Submitting write-sync >>> (970) since >>> >>> 969 finished >>> >>> 06-14-2009 05:11:09.534 INFO - (KfsOps.cc:1527) Executing write sync: >>> write-sync: seq >>> >>> = 970 chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 30000 194 >>> >>> 06-14-2009 05:11:09.534 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-sync: seq = 970 chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 >>> 30000 >>> >>> 194: Response status: 0 >>> >>> >>> >>> 06-14-2009 05:11:09.534 INFO - (ClientSM.cc:95) Ack'ing write sync to >>> 127.0.0.1: >>> >>> write-sync: seq = 970 chunkId = 7 chunkversion = 9 write-id info: 127.0.0.1 >>> 30000 194 >>> >>> 06-14-2009 05:11:09.537 DEBUG - (ClientSM.cc:301) Got command: >>> write-id-alloc: seq = >>> >>> 971 chunkId = 9 chunkversion = 6 servers = >>> >>> >>> >>> 06-14-2009 05:11:09.537 INFO - (KfsOps.cc:1257) Write-id alloc: >>> write-id-alloc: seq = >>> >>> 971 chunkId = 9 chunkversion = 6 servers = >>> >>> 06-14-2009 05:11:09.537 INFO - (KfsOps.cc:1343) Sending write-id alloc back >>> (status = >>> >>> 0): write-id-alloc: seq = 971 chunkId = 9 chunkversion = 6 servers = >>> >>> 06-14-2009 05:11:09.537 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-id-alloc: seq = 971 chunkId = 9 chunkversion = 6 servers = : Response >>> status: 0 >>> >>> >>> >>> 06-14-2009 05:11:09.537 INFO - (KfsOps.cc:1457) Writing to chunk=9, >>> @offset=9805, >>> >>> nbytes=80, checksum=1670256541 >>> >>> 06-14-2009 05:11:09.537 DEBUG - (ChunkManager.cc:823) Write triggered a read >>> for >>> >>> offset = 9805 >>> >>> 06-14-2009 05:11:09.548 DEBUG - (ClientSM.cc:301) Got command: write-sync: >>> seq = 973 >>> >>> chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 195 >>> >>> >>> >>> 06-14-2009 05:11:09.548 INFO - (ClientSM.cc:308) Received write sync from >>> 127.0.0.1: >>> >>> write-sync: seq = 973 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 >>> 30000 195 >>> >>> 06-14-2009 05:11:09.548 DEBUG - (ClientSM.cc:326) Keeping write-sync (973) >>> pending >>> >>> and depends on 972 >>> >>> 06-14-2009 05:11:09.551 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-prepare: seq = 972 chunkId = 9 chunkversion = 6 offset: 9805 numBytes: >>> 80: >>> >>> Response status: 0 >>> >>> >>> >>> 06-14-2009 05:11:09.551 DEBUG - (ClientSM.cc:363) Submitting write-sync >>> (973) since >>> >>> 972 finished >>> >>> 06-14-2009 05:11:09.551 INFO - (KfsOps.cc:1527) Executing write sync: >>> write-sync: seq >>> >>> = 973 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 30000 195 >>> >>> 06-14-2009 05:11:09.551 DEBUG - (ClientSM.cc:87) Client 127.0.0.1, Command >>> >>> write-sync: seq = 973 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 >>> 30000 >>> >>> 195: Response status: 0 >>> >>> >>> >>> 06-14-2009 05:11:09.551 INFO - (ClientSM.cc:95) Ack'ing write sync to >>> 127.0.0.1: >>> >>> write-sync: seq = 973 chunkId = 9 chunkversion = 6 write-id info: 127.0.0.1 >>> 30000 195 >>> >>> >>> >>> (sorry for language mistakes.) >>> >>> >>> >>> Ufuk. >>> >>> __________ Information from ESET NOD32 Antivirus, version of virus signature >>> database 4168 (20090618) __________ >>> >>> The message was checked by ESET NOD32 Antivirus. >>> >>> http://www.eset.com >>> >>> ------------------------------------------------------------------------------ >>> Crystal Reports - New Free Runtime and 30 Day Trial >>> Check out the new simplified licensing option that enables unlimited >>> royalty-free distribution of the report engine for externally facing >>> server and web deployment. >>> http://p.sf.net/sfu/businessobjects >>> _______________________________________________ >>> Kosmosfs-users mailing list >>> Kos...@li... >>> https://lists.sourceforge.net/lists/listinfo/kosmosfs-users >>> >>> >> >> ------------------------------------------------------------------------------ >> Crystal Reports - New Free Runtime and 30 Day Trial >> Check out the new simplified licensing option that enables unlimited >> royalty-free distribution of the report engine for externally facing >> server and web deployment. >> http://p.sf.net/sfu/businessobjects >> _______________________________________________ >> Kosmosfs-users mailing list >> Kos...@li... >> https://lists.sourceforge.net/lists/listinfo/kosmosfs-users >> > |