From: Steve W. <st...@pu...> - 2011-03-03 13:24:59
Hi Michal,

Thanks for the help! We found that the problematic file was a .history file used by tcsh. The user was running a large number of scripts which spawned shells to run jobs, with each one attempting to read/write the same .history file. He has now modified his startup script to only access the .history file for interactive shells, and this has eliminated almost all of these messages in the logs.

Steve

On 03/02/2011 06:07 AM, Michal Borychowski wrote:
> Hi Steve!
>
> Regarding your first question this means "CHUNK_LOCKED" - it may happen when
> several clients try to write to the same file. The message doesn't mean
> anything bad but it is better to avoid such situations.
>
> Regarding your second question please have a look here:
> http://www.moosefs.org/moosefs-faq.html#error-messages and the two following
> answers.
>
>
> Best regards
> Michal
>
>
> -----Original Message-----
> From: Steve Wilson [mailto:st...@pu...]
> Sent: Tuesday, March 01, 2011 3:07 PM
> To: moo...@li...
> Subject: Re: [Moosefs-users] fs_writechunk returns status 11
>
> On 03/01/2011 08:42 AM, Steve Wilson wrote:
>> Hi,
>>
>> For a couple peak periods of time this morning, my logs were full of
>> messages like the following:
>> Mar 1 07:49:49 stanley mfsmount[1253]: file: 278364, index: 0 - fs_writechunk returns status 11
>> Mar 1 07:49:49 stanley mfsmount[1253]: file: 278364, index: 1 - fs_writechunk returns status 11
>>
>> They all reference the same file: 278364. Does anyone know what this
>> means? And, if so, what I should do about it?
>>
>> Thanks!
>>
>> Steve
>>
> A further look showed up a handful of other files but 99% of the
> messages refer to that one file.
>
> Additionally, I saw several messages like the following from last
> night's logs:
> Mar 1 00:35:03 stanley mfsmount[1253]: file: 161779, index: 0, chunk: 310854, version: 1 - writeworker: connection with (80D2302F:9422) was timed out (unfinished writes: 5; try counter: 1)
> Mar 1 00:35:05 stanley mfsmount[1253]: file: 161779, index: 0, chunk: 310854, version: 1 - writeworker: connection with (80D2302F:9422) was timed out (unfinished writes: 5; try counter: 1)
>
> Thanks,
> Steve
>

--
Steven M. Wilson, Systems and Network Manager
Markey Center for Structural Biology
Purdue University
(765) 496-1946
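
For anyone hitting the same symptom, the step of finding which file accounts for most of the "fs_writechunk returns status 11" messages could be done with a small tally script. This is only a rough sketch: the regular expression is modeled on the two sample log lines quoted in this thread (other mfsmount versions may word the message differently), and the function name count_writechunk_errors is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the sample lines quoted in this thread (an assumption;
# other mfsmount versions may format the message differently).
WRITECHUNK_RE = re.compile(
    r"file: (\d+), index: \d+ - fs_writechunk returns status (\d+)"
)


def count_writechunk_errors(lines, status=11):
    """Count matching messages per file number (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = WRITECHUNK_RE.search(line)
        if match and int(match.group(2)) == status:
            counts[int(match.group(1))] += 1
    return counts


# The two log lines from the original report, each on a single line.
sample = [
    "Mar 1 07:49:49 stanley mfsmount[1253]: file: 278364, index: 0 - "
    "fs_writechunk returns status 11",
    "Mar 1 07:49:49 stanley mfsmount[1253]: file: 278364, index: 1 - "
    "fs_writechunk returns status 11",
]

# Print the file numbers with the most status-11 messages first.
for file_number, hits in count_writechunk_errors(sample).most_common(5):
    print(file_number, hits)
```

In practice the list of lines would come from the syslog file that mfsmount writes to. A file number that dominates the tally, as 278364 did here, is the one worth tracking down.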