[SSI-devel] Re: [SSI-users] hangs on write()
From: Roger T. <rog...@gm...> - 2005-09-17 18:38:23
Alright I can reproduce this by doing the large file copy. It gets stuck
here...

Stack traceback for pid 136777
0xc57f3a80 136777 136733 0 0 D 0xc57f3c40 mc
EBP        EIP        Function (args)
0xd8e01cc0 0xc03b4103 schedule+0x2b3
0xd8e01cc8 0xc03b462e io_schedule+0xe (0xc15001b0)
0xd8e01cd4 0xc0136745 sync_page+0x35 (0xc1251ea8, 0x0, 0xc0136710, 0xc57f3a80, 0xd8e01d24)
0xd8e01cf4 0xc03b49a9 __wait_on_bit_lock+0x49 (0x2, 0xc1251ea8, 0xc1251ea8, 0x0, 0x0)
0xd8e01d50 0xc0136f5a __lock_page+0x8a (0xd666c8a0, 0x1d838, 0xda0e96a0, 0x1d838, 0x2)
0xd8e01de8 0xc013764b do_generic_mapping_read+0x3db (0xd666c8a0, 0xda0e96e8, 0xda0e96a0, 0xd8e01f14, 0xd8e01e1c)
0xd8e01e38 0xc0137b24 __generic_file_aio_read+0x194 (0xd8e01ed8, 0xd8e01e50, 0x1, 0xd8e01f14, 0x8135998)
0xd8e01e64 0xc0137be2 generic_file_aio_read+0x52 (0xd8e01ed8, 0x8135998, 0x2000, 0x1d838000, 0x0)
0xd8e01ea0 0xc0268f40 __cfs_file_read+0xc0 (0xd8e01ed8, 0x0, 0x8135998, 0x2000, 0xd8e01ed0)
0xd8e01ebc 0xc0268ffe cfs_file_aio_read+0x2e (0xd8e01ed8, 0x8135998, 0x2000, 0x1d838000, 0x0)
0xd8e01f64 0xc015561b do_sync_read+0xab (0xda0e96a0, 0x8135998, 0x2000, 0xd8e01fa8, 0x0)
0xd8e01f90 0xc0155758 vfs_read+0xe8 (0xda0e96a0, 0x8135998, 0x2000, 0xd8e01fa8, 0x1d838000)
0xd8e01fbc 0xc0155a1b sys_read+0x4b
           0xc0103c55 sysenter_past_esp+0x52

On 9/17/05, Roger Tsang <rog...@gm...> wrote:
>
> Okay I ran into this hang just a moment ago while copying a very large
> file from node 2 to the init node. It hangs at the very end of the file.
> Then if I do "sync" as you have suggested, the copy completes. I guess next
> time I see this I'll do a backtrace on the copy process. My guess is it's
> probably waiting in CFS wait_for_congestion().
>
> Have you tried a different IO scheduler? Try deadline if you were using
> cfq.
>
> Roger
>
>
> On 8/25/05, John Byrne <joh...@hp...> wrote:
> >
> > Andy Phillips wrote:
> > > Following on;
> > >
> > > It appears that if I remount the file system with the
> > > "sync" option then this problem goes away. But performance
> > > is bad. Shutting down the other node in the cluster does
> > > not seem to affect this at all.
> > >
> > > Would SSI or the CFS cause issues with async i/o? Would
> > > that follow a different path to a normal kernel?
> > >
> > > Andy
> > >
> >
> > There can certainly be bugs and I do note that your hanging is rather
> > large. Maybe that is the cause of the problem. Maybe you could make a
> > simple test case with 256k writes and see if that hangs.
> >
> > John
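
For anyone who wants to try the "simple test case with 256k writes" John mentions, something along these lines might do. This is only a rough sketch and not code from anyone in the thread: the function name write_in_chunks, the fill byte, and the choice of Python are all assumptions made for illustration. It deliberately avoids fsync() and O_SYNC so that the writes go through the ordinary asynchronous path, which is the case reported to hang.

```python
import os

CHUNK_SIZE = 256 * 1024  # 256k per write(), the size suggested in the thread


def write_in_chunks(path, total_bytes, chunk_size=CHUNK_SIZE):
    """Write total_bytes to path using write() calls of at most chunk_size.

    Hypothetical helper for illustration. Returns the number of bytes
    written. No fsync() is issued and the file is not opened with O_SYNC,
    so the data takes the normal buffered (async) write path.
    """
    buf = b"\xa5" * chunk_size  # arbitrary fill pattern (assumption)
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            want = min(chunk_size, total_bytes - written)
            # os.write() may write fewer bytes than asked; count what it reports.
            written += os.write(fd, buf[:want])
    finally:
        os.close(fd)
    return written
```

Calling write_in_chunks() with a path on the CFS-mounted file system and a size comparable to the file that hung (both to be chosen by the tester) should show whether plain 256k writes are enough to trigger the problem, or whether the large copy is needed.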