From: Yaroslav H. <yo...@ps...> - 2004-02-27 02:18:09
It is a pity, but the NMI watchdog didn't work on my machine...

Following the directions of Dmitry Katsubo <dma_k@ma...> I did the
following research (just a sketch for now - details below):

- compiled a vanilla 2.4.24 kernel with the om patch from
  http://mcaserta.com/openmosix/
  The log of that is at
  http://www.onerussian.com/Linux/bugs/2.4.24-om.bug/protocol.2.4.24-om-compilation
  (this is a log from the screen program with an ANSI make menuconfig
  run, so there is a nasty part in this log as well...)

- rebooted

- succeeded in compiling the same 2.4.24 kernel (not patched with om)
  with the -j1 make switch. I did that a couple of times to try to
  reproduce the problem, if it exists:
  http://www.onerussian.com/Linux/bugs/2.4.24-om.bug/kernel-compile.good-om-j1

- failed to compile it with make -j4, which was specified for make-kpkg
  via CONCURRENCY_LEVEL := 4

    semop(1): encountered an error: Invalid argument

  The full log is at
  http://www.onerussian.com/Linux/bugs/2.4.24-om.bug/kernel-compile.bad-om-j4

- rebooted into 2.4.24 with no mosix support and succeeded in compiling
  the same kernel with -j4:
  http://www.onerussian.com/Linux/bugs/2.4.24-om.bug/kernel-compile.good.pure2.24-j4

These logs, as well as dmesg after booting the openmosix kernel with
nmi_watchdog=1, are available from
http://www.onerussian.com/Linux/bugs/2.4.24-om.bug/

Please tell me what other information I can provide to help locate the
problem.

Sincerely
Yarik

On Wed, Feb 25, 2004 at 11:37:28PM +0100, Alexander Nyberg wrote:
> On Thu, 2004-02-19 at 03:45, Yaroslav Halchenko wrote:
> > Dear developers,
> > I have a couple of dual CPU opteron systems on which I've tried to
> > experiment with recent om patches (I was running 2.4.21 for a while).
> > I've decided to try 2.4.24 version of mosix (2.4.25-rc2 patch didn't work -
> > nothing migrated - so I decided to try 2.4.24).
> > At first sight
> > everything was fine. So I left 4 dummy tasks (yes >/dev/null) to run
> > overnight ... it crashed - hung with no signs of life and no messages
> > in the logs.
> > I decided to try my luck once more and got the system frozen in the middle
> > of kernel compilation with the make error message:
> > semop(1): encountered an error: Invalid argument
> > So I went back and got the recent patch for 2.4.25 -- again - no migration
> > at all. And I couldn't force it to migrate with mosrun because I have
> > quite old user space utilities which are packaged for debian.
> > Any ideas what can be wrong with the 2.4.25 version? Should I compile
> > userspace utilities to provide you with a more meaningful bug report?
> > Thank you in advance for any hints
> Regarding the .24 freeze, I think the most helpful thing you could do is
> compile in either sysrq support or if that doesn't work, NMI watchdog.
> Both sysrq and NMI watchdog are very simple to set up, please check
> Documentation/ for info on them. They will most likely provide a
> backtrace to what happened at the freeze.
> Without a trace of what's going on it's nearly impossible to fix it.
> Alex

-- 
Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Office (973) 353-5440 x263 FWD: 82823
Student Ph.D. @ CS Dept. NJIT, Master @ CS Dept. UNM
lynx -source http://www.onerussian.com/gpg-yoh.asc | gpg --import
GPG fingerprint 3BB6 E124 0643 A615 6F00 6854 8D11 4563 75C0 24C8