|
From: Thomas M. <th...@m3...> - 2013-11-05 20:41:05
|
Hi,

I'm running Fedora 20 inside a 3.12 UML kernel and the "yum upgrade -y" command seems to get stuck after a while/few minutes.

Any ideas what's going on here? How to debug this?

It looks like the process running yum is in the ptrace-stopped state, but doesn't continue. The process tree also looks strange:

20330 pts/3 S+ 1:18 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20337 pts/3 S+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20338 pts/3 S+ 0:03 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20339 pts/3 S+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20347 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20405 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20469 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20615 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #1 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipeiW6d5k
20625 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipeiW6d5k
20626 ? Zs 0:00 | \_ [linux] <defunct>
20630 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20642 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20650 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20651 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20663 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20681 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20684 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20690 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20691 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20699 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20709 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20722 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20754 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #2 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipetxRIbS
20757 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipetxRIbS
20755 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #6 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipedhXmGp
20762 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipedhXmGp
20758 ? Zs 0:00 | \_ [linux] <defunct>
20760 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20763 ? Zs 0:00 | \_ [linux] <defunct>
20797 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #3 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipeULItXd
20812 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipeULItXd
20813 ? Zs 0:00 | \_ [linux] <defunct>
20815 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #5 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipeaKUbD3
20876 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipeaKUbD3
20877 ? Zs 0:00 | \_ [linux] <defunct>
20896 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
20909 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
21005 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
21007 pts/3 Z+ 0:00 | \_ [uml_net] <defunct>
21019 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
21112 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
21125 pts/3 Z+ 0:00 | \_ [uml_net] <defunct>
22164 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
22211 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
22224 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
22380 pts/3 t+ 0:51 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
21965 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
21968 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
21983 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
22053 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
22058 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
22887 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20

with kind regards
thomas
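For reference, one quick way to check from the host what "ptrace stopped" means for a given child is to look at /proc; a sketch, using 20347 as an example PID taken from the listing above:

  host$ grep -E 'State|TracerPid' /proc/20347/status   # "t (tracing stop)" = stopped under ptrace, TracerPid = who traces it
  host$ ps -o pid,stat,wchan= -p 20347                 # wchan hints at where the task is waiting, if at all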
|
From: Richard W. <ric...@gm...> - 2013-11-06 12:40:57
|
On Tue, Nov 5, 2013 at 9:21 PM, Thomas Meyer <th...@m3...> wrote:
> Hi,
>
> I'm running Fedora 20 inside a 3.12 UML kernel and the "yum upgrade -y"
> command seems to get stuck after a while/few minutes.
>
> Any ideas what's going on here? How to debug this?
>
> It looks like the process running yum is in state ptrace stopped, but
> doesn't continue.

Got only yum stuck, or the whole UML kernel?
Does yum do a ptrace() within UML, or did you observe that from the outside?

> The process tree looks also strange:
>
> [process listing snipped]

Do the tasks remain in state Z, or are they flipping around?
Maybe the UML userspace creates many threads and on the host side UML didn't call wait() yet...

--
Thanks,
//richard
|
From: Thomas M. <th...@m3...> - 2013-11-06 19:52:53
|
On Wednesday, 2013-11-06 at 13:40 +0100, Richard Weinberger wrote:
> On Tue, Nov 5, 2013 at 9:21 PM, Thomas Meyer <th...@m3...> wrote:
> > Hi,
> >
> > I'm running Fedora 20 inside a 3.12 UML kernel and the "yum upgrade -y"
> > command seems to get stuck after a while/few minutes.
> >
> > Any ideas what's going on here? How to debug this?
> >
> > It looks like the process running yum is in state ptrace stopped, but
> > doesn't continue.
>
> Got only yum stuck, or the whole UML kernel?

How to tell? It feels like the whole kernel got stuck.

> Does yum do a ptrace() within UML, or did you observe that from the outside?

I saw it from the outside, in the process listing below.

> > The process tree looks also strange:
> >
> > [process listing snipped]
>
> Do the tasks remain in state Z, or are they flipping around?

All processes remain in their state; nothing seems to happen any more.

> Maybe the UML userspace creates many threads and on the host side UML
> didn't call wait() yet...

I don't think so. yum is probably not a big thread user, I guess.
|
From: Richard W. <ri...@no...> - 2013-11-06 19:59:10
|
On 06.11.2013 20:52, Thomas Meyer wrote:
> On Wednesday, 2013-11-06 at 13:40 +0100, Richard Weinberger wrote:
>> On Tue, Nov 5, 2013 at 9:21 PM, Thomas Meyer <th...@m3...> wrote:
>>> Hi,
>>>
>>> I'm running Fedora 20 inside a 3.12 UML kernel and the "yum upgrade -y"
>>> command seems to get stuck after a while/few minutes.
>>>
>>> Any ideas what's going on here? How to debug this?
>>>
>>> It looks like the process running yum is in state ptrace stopped, but
>>> doesn't continue.
>>
>> Got only yum stuck, or the whole UML kernel?
>
> How to tell? It feels like the whole kernel got stuck.

Log in on another shell... :)

>> Does yum do a ptrace() within UML, or did you observe that from the outside?
>
> I saw it from the outside, in the process listing below.

Okay. All UML children do ptrace(), as UML uses ptrace() for system call emulation.

>>> The process tree looks also strange:
>>>
>>> [process listing snipped]
>>
>> Do the tasks remain in state Z, or are they flipping around?
>
> All processes remain in their state; nothing seems to happen any more.

OK. Then it would be nice to find out what the UML main thread does.
gdb can tell you.

>> Maybe the UML userspace creates many threads and on the host side UML
>> didn't call wait() yet...
>
> I don't think so. yum is probably not a big thread user, I guess.

Isn't it a huge chunk of python? ;-)

Thanks,
//richard
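For reference, a minimal sketch of the gdb approach suggested above, assuming the first ./linux process in the listing (20330) is the UML main thread:

  host$ gdb -batch -ex bt ./linux 20330    # attach to the running UML and print a backtrace of the main thread
  # when attaching interactively, UML's internal signal traffic (SIGSEGV, SIGUSR1) is best ignored first:
  host$ gdb ./linux 20330
  (gdb) handle SIGSEGV nostop noprint pass
  (gdb) bt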
|
From: Thomas M. <th...@m3...> - 2013-11-06 20:31:42
|
On Wednesday, 2013-11-06 at 20:59 +0100, Richard Weinberger wrote:
> On 06.11.2013 20:52, Thomas Meyer wrote:
> > On Wednesday, 2013-11-06 at 13:40 +0100, Richard Weinberger wrote:
> >> On Tue, Nov 5, 2013 at 9:21 PM, Thomas Meyer <th...@m3...> wrote:
> >>> Hi,
> >>>
> >>> I'm running Fedora 20 inside a 3.12 UML kernel and the "yum upgrade -y"
> >>> command seems to get stuck after a while/few minutes.
> >>>
> >>> Any ideas what's going on here? How to debug this?
> >>>
> >>> It looks like the process running yum is in state ptrace stopped, but
> >>> doesn't continue.
> >>
> >> Got only yum stuck or the whole UML kernel?
> >
> > How to tell? It feels like the whole kernel got stuck.
>
> Log in on another shell... :)
Oops, yes, of course! I tried that, and the getty was able to receive my
username, but after pressing Enter nothing happened any more; eventually a
timeout message appeared.

I logged in via mconsole, did an emergency sync and a halt command (see the
uml_mconsole sketch after the log below), and restarted UML. Now I'm getting
odd warnings and bad-page errors:
systemd-udevd[599]: starting version 208
EXT4-fs (ubda): re-mounted. Opts: (null)
systemd-journald[338]: Received request to flush runtime journal from PID 1
systemd-journald[338]: File /var/log/journal/8e4cbfea404512ae70096c6202c9a3bf/system.journal corrupted or uncleanly shut down, renaming and replacing.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at lib/timerqueue.c:74 timerqueue_del+0x57/0x60()
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.12.0-00048-gbe408cd #22
6036d968 60370947 00000000 00000000 00000009 60324f52 0000004a 6036d9b0
602972e5 6036da00 60030bc1 6036d9d0 90b153c0 6036da10 603816c0
6037e4e8
00000000 00000001 8dccbcf0 6036da10 60030d15 6036da30 601cf007
603816c0
Call Trace:
6036d9a8: [<602972e5>] dump_stack+0x17/0x19
6036d9b8: [<60030bc1>] warn_slowpath_common+0x71/0x90
6036da08: [<60030d15>] warn_slowpath_null+0x15/0x20
6036da18: [<601cf007>] timerqueue_del+0x57/0x60
6036da38: [<6004dfe6>] __remove_hrtimer+0x46/0xa0
6036da78: [<6004e548>] __hrtimer_start_range_ns+0xd8/0x1e0
6036dad8: [<6004e683>] hrtimer_start+0x13/0x20
6036dae8: [<60067e12>] __tick_nohz_idle_enter.constprop.28+0x2d2/0x360
6036db58: [<60068329>] tick_nohz_irq_exit+0x19/0x20
6036db68: [<60034f45>] irq_exit+0x85/0xa0
6036db78: [<6005dac8>] generic_handle_irq+0x28/0x30
6036db88: [<60014fa8>] do_IRQ+0x28/0x40
6036dba8: [<60016af0>] timer_handler+0x20/0x30
6036dbc8: [<6002834e>] real_alarm_handler+0x3e/0x50
6036dd38: [<60028b53>] os_nsecs+0x13/0x30
6036dd58: [<60028b53>] os_nsecs+0x13/0x30
6036dd78: [<60028a98>] timer_one_shot+0x68/0x80
6036dda8: [<60016acc>] itimer_next_event+0xc/0x10
6036ddb8: [<6006704a>] clockevents_program_event+0x6a/0xf0
6036dde8: [<6006780a>] tick_program_event+0x1a/0x20
6036ddf8: [<6004df9b>] hrtimer_force_reprogram+0x6b/0x70
6036de08: [<6004e03c>] __remove_hrtimer+0x9c/0xa0
6036de28: [<601cef40>] timerqueue_add+0x60/0xb0
6036de48: [<6004e514>] __hrtimer_start_range_ns+0xa4/0x1e0
6036dea8: [<6004e683>] hrtimer_start+0x13/0x20
6036deb8: [<6007c298>] rcu_sched_qs+0x78/0x90
6036dee8: [<60028224>] unblock_signals+0x64/0x80
6036df08: [<60015b0a>] arch_cpu_idle+0x3a/0x50
6036df18: [<6007c3cf>] rcu_idle_enter+0x6f/0xb0
6036df38: [<6005da4e>] cpu_startup_entry+0x8e/0xe0
6036df48: [<602991b9>] schedule_preempt_disabled+0x9/0x10
6036df58: [<60294228>] rest_init+0x68/0x70
6036df68: [<600027f5>] check_bugs+0xe/0x19
6036df78: [<60001635>] start_kernel+0x27f/0x286
6036df80: [<600011c5>] unknown_bootoption+0x0/0x185
6036dfb8: [<6000289a>] start_kernel_proc+0x31/0x35
6036dfd8: [<6001561a>] new_thread_handler+0x7a/0xa0
---[ end trace 41ecadffe5cf650c ]---
BUG: Bad page state in process systemd-journal pfn:2d25c
page:0000000062c97420 count:0 mapcount:1 mapping: (null) index:0x2
page flags: 0x80008(uptodate|swapbacked)
Modules linked in:
CPU: 0 PID: 338 Comm: systemd-journal Tainted: G W
3.12.0-00048-gbe408cd #22
8dc759b8 60370947 60309985 62c97420 00000000 62c97458 60383ac0 8dc75a00
602972e5 8dc75a30 60082ce0 ffffffffff0a0210 8dc75adc 8dc75a50
62c97420 8dc75a60
60082df0 60309982 62c97420 00080008 00000000 8dc75aa0 60084561
00000000
Call Trace:
8dc759f8: [<602972e5>] dump_stack+0x17/0x19
8dc75a08: [<60082ce0>] bad_page+0xb0/0x100
8dc75a38: [<60082df0>] free_pages_prepare+0xc0/0xd0
8dc75a68: [<60084561>] free_hot_cold_page+0x21/0x130
8dc75aa8: [<60084ace>] free_hot_cold_page_list+0x3e/0x60
8dc75ad8: [<60087958>] release_pages+0x158/0x1a0
8dc75b38: [<60087a62>] pagevec_lru_move_fn+0xc2/0xe0
8dc75b50: [<60087120>] __pagevec_lru_add_fn+0x0/0xc0
8dc75b98: [<60087e91>] lru_add_drain_cpu+0x71/0x80
8dc75bb8: [<60087f6b>] lru_add_drain+0xb/0x10
8dc75bc8: [<600a0136>] exit_mmap+0x46/0x170
8dc75bf0: [<60018a50>] copy_chunk_to_user+0x0/0x30
8dc75c28: [<6002e2a7>] mmput.part.62+0x27/0xc0
8dc75c48: [<6002e359>] mmput+0x19/0x20
8dc75c58: [<6003117a>] exit_mm+0x10a/0x150
8dc75ca8: [<60031de1>] do_exit+0x331/0x4a0
8dc75ce8: [<600330de>] do_group_exit+0x3e/0xd0
8dc75d18: [<6003de7b>] get_signal_to_deliver+0x1bb/0x4d0
8dc75d48: [<600535cb>] wake_up_state+0xb/0x10
8dc75d88: [<60016757>] kern_do_signal+0x57/0x150
8dc75e08: [<60028460>] set_signals+0x30/0x40
8dc75e28: [<6003d175>] force_sig_info+0xb5/0xd0
8dc75e68: [<60016871>] do_signal+0x21/0x30
8dc75e88: [<60017d14>] fatal_sigsegv+0x24/0x30
8dc75ea8: [<6002b2d3>] userspace+0x2c3/0x4d0
8dc75f78: [<600273d7>] save_registers+0x17/0x30
8dc75f88: [<6002df20>] arch_prctl+0x150/0x180
8dc75fd8: [<600156a9>] fork_handler+0x69/0x70
Disabling lock debugging due to kernel taint
BUG: Bad page state in process systemd-journal pfn:2cc8c
page:0000000062c82ea0 count:0 mapcount:1 mapping: (null) index:0x3
page flags: 0x80008(uptodate|swapbacked)
Modules linked in:
CPU: 0 PID: 338 Comm: systemd-journal Tainted: G B W
3.12.0-00048-gbe408cd #22
8dc759b8 60370947 60309985 62c82ea0 00000000 62c82ed8 60383ac0 8dc75a00
602972e5 8dc75a30 60082ce0 ffffffffff0a0210 8dc75adc 8dc75a50
62c82ea0 8dc75a60
60082df0 60309982 62c82ea0 00080008 00000000 8dc75aa0 60084561
00000000
Call Trace:
8dc759f8: [<602972e5>] dump_stack+0x17/0x19
8dc75a08: [<60082ce0>] bad_page+0xb0/0x100
8dc75a38: [<60082df0>] free_pages_prepare+0xc0/0xd0
8dc75a68: [<60084561>] free_hot_cold_page+0x21/0x130
8dc75aa8: [<60084ace>] free_hot_cold_page_list+0x3e/0x60
8dc75ad8: [<60087958>] release_pages+0x158/0x1a0
8dc75b38: [<60087a62>] pagevec_lru_move_fn+0xc2/0xe0
8dc75b50: [<60087120>] __pagevec_lru_add_fn+0x0/0xc0
8dc75b98: [<60087e91>] lru_add_drain_cpu+0x71/0x80
8dc75bb8: [<60087f6b>] lru_add_drain+0xb/0x10
8dc75bc8: [<600a0136>] exit_mmap+0x46/0x170
8dc75bf0: [<60018a50>] copy_chunk_to_user+0x0/0x30
8dc75c28: [<6002e2a7>] mmput.part.62+0x27/0xc0
8dc75c48: [<6002e359>] mmput+0x19/0x20
8dc75c58: [<6003117a>] exit_mm+0x10a/0x150
8dc75ca8: [<60031de1>] do_exit+0x331/0x4a0
8dc75ce8: [<600330de>] do_group_exit+0x3e/0xd0
8dc75d18: [<6003de7b>] get_signal_to_deliver+0x1bb/0x4d0
8dc75d48: [<600535cb>] wake_up_state+0xb/0x10
8dc75d88: [<60016757>] kern_do_signal+0x57/0x150
8dc75e08: [<60028460>] set_signals+0x30/0x40
8dc75e28: [<6003d175>] force_sig_info+0xb5/0xd0
8dc75e68: [<60016871>] do_signal+0x21/0x30
8dc75e88: [<60017d14>] fatal_sigsegv+0x24/0x30
8dc75ea8: [<6002b2d3>] userspace+0x2c3/0x4d0
8dc75f78: [<600273d7>] save_registers+0x17/0x30
8dc75f88: [<6002df20>] arch_prctl+0x150/0x180
8dc75fd8: [<600156a9>] fork_handler+0x69/0x70
BUG: failure at mm/slab.c:1813/kmem_freepages()!
Kernel panic - not syncing: BUG!
CPU: 0 PID: 338 Comm: systemd-journal Tainted: G B W
3.12.0-00048-gbe408cd #22
8dc75988 60370947 00000000 602ffc6e 8de9a000 9080cdf0 00000000 8dc759d0
602972e5 8dc75ad0 60294a4a 00000000 3000000008 8dc75ae0 8dc75a00
8dc75a30
60423e00 00000007 00000006 8dc75720 000000f7 00000715 602a4944
ffffffffffffffff
Call Trace:
8dc759c8: [<602972e5>] dump_stack+0x17/0x19
8dc759d8: [<60294a4a>] panic+0xf4/0x1e2
8dc75a68: [<60084619>] free_hot_cold_page+0xd9/0x130
8dc75aa8: [<60087362>] __put_single_page+0x22/0x30
8dc75ad8: [<600ae4a5>] kmem_freepages.isra.69+0x135/0x140
8dc75af8: [<600aedb9>] slab_destroy+0x29/0x60
8dc75b18: [<600aef2c>] free_block+0x13c/0x150
8dc75b48: [<60296210>] cache_flusharray+0x60/0x84
8dc75b78: [<600aed87>] kmem_cache_free+0xa7/0xb0
8dc75ba8: [<6009d1f5>] remove_vma+0x45/0x50
8dc75bc8: [<600a01b4>] exit_mmap+0xc4/0x170
8dc75c28: [<6002e2a7>] mmput.part.62+0x27/0xc0
8dc75c48: [<6002e359>] mmput+0x19/0x20
8dc75c58: [<6003117a>] exit_mm+0x10a/0x150
8dc75ca8: [<60031de1>] do_exit+0x331/0x4a0
8dc75ce8: [<600330de>] do_group_exit+0x3e/0xd0
8dc75d18: [<6003de7b>] get_signal_to_deliver+0x1bb/0x4d0
8dc75d48: [<600535cb>] wake_up_state+0xb/0x10
8dc75d88: [<60016757>] kern_do_signal+0x57/0x150
8dc75e08: [<60028460>] set_signals+0x30/0x40
8dc75e28: [<6003d175>] force_sig_info+0xb5/0xd0
8dc75e68: [<60016871>] do_signal+0x21/0x30
8dc75e88: [<60017d14>] fatal_sigsegv+0x24/0x30
8dc75ea8: [<6002b2d3>] userspace+0x2c3/0x4d0
8dc75f78: [<600273d7>] save_registers+0x17/0x30
8dc75f88: [<6002df20>] arch_prctl+0x150/0x180
8dc75fd8: [<600156a9>] fork_handler+0x69/0x70
Modules linked in:
Pid: 338, comm: systemd-journal Tainted: G B W
3.12.0-00048-gbe408cd
RIP: 0033:[<0000000041621463>]
RSP: 0000007fbfe99b28 EFLAGS: 00000246
RAX: 0000000000000001 RBX: 0000007fbfe99b40 RCX: ffffffffffffffff
RDX: 0000000000000001 RSI: 0000007fbfe99b30 RDI: 0000000000000007
RBP: 00000000ffffffff R08: 0000000000000001 R09: 000000552ace5943
R10: 00000000ffffffff R11: 0000000000000246 R12: 0000007fbfe99b30
R13: 00000000000003e8 R14: 0004ea87f4e0a28c R15: 0000000000000000
Call Trace:
8dc75968: [<6001837b>] panic_exit+0x2b/0x50
8dc75978: [<60016990>] show_stack+0x40/0xe0
8dc75988: [<6004fb2c>] notifier_call_chain+0x4c/0x70
8dc759c8: [<6004fb71>] atomic_notifier_call_chain+0x11/0x20
8dc759d8: [<60294a5b>] panic+0x105/0x1e2
8dc75a68: [<60084619>] free_hot_cold_page+0xd9/0x130
8dc75aa8: [<60087362>] __put_single_page+0x22/0x30
8dc75ad8: [<600ae4a5>] kmem_freepages.isra.69+0x135/0x140
8dc75af8: [<600aedb9>] slab_destroy+0x29/0x60
8dc75b18: [<600aef2c>] free_block+0x13c/0x150
8dc75b48: [<60296210>] cache_flusharray+0x60/0x84
8dc75b78: [<600aed87>] kmem_cache_free+0xa7/0xb0
8dc75ba8: [<6009d1f5>] remove_vma+0x45/0x50
8dc75bc8: [<600a01b4>] exit_mmap+0xc4/0x170
8dc75c28: [<6002e2a7>] mmput.part.62+0x27/0xc0
8dc75c48: [<6002e359>] mmput+0x19/0x20
8dc75c58: [<6003117a>] exit_mm+0x10a/0x150
8dc75ca8: [<60031de1>] do_exit+0x331/0x4a0
8dc75ce8: [<600330de>] do_group_exit+0x3e/0xd0
8dc75d18: [<6003de7b>] get_signal_to_deliver+0x1bb/0x4d0
8dc75d48: [<600535cb>] wake_up_state+0xb/0x10
8dc75d88: [<60016757>] kern_do_signal+0x57/0x150
8dc75e08: [<60028460>] set_signals+0x30/0x40
8dc75e28: [<6003d175>] force_sig_info+0xb5/0xd0
8dc75e68: [<60016871>] do_signal+0x21/0x30
8dc75e88: [<60017d14>] fatal_sigsegv+0x24/0x30
8dc75ea8: [<6002b2d3>] userspace+0x2c3/0x4d0
8dc75f78: [<600273d7>] save_registers+0x17/0x30
8dc75f88: [<6002df20>] arch_prctl+0x150/0x180
8dc75fd8: [<600156a9>] fork_handler+0x69/0x70
------------[ cut here ]------------
WARNING: CPU: 0 PID: 618 at lib/timerqueue.c:74 timerqueue_del+0x57/0x60()
Modules linked in:
CPU: 0 PID: 618 Comm: plymouthd Not tainted 3.12.0-00048-gbe408cd #22
6036f5b8 60370947 6004fc41 00000000 00000009 60324f52 0000004a 6036f600
602972e5 6036f650 60030bc1 6036f830 6036f630 00000000 603816c0
6037e4e8
00000002 00000000 603816c0 6036f660 60030d15 6036f680 601cf007
603816c0
Call Trace:
6036f5c8: [<6004fc41>] raw_notifier_call_chain+0x11/0x20
6036f5f8: [<602972e5>] dump_stack+0x17/0x19
6036f608: [<60030bc1>] warn_slowpath_common+0x71/0x90
6036f658: [<60030d15>] warn_slowpath_null+0x15/0x20
6036f668: [<601cf007>] timerqueue_del+0x57/0x60
6036f688: [<6004dfe6>] __remove_hrtimer+0x46/0xa0
6036f6c8: [<6004e2fd>] __run_hrtimer.isra.35+0x2d/0xf0
6036f6e8: [<6004e7c7>] hrtimer_interrupt+0xb7/0x210
6036f738: [<60016a3f>] um_timer+0xf/0x20
6036f748: [<6005e0f1>] handle_irq_event_percpu+0x31/0x130
6036f798: [<6005e213>] handle_irq_event+0x23/0x40
6036f7b8: [<60060287>] handle_edge_irq+0x67/0x120
6036f7d8: [<6005dac8>] generic_handle_irq+0x28/0x30
6036f7e8: [<60014fa3>] do_IRQ+0x23/0x40
6036f808: [<60016af0>] timer_handler+0x20/0x30
6036f828: [<6002834e>] real_alarm_handler+0x3e/0x50
6036f838: [<6001a400>] winch_thread+0x0/0x1b0
6036f8b0: [<6003f612>] sigsuspend+0x22/0x90
6036fb48: [<6002840a>] alarm_handler+0x3a/0x50
6036fb68: [<60027f7d>] hard_handler+0x7d/0xc0
6036fba0: [<6001a400>] winch_thread+0x0/0x1b0
6036fc18: [<6001a400>] winch_thread+0x0/0x1b0
6036fc68: [<6003f612>] sigsuspend+0x22/0x90
---[ end trace 41ecadffe5cf650c ]---
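For reference, the emergency sync and halt mentioned above can be driven from the host with uml_mconsole; a sketch, assuming the umid=fedora20 from the UML command line:

  host$ uml_mconsole fedora20 sysrq s   # emergency sync
  host$ uml_mconsole fedora20 sysrq u   # remount filesystems read-only
  host$ uml_mconsole fedora20 halt      # shut the instance down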
>
> >> Does yum do a ptrace() within UML, or did you observe that from the outside?
> >
> > I saw it from outside in below process listing.
>
> Okay. All UML children do ptrace(), as UML uses ptrace() for system call emulation.
>
> >>
> >>> The process tree looks also strange:
> >>>
> >>> 20330 pts/3 S+ 1:18 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20337 pts/3 S+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20338 pts/3 S+ 0:03 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20339 pts/3 S+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20347 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20405 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20469 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20615 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #1 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipeiW6d5k
> >>> 20625 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipeiW6d5k
> >>> 20626 ? Zs 0:00 | \_ [linux] <defunct>
> >>> 20630 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20642 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20650 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20651 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20663 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20681 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20684 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20690 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20691 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20699 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20709 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20722 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20754 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #2 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipetxRIbS
> >>> 20757 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipetxRIbS
> >>> 20755 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #6 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipedhXmGp
> >>> 20762 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipedhXmGp
> >>> 20758 ? Zs 0:00 | \_ [linux] <defunct>
> >>> 20760 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20763 ? Zs 0:00 | \_ [linux] <defunct>
> >>> 20797 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #3 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipeULItXd
> >>> 20812 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipeULItXd
> >>> 20813 ? Zs 0:00 | \_ [linux] <defunct>
> >>> 20815 pts/3 S+ 0:00 | \_ xterm -T Virtual Console #5 (fedora20) -e port-helper -uml-socket /tmp/xterm-pipeaKUbD3
> >>> 20876 ? Ss 0:00 | | \_ port-helper -uml-socket /tmp/xterm-pipeaKUbD3
> >>> 20877 ? Zs 0:00 | \_ [linux] <defunct>
> >>> 20896 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 20909 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 21005 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 21007 pts/3 Z+ 0:00 | \_ [uml_net] <defunct>
> >>> 21019 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 21112 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 21125 pts/3 Z+ 0:00 | \_ [uml_net] <defunct>
> >>> 22164 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 22211 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 22224 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 22380 pts/3 t+ 0:51 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 21965 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 21968 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 21983 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 22053 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 22058 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>> 22887 pts/3 t+ 0:00 | \_ ./linux ubd0=ext3fs.img mem=768M systemd.unit=multi-user.target umid=fedora20
> >>
> >> Do the tasks remain in state Z, or are they flipping around?
> >
> > All processes remain in their state; nothing seems to happen any more.
>
> OK. Then it would be nice to find out what the UML main thread does.
> gdb can tell you.
>
> >> Maybe the UML userspace creates many threads and on the host side UML
> >> didn't call wait() yet...
> >
> > I don't think so. yum is probably not a big thread user, I guess.
>
> Isn't it a huge chunk of python? ;-)
>
> Thanks,
> //richard
|
|
From: Thomas M. <th...@m3...> - 2014-03-14 15:29:42
|
On Wednesday, 2013-11-06 at 20:59 +0100, Richard Weinberger wrote:
> On 06.11.2013 20:52, Thomas Meyer wrote:
> > On Wednesday, 2013-11-06 at 13:40 +0100, Richard Weinberger wrote:
> >> On Tue, Nov 5, 2013 at 9:21 PM, Thomas Meyer <th...@m3...> wrote:
> >>> Hi,
> >>>
> >>> I'm running Fedora 20 inside a 3.12 UML kernel and the "yum upgrade -y"
> >>> command seems to get stuck after a while/few minutes.
> >>>
> >>> Any ideas what's going on here? How to debug this?
> >>>
> >>> It looks like the process running yum is in state ptrace stopped, but
> >>> doesn't continue.
> >>
> >> Got only yum stuck, or the whole UML kernel?
> >

Only some processes get stuck.

After enabling hung task detection in the kernel I see this in the logs:

[ 8040.100000] INFO: task jbd2/ubda-8:308 blocked for more than 120 seconds.
[ 8040.100000]       Not tainted 3.13.6 #24
[ 8040.100000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8040.100000] jbd2/ubda-8     D 0000000040aa1c83     0   308      2 0x00000000
[ 8040.100000] Stack:
[ 8040.100000]  600795a0 603f3120 b066db70 6006476d
[ 8040.100000]  1b066dbb0 b0631040 b066db80 6001c0a3
[ 8040.100000]  600795a0 b066b380 b066dbe0 60314b4e
[ 8040.100000] Call Trace:
[ 8040.100000]  [<600795a0>] ? rcu_sched_qs+0x0/0xd0
[ 8040.100000]  [<6006476d>] ? dequeue_task+0x1d/0x50
[ 8040.100000]  [<6001c0a3>] __switch_to+0x53/0x90
[ 8040.100000]  [<600795a0>] ? rcu_sched_qs+0x0/0xd0
[ 8040.100000]  [<60314b4e>] __schedule+0x1ae/0x470
[ 8040.100000]  [<60068900>] ? pick_next_task_fair+0x0/0x1a0
[ 8040.100000]  [<6006aa70>] ? prepare_to_wait+0x0/0x90
[ 8040.100000]  [<60314e43>] schedule+0x33/0x80
[ 8040.100000]  [<6020b9d4>] ? submit_bio+0xa4/0x1c0
[ 8040.100000]  [<60315030>] io_schedule+0x60/0x90
[ 8040.100000]  [<6010b7b0>] sleep_on_buffer+0x10/0x20
[ 8040.100000]  [<603153a3>] __wait_on_bit+0x63/0xa0
[ 8040.100000]  [<6010b7a0>] ? sleep_on_buffer+0x0/0x20
[ 8040.100000]  [<6010b7a0>] ? sleep_on_buffer+0x0/0x20
[ 8040.100000]  [<60315466>] out_of_line_wait_on_bit+0x86/0xa0
[ 8040.100000]  [<6010e312>] ? submit_bh+0x12/0x20
[ 8040.100000]  [<6006aed0>] ? wake_bit_function+0x0/0x40
[ 8040.100000]  [<600645e3>] ? __might_sleep+0x153/0x170
[ 8040.100000]  [<60064490>] ? __might_sleep+0x0/0x170
[ 8040.100000]  [<6010b890>] ? __wait_on_buffer+0x0/0x40
[ 8040.100000]  [<6010b8c4>] __wait_on_buffer+0x34/0x40
[ 8040.100000]  [<6019a166>] jbd2_journal_commit_transaction+0x1696/0x19a0
[ 8040.100000]  [<60315060>] ? _cond_resched+0x0/0x50
[ 8040.100000]  [<6010d200>] ? __brelse+0x0/0x30
[ 8040.100000]  [<600330c0>] ? block_signals+0x0/0x20
[ 8040.100000]  [<6006abff>] ? finish_wait+0x6f/0x90
[ 8040.100000]  [<6006a950>] ? __wake_up+0x0/0x70
[ 8040.100000]  [<600487c0>] ? del_timer+0x0/0x60
[ 8040.100000]  [<6006a950>] ? __wake_up+0x0/0x70
[ 8040.100000]  [<6019dd5f>] kjournald2+0xdf/0x2d0
[ 8040.100000]  [<60068900>] ? pick_next_task_fair+0x0/0x1a0
[ 8040.100000]  [<6006ae90>] ? autoremove_wake_function+0x0/0x40
[ 8040.100000]  [<6019dc80>] ? kjournald2+0x0/0x2d0
[ 8040.100000]  [<6006a670>] ? __init_waitqueue_head+0x0/0x10
[ 8040.100000]  [<6019dc80>] ? kjournald2+0x0/0x2d0
[ 8040.100000]  [<6006a670>] ? __init_waitqueue_head+0x0/0x10
[ 8040.100000]  [<6005ce44>] kthread+0x114/0x140
[ 8040.100000]  [<6006442d>] ? finish_task_switch.isra.72+0x2d/0x90
[ 8040.100000]  [<600656a2>] ? schedule_tail+0x22/0xd0
[ 8040.100000]  [<6001be21>] new_thread_handler+0x81/0xb0
[ 8040.100000]
[ 8040.100000] INFO: task yum:1082 blocked for more than 120 seconds.
[ 8040.100000]       Not tainted 3.13.6 #24
[ 8040.100000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8040.100000] yum             D 0000000040aa1c83     0  1082   1073 0x00000000
[ 8040.100000] Stack:
[ 8040.100000]  600795a0 603f3120 aae3fc80 6006476d
[ 8040.100000]  162b83078 b0631040 aae3fc90 6001c0a3
[ 8040.100000]  600795a0 aadeab40 aae3fcf0 60314b4e
[ 8040.100000] Call Trace:
[ 8040.100000]  [<600795a0>] ? rcu_sched_qs+0x0/0xd0
[ 8040.100000]  [<6006476d>] ? dequeue_task+0x1d/0x50
[ 8040.100000]  [<6001c0a3>] __switch_to+0x53/0x90
[ 8040.100000]  [<600795a0>] ? rcu_sched_qs+0x0/0xd0
[ 8040.100000]  [<60314b4e>] __schedule+0x1ae/0x470
[ 8040.100000]  [<60068900>] ? pick_next_task_fair+0x0/0x1a0
[ 8040.100000]  [<6003344f>] ? set_signals+0x3f/0x50
[ 8040.100000]  [<6006ace0>] ? prepare_to_wait_event+0x0/0x110
[ 8040.100000]  [<60314e10>] ? schedule+0x0/0x80
[ 8040.100000]  [<60314e43>] schedule+0x33/0x80
[ 8040.100000]  [<6006ace0>] ? prepare_to_wait_event+0x0/0x110
[ 8040.100000]  [<6019d3a3>] jbd2_log_wait_commit+0xa3/0x110
[ 8040.100000]  [<6006ae90>] ? autoremove_wake_function+0x0/0x40
[ 8040.100000]  [<6019f228>] jbd2_complete_transaction+0x48/0x90
[ 8040.100000]  [<6015162b>] ext4_sync_file+0x28b/0x320
[ 8040.100000]  [<600dba2a>] ? vfs_read+0x13a/0x190
[ 8040.100000]  [<6010a133>] do_fsync+0x53/0x80
[ 8040.100000]  [<600db73d>] ? SyS_lseek+0x7d/0x90
[ 8040.100000]  [<6010a442>] SyS_fsync+0x12/0x20
[ 8040.100000]  [<6001fa68>] handle_syscall+0x68/0x90
[ 8040.100000]  [<600370fd>] userspace+0x4fd/0x600
[ 8040.100000]  [<60031e0f>] ? save_registers+0x1f/0x40
[ 8040.100000]  [<6003a2c7>] ? arch_prctl+0x177/0x1b0
[ 8040.100000]  [<6001bed5>] fork_handler+0x85/0x90
[ 8040.100000]

Any ideas? Some synchronisation error in ext4?

> Thanks,
> //richard
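For reference, the hung task detector used above is enabled roughly like this; a sketch, assuming a self-built guest kernel:

  # guest kernel config
  CONFIG_DETECT_HUNG_TASK=y
  # runtime knobs inside the guest
  echo 120 > /proc/sys/kernel/hung_task_timeout_secs   # warn after 120s in uninterruptible sleep
  echo 1 > /proc/sys/kernel/hung_task_panic            # optional: panic instead of only warning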
|
From: Richard W. <ri...@no...> - 2014-03-16 15:10:08
|
On 14.03.2014 15:57, Thomas Meyer wrote:
> On Wednesday, 2013-11-06 at 20:59 +0100, Richard Weinberger wrote:
>> On 06.11.2013 20:52, Thomas Meyer wrote:
>>> On Wednesday, 2013-11-06 at 13:40 +0100, Richard Weinberger wrote:
>>>> On Tue, Nov 5, 2013 at 9:21 PM, Thomas Meyer <th...@m3...> wrote:
>>>>> Hi,
>>>>> [...]
>>>>
>>>> Got only yum stuck, or the whole UML kernel?
>>>
>
> Only some processes get stuck.
>
> After enabling hung task detection in the kernel I see this in the logs:
>
> [ 8040.100000] INFO: task jbd2/ubda-8:308 blocked for more than 120 seconds.
> [hung task traces snipped]
>
> Any ideas? Some synchronisation error in ext4?

Hmm, maybe you suffer from the same issue this patch tries to address:
https://lkml.org/lkml/2014/2/14/733

Thanks,
//richard
|
From: Thomas M. <th...@m3...> - 2014-03-18 14:17:08
|
On Sunday, 2014-03-16 at 16:09 +0100, Richard Weinberger wrote:
> On 14.03.2014 15:57, Thomas Meyer wrote:
> > [earlier quotes and hung task traces snipped]
> >
> > Any ideas? Some synchronisation error in ext4?
>
> Hmm, maybe you suffer from the same issue this patch tries to address:
> https://lkml.org/lkml/2014/2/14/733

Hi,

Many thanks for this tip. After applying the above patch, the hung tasks
don't appear any more!

But I get similar backtraces of hung tasks when suspending the host kernel
and resuming it after 8 hours or so. That is probably a different error,
though.

> Thanks,
> //richard
|
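The hung task detection Thomas mentions enabling is the kernel's stock hung task watchdog, nothing UML-specific. As a rough sketch of how it is switched on and tuned — the option and sysctl names below are the standard ones from kernels of that era, not details taken from this thread:

  # build-time: the watchdog behind the "INFO: task ... blocked for more than 120 seconds" messages
  CONFIG_DETECT_HUNG_TASK=y
  CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120

  # run-time, inside the UML guest:
  sysctl kernel.hung_task_timeout_secs=120            # warn when a task sits in D state this long
  sysctl kernel.hung_task_panic=1                     # optional: panic instead of only warning
  echo 0 > /proc/sys/kernel/hung_task_timeout_secs    # 0 disables the check again

With hung_task_panic=1 the guest stops at the first blocked task, which can make it easier to inspect the state before more processes pile up behind the stuck journal thread.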
From: Jonathan R. <jon...@ti...> - 2014-05-20 14:52:42
|
On 2014-03-16, 11:09 AM, Richard Weinberger wrote:
> On 14.03.2014 15:57, Thomas Meyer wrote:
>>
>> only some processes get stuck.
>>
>> After enabling hung task detection in the kernel I see this in the logs:
>>
>> [ 8040.100000] INFO: task jbd2/ubda-8:308 blocked for more than 120
>> seconds.
...
>>
>> any ideas? some synchronisation error in ext4?
>
> Hmm, maybe you suffer from the same issue this patch tries to address:
> https://lkml.org/lkml/2014/2/14/733

I'm running into a very similar issue on vanilla 3.14.4 and 3.15-rc5; both appear to have that patch already applied.

I've tested ext4, ext3, btrfs, all with the same issue: processes writing to disk in the guest randomly hang permanently after a few minutes of uptime. The guest reports 100% iowait, the host has no load and is totally fine.

Any suggestions would be appreciated.

Thanks,

Jonathan

INFO: task jbd2/ubda-8:347 blocked for more than 120 seconds.
      Not tainted 3.14.4 #5
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/ubda-8 D 000000000042fc13 0 347 2 0x00000000
Stack:
 60538e60 7fb14820 8044dba0 80d2a4d0
 7fa5a480 60021e1f 60038290 6007db40
 80d2a1c0 6042adf2 00000000 6006d120
Call Trace:
 [<60021e1f>] ? __switch_to+0x4f/0x90
 [<60038290>] ? block_signals+0x0/0x20
 [<6007db40>] ? rcu_sched_qs+0x0/0xb0
 [<6042adf2>] ? __schedule+0x1c2/0x510
 [<6006d120>] ? pick_next_task_fair+0x0/0x190
 [<60119d50>] ? sleep_on_buffer+0x0/0x20
 [<60023340>] ? itimer_read+0x10/0x40
 [<60119d50>] ? sleep_on_buffer+0x0/0x20
 [<6042b173>] ? schedule+0x33/0x80
 [<60234023>] ? submit_bio+0xa3/0x1d0
 [<6042b692>] ? io_schedule+0xb2/0x130
 [<6006fc00>] ? prepare_to_wait+0x0/0x90
 [<60119d60>] ? sleep_on_buffer+0x10/0x20
 [<6042b8b0>] ? __wait_on_bit+0x60/0xa0
 [<60119d50>] ? sleep_on_buffer+0x0/0x20
 [<6042ba2b>] ? out_of_line_wait_on_bit+0x8b/0xa0
 [<601d1061>] ? journal_submit_commit_record.isra.25+0x181/0x240
 [<6006f940>] ? wake_bit_function+0x0/0x40
 [<6011b090>] ? __brelse+0x0/0x20
 [<6011b090>] ? __brelse+0x0/0x20
 [<6042b350>] ? _cond_resched+0x0/0x50
 [<601d25a1>] ? jbd2_journal_commit_transaction+0x1481/0x1770
 [<60038290>] ? block_signals+0x0/0x20
 [<6004cb1f>] ? lock_timer_base.isra.37+0x3f/0x80
 [<6004d6f0>] ? del_timer+0x0/0x60
 [<6006fae0>] ? __wake_up+0x0/0x60
 [<601d5b74>] ? kjournald2+0xd4/0x2b0
 [<6042ae0b>] ? __schedule+0x1db/0x510
 [<6006d120>] ? pick_next_task_fair+0x0/0x190
 [<6006f900>] ? autoremove_wake_function+0x0/0x40
 [<6006f9b0>] ? __init_waitqueue_head+0x0/0x10
 [<601d5aa0>] ? kjournald2+0x0/0x2b0
 [<6006f9b0>] ? __init_waitqueue_head+0x0/0x10
 [<600609ef>] ? kthread+0x10f/0x140
 [<6006817d>] ? finish_task_switch.isra.78+0x2d/0x90
 [<60069612>] ? schedule_tail+0x22/0xd0
 [<60021c02>] ? new_thread_handler+0x82/0xb0

INFO: task git:1150 blocked for more than 120 seconds.
      Not tainted 3.14.4 #5
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
git D 000000000042fc13 0 1150 1146 0x00000000
Stack:
 60538e60 6052b1e0 7e801ce0 7fb9b750
 807ddc00 60021e1f 60038290 6007db40
 7fb9b440 6042adf2 604324c0 6006d120
Call Trace:
 [<60021e1f>] ? __switch_to+0x4f/0x90
 [<60038290>] ? block_signals+0x0/0x20
 [<6007db40>] ? rcu_sched_qs+0x0/0xb0
 [<6042adf2>] ? __schedule+0x1c2/0x510
 [<6006d120>] ? pick_next_task_fair+0x0/0x190
 [<600385dc>] ? set_signals+0x3c/0x50
 [<6006fd20>] ? prepare_to_wait_event+0x0/0x110
 [<6042b140>] ? schedule+0x0/0x80
 [<6042b173>] ? schedule+0x33/0x80
 [<601d6693>] ? jbd2_log_wait_commit+0x93/0x100
 [<6006f900>] ? autoremove_wake_function+0x0/0x40
 [<6018b942>] ? ext4_sync_file+0x282/0x310
 [<601187aa>] ? do_fsync+0x4a/0x80
 [<60118b12>] ? SyS_fsync+0x12/0x20
 [<600256a0>] ? handle_syscall+0x60/0x80
 [<6003be9b>] ? userspace+0x49b/0x5a0
 [<600256f0>] ? copy_chunk_to_user+0x0/0x30
 [<60025985>] ? do_op_one_page+0x145/0x210
 [<60025c71>] ? copy_to_user+0x61/0xb0
 [<60036f5f>] ? save_registers+0x1f/0x40
 [<6003ec40>] ? arch_prctl+0x190/0x1c0
 [<60021cb5>] ? fork_handler+0x85/0x90

$ ps -eo pid,user,wchan=WIDE-WCHAN-COLUMN0000000000000 -o s,cmd
  PID USER     WIDE-WCHAN-COLUMN0000000000000 S CMD
    1 root     pick_next_task_fair            S /sbin/init
    2 root     pick_next_task_fair            S [kthreadd]
    3 root     pick_next_task_fair            S [ksoftirqd/0]
    4 root     pick_next_task_fair            S [kworker/0:0]
    5 root     pick_next_task_fair            S [kworker/0:0H]
    6 root     pick_next_task_fair            S [kworker/u2:0]
    7 root     pick_next_task_fair            S [watchdog/0]
    8 root     pick_next_task_fair            S [khelper]
    9 root     pick_next_task_fair            S [kdevtmpfs]
   10 root     pick_next_task_fair            S [netns]
   91 root     pick_next_task_fair            S [writeback]
   93 root     pick_next_task_fair            S [bioset]
   95 root     pick_next_task_fair            S [kblockd]
  113 root     pick_next_task_fair            S [kworker/0:1]
  123 root     pick_next_task_fair            S [khungtaskd]
  124 root     pick_next_task_fair            S [kswapd0]
  125 root     pick_next_task_fair            S [fsnotify_mark]
  249 root     pick_next_task_fair            S [ipv6_addrconf]
  346 root     pick_next_task_fair            S [deferwq]
  347 root     pick_next_task_fair            D [jbd2/ubda-8]
  348 root     pick_next_task_fair            S [ext4-rsv-conver]
  366 root     pick_next_task_fair            S [kworker/0:1H]
  468 root     pick_next_task_fair            S upstart-udev-bridge
  472 root     pick_next_task_fair            S /lib/systemd
  741 root     pick_next_task_fair            S [jbd2/ubdb-8]
  742 root     pick_next_task_fair            S [ext4-rsv-conver]
  890 root     pick_next_task_fair            S /usr/bin/docker -d
  967 syslog   pick_next_task_fair            S rsyslogd
 1006 root     pick_next_task_fair            S [kworker/u2:2]
 1036 root     pick_next_task_fair            S /sbin/getty -8 38400
 1037 root     pick_next_task_fair            S /sbin/getty -8 38400
 1040 root     pick_next_task_fair            S /sbin/getty -8 38400
 1041 root     pick_next_task_fair            S /bin/login --
 1043 root     pick_next_task_fair            S /sbin/getty -8 38400
 1089 root     pick_next_task_fair            S /usr/sbin/sshd -D
 1090 root     pick_next_task_fair            S cron
 1092 root     pick_next_task_fair            S sshd: ubuntu [priv]
 1096 root     pick_next_task_fair            S [kauditd]
 1106 ubuntu   pick_next_task_fair            S sshd: ubuntu@notty
 1107 ubuntu   pick_next_task_fair            S bash
 1112 root     pick_next_task_fair            S /sbin/getty -8 38400
 1132 root     pick_next_task_fair            S upstart-socket-bridge
 1133 root     pick_next_task_fair            S upstart-file-bridge
 1138 ubuntu   pick_next_task_fair            S /bin/bash
 1141 ubuntu   pick_next_task_fair            S git clone
 1142 ubuntu   pick_next_task_fair            S git-remote-https origin
 1146 ubuntu   pick_next_task_fair            S git fetch-pack
 1150 ubuntu   pick_next_task_fair            D git index-pack --stdin
 1160 ubuntu   pick_next_task_fair            S -bash
|
|
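Jonathan's ps invocation above prints the wait channel (wchan) and state of every process; the two D entries are the blocked tasks. A hedged sketch of two other ways to pull the same information out of a hung guest, assuming magic sysrq is available (PID 1150 is simply the stuck git index-pack from the listing above):

  # dump kernel stacks of all tasks in uninterruptible (D) sleep to the kernel log:
  echo 1 > /proc/sys/kernel/sysrq
  echo w > /proc/sysrq-trigger
  dmesg | tail -n 200

  # or inspect one stuck process directly:
  cat /proc/1150/wchan     # kernel symbol the task is sleeping in
  cat /proc/1150/stack     # full kernel stack (needs CONFIG_STACKTRACE)

Compared with waiting two minutes for khungtaskd to fire, sysrq-w gives the same style of backtrace on demand.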
From: Richard W. <ri...@no...> - 2014-05-26 14:36:46
|
On 20.05.2014 16:32, Jonathan Rudenberg wrote:
> On 2014-03-16, 11:09 AM, Richard Weinberger wrote:
>> On 14.03.2014 15:57, Thomas Meyer wrote:
>>>
>>> only some processes get stuck.
>>>
>>> After enabling hung task detection in the kernel I see this in the logs:
>>>
>>> [ 8040.100000] INFO: task jbd2/ubda-8:308 blocked for more than 120
>>> seconds.
> ...
>>>
>>> any ideas? some synchronisation error in ext4?
>>
>> Hmm, maybe you suffer from the same issue this patch tries to address:
>> https://lkml.org/lkml/2014/2/14/733
>
> I'm running into a very similar issue on vanilla 3.14.4 and 3.15-rc5, both appear to have that patch already applied.
>
> I've tested ext4, ext3, btrfs, all with the same issue: processes writing to disk in the guest randomly hang permanently after a few minutes of uptime. The guest reports 100%
> iowait, the host has no load and is totally fine.

Can you share your UML disk image? Maybe your userspace triggers something odd.
What distro is it?

Does it only happen with ubd disks? IOW, does NFS or hostfs work?

Thanks,
//richard
|
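Richard's last question is about taking the ubd driver out of the picture. As a rough sketch of what such a test could look like — the directory paths are placeholders, and hostfs support (CONFIG_HOSTFS) has to be built into the UML kernel — hostfs can be tried either as an extra mount inside the running guest or as the root filesystem:

  # inside the guest: mount a directory from the host via hostfs
  mkdir -p /mnt/host
  mount none /mnt/host -t hostfs -o /path/on/the/host

  # on the host: boot the same guest from an unpacked rootfs instead of a ubd image
  ./linux mem=768M root=/dev/root rootfstype=hostfs rootflags=/path/to/unpacked/rootfs

If the hangs disappear once no ubd device is involved, that points at the ubd layer rather than at ext4 or jbd2 inside the guest.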