From: <fer...@ct...> - 2007-09-13 19:27:17
Hi,

After some more debugging I think I've solved the problem (at least :) An explanation follows.

First, although it initially seemed that the modules causing the problem were the networking-related ones (as David Fernández said in his mail), that assumption was wrong. In fact, the modules causing the problem are the ones that print a message to the "kernel message buffer" when modprobe loads them (I don't know the right name, but I mean the message log shown by the dmesg command). For example, ip_tables prints something like "ip_tables: (C) 2000-2006 Netfilter Core Team".

The solution to the problem is to use con1 instead of con0, I mean:

  ./linux ubd0=/tmp/root_fs_debug con=null con1=pts uml_dir=/tmp umid=run

instead of

  ./linux ubd0=/tmp/root_fs_debug con=null con0=pts uml_dir=/tmp umid=run

In that case 'iptables -L' works without problems.

Why? When the module is loaded, it prints its message not only to the internal kernel message buffer but also to con0 (this can be checked by simply running "./linux ubd0=/tmp/root_fs_debug", without any con= redirection). When con0 is redirected to null (con0=null) there is no problem, but if con0 is redirected to a pts (con0=pts) I guess that, at the moment of printing the message, some problem occurs with the output (as Jeff suggested in his mail), which causes the VM to hang.

This also explains why putting the module in /etc/modules works. The modules listed in /etc/modules are loaded before UML assigns the virtual consoles to pts devices (the sequence can be checked by looking at the boot log).

I think that my solution is more a workaround than a definitive solution. Why can modules not print their message when con0 is assigned to a pts (it hangs the VM), while it works when con0 is assigned to null? Is there a bug in the UML kernel that needs to be fixed? Or maybe the bug is in modprobe? I leave the question open for the experts in the UML internals... :)
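
For anyone who wants to compare the two cases side by side, here is a minimal sketch of a shell helper that prints the command line for each. The function name uml_cmdline is hypothetical (it is not part of UML); the paths and options are the ones used in this mail, and the hanging variant is taken from the explanation above (con0=pts).

```shell
#!/bin/sh
# Hypothetical helper (not part of UML): print the UML command line with
# the pts console attached to the given console number.
#   0 -> con0=pts, the case that hangs when a module prints on load
#   1 -> con1=pts, the workaround described in this mail
uml_cmdline() {
    con_index="$1"
    printf './linux ubd0=/tmp/root_fs_debug con=null con%s=pts uml_dir=/tmp umid=run\n' "$con_index"
}

# Print both variants so they can be copied into a terminal.
uml_cmdline 1
uml_cmdline 0
```

The helper only prints the commands; it does not start the VM, so it is safe to run on any machine.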

Regarding the tests suggested by Paolo:

> Please try logging in via SSH and reproducing the problem and the stacktrace,
> and also removing con=null - also have you double checked con=null is ok
> (maybe it was con=none, I'm not sure). I'm not sure screen is perfectly safe
> to use (it should be).

Do you really need me to perform these tests, or do you consider the report above to be enough? If it's really needed I can do them, but it would take me some time (and maybe it isn't a good idea now, because they won't provide additional useful information :)

Best regards,

--------------------
Fermín Galán Márquez
CTTC - Centre Tecnològic de Telecomunicacions de Catalunya
Parc Mediterrani de la Tecnologia, Av. del Canal Olímpic s/n, 08860 Castelldefels, Spain
Room 1.02
Tel : +34 93 645 29 12
Fax : +34 93 645 29 01
Email address: fermin dot galan at cttc dot es