ssi-addnode-dynamic causes new nodes to kernel panic
Running Debian 5 with kernel 2.6.14.
Whenever we attempt to use the ssi-addnode-dynamic all nodes booting will kernel panic when init calls the DHCP client (udhcpc). The cluster works perfectly when nodes are added manually with ssi-addnode. We have reproduced this in VMware (e1000 emulation) and on several dissimilar physical machines.
Further debug info is forthcoming. Any feedback on what you'd require?
We've tried many things, such as another DHCP client, a newly compiled version of udhcpc, and the version of udhcpc in the latest busybox, all to no avail. Has anyone had any success with addnode-dynamic?
On an unrelated note - Is this the right place to post? It has been difficult to find where this community hangs out.
"Whenever we attempt to use the ssi-addnode-dynamic all nodes booting will kernel panic"
What panic?
"On an unrelated note - Is this the right place to post?"
This is the right place to report bugs. If you have questions about using OpenSSI, check the openssi-users mailing list; for development, check the developers mailing list.
For general info check the website http://www.openssi.org
Unfortunately I'm unable to get the exact error, as I don't know how to dump the panic to any sort of log. I have stitched together as many frames as I could from a video capture of it here: http://i32.tinypic.com/2rw5h5y.jpg
It's crashing, possibly at a call to kmem_cache_alloc.
The call trace is:
sys_select -> do_select -> ssidev_poll_cli_alloc -> BANG!
Looks dodgy.
(By "looks dodgy" I mean it looks like we're trying to do some cluster stuff before the cluster is initialised.)
Mmm, if that is the case then the init script is broken. We did a lot of tracing and it happens right when udhcpc is called. You should be able to replicate the situation easily by making sure the node you're adding has no static IP entry and that you have udhcpc installed and added to /etc/mkinitrd/exe. Those are the conditions required for the init script to call udhcpc.
You can also see from the log that udhcpc does successfully start, but must explode somewhere within itself and cause init to die. I'm wondering if there isn't some dependency that we're missing for udhcpc?
I also noticed that busybox is installed, but not used? Should we be using a busybox environment instead?
initialize ssidev_poll_* structures early
Try attached patch for linux-2.6.11-ssi.
Mmm, I'm currently running 2.6.14. Are you suggesting I will need to build SSI against 2.6.11 with this patch?
You'll need to port Roger's patch from 2.6.11 to 2.6.14, or wait for me to do it. I should be able to get to it this weekend; sorry it will take so long, but the disk on my laptop blew up and I have to rebuild my build environment (lenny chroot).
Aye, that would be great, thanks. Take your time. :)