
#196 ssi-addnode-dynamic causes new nodes to kernel panic

Milestone: v1.9.3
Status: open
Owner: nobody
Priority: 5
Updated: 2010-07-24
Created: 2010-07-24
Creator: djdoomsday
Private: No

Running Debian 5 with kernel 2.6.14.
Whenever we attempt to use ssi-addnode-dynamic, all nodes booting will kernel panic when init calls the DHCP client (udhcpc). The cluster works perfectly when using manual ssi-addnode. This has been tested in VMware (e1000 emulation) and on several dissimilar physical machines.

Further debug info is forthcoming. Is there any particular information you'd require?

Discussion

  • djdoomsday

    djdoomsday - 2010-07-27

    We've tried many things, such as another DHCP client, a newly compiled version of udhcpc, and the version of udhcpc within the latest busybox, all to no avail. Has anyone had any success with addnode-dynamic?

    On an unrelated note: is this the right place to post? It has been difficult to find where this community hangs out.

     
  • John Hughes

    John Hughes - 2010-07-27

    "Whenever we attempt to use the ssi-addnode-dynamic all nodes booting will kernel panic"

    What panic?

    "On an unrelated note - Is this the right place to post?"

    This is the right place to report bugs. If you have questions about using OpenSSI, check the openssi-users mailing list. For development, check the developers mailing list.

    For general info check the website http://www.openssi.org

     
  • djdoomsday

    djdoomsday - 2010-07-27

    Unfortunately I'm unable to get the exact error, as I don't know how to dump the panic to any sort of log. I have stitched together as many frames as I could from a captured video session of it here: http://i32.tinypic.com/2rw5h5y.jpg

     
  • John Hughes

    John Hughes - 2010-07-27

    It may be crashing at a call to kmem_cache_alloc.

    The call trace is
    sys_select -> do_select -> ssidev_poll_cli_alloc -> BANG!

    Looks dodgy.

     
  • John Hughes

    John Hughes - 2010-07-27

    (By "looks dodgy" I mean it looks like we're trying to do some cluster stuff before the cluster is initialised.)
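
    A toy model of that suspicion, purely for illustration. This is not OpenSSI code: `PollClientCache`, `ClusterNotReady`, and `boot` are hypothetical names, and the real ssidev_poll_cli_alloc path may differ. It only shows why the order of "initialise the cache" and "first select()" would matter.

```python
class ClusterNotReady(Exception):
    """Raised when a cluster-side allocation is attempted before init."""


class PollClientCache:
    """Hypothetical stand-in for the cache behind ssidev_poll_cli_alloc."""

    def __init__(self):
        self._ready = False      # nothing is set up until init() runs

    def init(self):
        self._ready = True       # stands in for creating the slab cache

    def alloc(self):
        if not self._ready:
            # In the kernel there is no exception to raise: allocating from
            # a cache that was never created would typically oops or panic.
            raise ClusterNotReady("poll client cache used before init")
        return object()


def boot(select_before_init):
    """Replay one of the two orderings and report 'panic' or 'ok'."""
    cache = PollClientCache()
    try:
        if select_before_init:
            cache.alloc()        # a process calls select() first
            cache.init()
        else:
            cache.init()
            cache.alloc()
    except ClusterNotReady:
        return "panic"
    return "ok"
```

    In this model `boot(True)` corresponds to the dynamic case (udhcpc calling select() early) and `boot(False)` to a boot where the cluster structures already exist.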

     
  • Nobody/Anonymous

    mmm, if that is the case then the init script is broken. We did a lot of tracing and it happens right when udhcpc is called. You should be able to easily replicate the situation by making sure the node you're adding has no static IP entry and that udhcpc is installed and added to /etc/mkinitrd/exe. Those are the conditions required for the init script to call udhcpc.
    You can also see from the log that udhcpc does successfully start, but must explode somewhere within itself and cause init to die. I'm wondering if there isn't some dependency that we're missing for udhcpc?
    I also noticed that busybox is installed, but not used? Should we be using a busybox environment instead?
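
    For anyone trying to reproduce this, a rough check of the second condition might look like the sketch below. It assumes /etc/mkinitrd/exe is a plain list of executable paths, one per line, with `#` comments; the helper names `listed_executables` and `includes_udhcpc` are made up for this example. It does not check the static IP condition.

```python
import os


def listed_executables(text):
    """Return the paths named in an exe-list file, skipping blanks and comments."""
    paths = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing or full-line comments
        if line:
            paths.append(line)
    return paths


def includes_udhcpc(text):
    """True if any listed path has the base name 'udhcpc'."""
    return any(os.path.basename(p) == "udhcpc" for p in listed_executables(text))
```

    On the node image this could be called as `includes_udhcpc(open("/etc/mkinitrd/exe").read())`.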

     
  • Roger Tsang

    Roger Tsang - 2010-07-28

    initialize ssidev_poll_* structures early

     
  • Roger Tsang

    Roger Tsang - 2010-07-28

    Try attached patch for linux-2.6.11-ssi.

     
  • djdoomsday

    djdoomsday - 2010-07-28

    mmm, I'm currently running 2.6.14. Are you suggesting I will need to build SSI against 2.6.11 with this patch?

     
  • John Hughes

    John Hughes - 2010-07-28

    You'll need to port Roger's patch from 2.6.11 to 2.6.14, or wait for me to do it. I'll be able to get around to it this weekend. Sorry it will take so long, but the disk on my laptop blew up and I have to rebuild my build environment (lenny chroot).

     
  • djdoomsday

    djdoomsday - 2010-07-28

    Aye, that would be great, thanks. Take your time. :)

     
