#103 crashes with 4KSTACKS

Milestone: v1.9.1
Status: closed-fixed
Owner: nobody
Priority: 1
Updated: 2008-01-02
Created: 2005-11-27
Creator: lmc
Private: No

The default kernel for 1.9, compiled with
CONFIG_4KSTACKS, crashes after having been used for a
while. Please consider not setting CONFIG_4KSTACKS by
default.

Discussion

  • Roger Tsang

    Roger Tsang - 2005-11-28

    Logged In: YES
    user_id=1246761

    Can you post any information that will clearly explain how
    the crash is directly related to CONFIG_4KSTACKS?

     
  • lmc

    lmc - 2005-11-28

    Logged In: YES
    user_id=1190855

    Sorry, I have no direct evidence, apart from my cluster having
    stayed up and running since I compiled my own kernel with 4KSTACKS
    disabled, after some days of sudden crashes following a standard
    install. At first I tried to collect evidence via
    netconsole: I applied the netconsole patch for sis900 and
    experienced the non-initnode netconsole crashes. After that
    I compiled a non-4KSTACKS kernel, and the crashes stopped.

     
  • Roger Tsang

    Roger Tsang - 2006-01-26
    • status: open --> open-wont-fix
     
  • Roger Tsang

    Roger Tsang - 2007-03-25
    • priority: 5 --> 7
    • status: open-wont-fix --> open
     
  • John Hughes

    John Hughes - 2007-08-07

    Logged In: YES
    user_id=166336
    Originator: NO

    For me it always crashes with 4KSTACKS if I try to use drbd. Non-drbd configs work OK with 4KSTACKS.

     
  • Roger Tsang

    Roger Tsang - 2007-08-13
    • milestone: --> v1.9.1
    • priority: 7 --> 3
     
  • Roger Tsang

    Roger Tsang - 2007-08-13

    Logged In: YES
    user_id=1246761
    Originator: NO

    Lowered priority; not a show stopper.

     
  • Roger Tsang

    Roger Tsang - 2007-10-12
    • priority: 3 --> 1
    • status: open --> open-later
     
  • Roger Tsang

    Roger Tsang - 2008-01-02
    • status: open-later --> closed-fixed
     
  • Roger Tsang

    Roger Tsang - 2008-01-02

    Logged In: YES
    user_id=1246761
    Originator: NO

    OPENSSI-2-0-0-PRE2 no longer has CONFIG_4KSTACKS as default.

     
  • John Hughes

    John Hughes - 2009-01-13

    One coding style in OpenSSI that leads to "excessive" stack usage is:

    some_cfs_func (...) {
        if (for this node) {
            do on this node;
        }
        else {
            rpcargs args;
            rpcret ret;
            args = ...;
            status = rpccall (...);
            ...
        }
    }

    The problem is that the stack space for the rpc args and ret is allocated even on the path where the operation is local. The local path is often (always?) the deeper one (especially when drbd is being used!).

    A "solution" may be to rework the function to look something like:

    some_cfs_func (...) {
        if (for this node) {
            do on this node;
        }
        else {
            status = some_cfs_func_remote (...);
        }
    }

    some_cfs_func_remote (...) {
        rpcargs args;
        rpcret ret;
        args = ...;
        status = rpccall (...);
        ...
    }

    Now the local path doesn't get the rpc args on the stack.

    (E.g., for cfs_proc_rename only 44 bytes are needed on the local path, as against 472 bytes on the rpc path.)
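
    The split described above can be sketched in compilable C. This is only an illustration, not the actual OpenSSI code: the struct layouts, THIS_NODE, do_local and the rpccall stub are all hypothetical stand-ins. One extra assumption is worth stating: if the remote helper is static, the compiler may inline it back into the caller and the large frame returns, so the sketch marks it noinline (GCC/Clang attribute syntax).

```c
#include <assert.h>
#include <string.h>

#define THIS_NODE 1   /* hypothetical: id of the local node */

/* Hypothetical stand-ins for the large RPC argument/result structures. */
struct rpcargs { char name[200]; int node; };
struct rpcret  { char data[200]; int status; };

/* Stub for the real RPC transport: echoes the args and reports success. */
static int rpccall(const struct rpcargs *args, struct rpcret *ret)
{
    memcpy(ret->data, args->name, sizeof ret->data);
    ret->status = 0;
    return ret->status;
}

/* Stub for the local operation; in practice this starts a deep call chain. */
static int do_local(const char *name)
{
    return (int)strlen(name);
}

/*
 * Remote path: the only function whose frame holds args and ret.
 * noinline stops the compiler from folding this frame into the caller.
 */
static __attribute__((noinline)) int some_cfs_func_remote(int node, const char *name)
{
    struct rpcargs args;
    struct rpcret ret;

    memset(&args, 0, sizeof args);
    args.node = node;
    strncpy(args.name, name, sizeof args.name - 1);
    return rpccall(&args, &ret);
}

/* Dispatcher: its own frame stays small on both paths. */
static int some_cfs_func(int node, const char *name)
{
    assert(name != NULL);
    if (node == THIS_NODE)
        return do_local(name);
    return some_cfs_func_remote(node, name);
}
```

    With these stubs, some_cfs_func(THIS_NODE, "abc") takes the local path and never reserves room for args and ret, while any other node id goes through some_cfs_func_remote. Taking the cfs_proc_rename figures quoted above, the saving on the local path would be 472 - 44 = 428 bytes, roughly a tenth of a 4096-byte stack.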

     
  • Roger Tsang

    Roger Tsang - 2009-01-21

    See the following:
    [ ssic-linux-Patches-2525429 ] help get backtraces of (near) stack overflow situations
    [ ssic-linux-Patches-2525436 ] reduce cluster/sync.h stack usage
    [ ssic-linux-Patches-2525437 ] reduce stack usage in do_ssi_write
    [ ssic-linux-Patches-2525441 ] reduce stack usage in local CFS path

     
