Thread: [SSI-devel] [ ssic-linux-Bugs-924120 ] ethtool crashes system
From: SourceForge.net <no...@so...> - 2004-03-26 19:07:51
Bugs item #924120, was opened at 2004-03-27 00:37
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=924120&group_id=32541

Category: Networking
Group: None
Status: Open
Resolution: None
Priority: 6
Submitted By: Aneesh Kumar K.V (kvaneesh)
Assigned to: Kai-Min Sung (ksung)
Summary: ethtool crashes system

Initial Comment:
Laura,

I've used ethtool successfully before on the cluster. But, this time, I ran ethtool (on the external NIC of the initnode) and it caused the system to lock. I got the following on the console of the second nodes:

Instruction(i) breakpoint #0 at 0xc0124840 (adjusted)
0xc0124840 panic_hook: int3

Entering kdb (current=0xf71cc000, pid 131150) on processor 0 due to Breakpoint @ 0xc0124840
kdb> bt
Stacktrace for pid 131150
0xf71cc000 131150 2 1 0 R 0xf71cc420 *nm cli nd daemo
EBP        EIP        Function(args)
0xf71cdf4c 0xc0124840 panic_hook (0xc038b298, 0xc05bbae0, 0xf71cdf58, 0x0)
                      kernel .text 0xc0100000 0xc0124840 0xc0124850
           0xc0124897 panic+0x47 (0xc03a2840, 0x2e323931, 0x2e383631, 0x312e30, 0x0
                      kernel .text 0xc0100000 0xc0124850 0xc0124990
0xf71cdfd4 0xc026911e clms_attempt_failover+0x15e (0x1, 0x0)
                      kernel .text 0xc0100000 0xc0268fc0 0xc0269260
0xf71cdfec 0xc02902d5 nm_client_nodedown_daemon+0x55 (0x0)
                      kernel .text 0xc0100000 0xc0290280 0xc02902f0
           0xc0290280 nm_clietn_nodedown_daemon
                      kernel .text 0xc0100000 0xc0290280 0xc02902f0
           0xc0107889 kernel_thread_helper+0x5
                      kernel .text 0xc0100000 0xc0107884 0xc0107890

--
Jiann-Ming Su js...@em... 404-712-2603
Development Team Systems Administrator
General Libraries Systems Division

----------------------------------------------------------------------
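A side note on reading the backtrace: the values kdb prints in parentheses are raw stack words, not necessarily real function arguments. Three of the words shown on the panic frame happen to be printable ASCII when laid out in x86 little-endian byte order. The sketch below decodes them; the helper name decode_words is made up for illustration, and what the resulting string refers to is not stated anywhere in the report.

```python
import struct


def decode_words(words):
    """Pack 32-bit words little-endian (x86 byte order) and decode as ASCII.

    Trailing NUL padding from the last word is dropped.
    """
    raw = b"".join(struct.pack("<I", w) for w in words)
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


# Second to fourth words printed on the panic+0x47 frame in the report.
panic_words = [0x2E323931, 0x2E383631, 0x312E30]
print(decode_words(panic_words))  # prints 192.168.0.1
```

The words decode to what looks like an IPv4 address, plausibly part of the panic message text sitting on the stack. Which node that address belongs to cannot be determined from the backtrace alone.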
From: SourceForge.net <no...@so...> - 2004-03-26 22:16:47
Bugs item #924120, was opened at 2004-03-26 11:07
Message generated for change (Comment added) made by bjbrew

----------------------------------------------------------------------

>Comment By: Brian J. Watson (bjbrew)
Date: 2004-03-26 14:16

Message:
Logged In: YES
user_id=16302

The backtrace merely means that the second node lost its root node. What happened on the root node is much more interesting.

Brian

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-04-05 23:12:43
Bugs item #924120, was opened at 2004-03-26 11:07
Message generated for change (Comment added) made by bjbrew

>Assigned to: Kishore Sampathkumar (kishoreks)

----------------------------------------------------------------------

>Comment By: Brian J. Watson (bjbrew)
Date: 2004-04-05 16:12

Message:
Logged In: YES
user_id=16302

Kishore,

Duplicate this bug and find out what's happening on the init node. Then we'll have a better idea of what needs to be fixed.

Thanks,
Brian

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-04-06 10:33:53
Bugs item #924120, was opened at 2004-03-27 00:37
Message generated for change (Settings changed) made by keerthi

>Priority: 5
>Assigned to: Keerthi Bhushan (keerthi)

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-06-18 01:20:49
Bugs item #924120, was opened at 2004-03-26 11:07
Message generated for change (Comment added) made by brucewalker

----------------------------------------------------------------------

>Comment By: Bruce J. Walker (brucewalker)
Date: 2004-06-17 18:20

Message:
Logged In: YES
user_id=296932

I tried a simple "ethtool eth0" on an RH9 SSI rc5 system (on both the init node and a non-init node). I also ran "ethtool eth1". I had no problems.

See if you can reproduce this, and try to provide the bt from the node that panics.

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-06-18 01:23:54
Bugs item #924120, was opened at 2004-03-26 11:07
Message generated for change (Comment added) made by brucewalker

----------------------------------------------------------------------

>Comment By: Bruce J. Walker (brucewalker)
Date: 2004-06-17 18:23

Message:
Logged In: YES
user_id=296932

I tried a simple "ethtool eth0" on an RH9 SSI rc5 system (on both the init node and a non-init node). I also ran "ethtool eth1". I had no problems.

See if you can reproduce this, and try to provide the bt from the node that panics.

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-06-18 01:25:31
Bugs item #924120, was opened at 2004-03-26 11:07
Message generated for change (Comment added) made by brucewalker

----------------------------------------------------------------------

>Comment By: Bruce J. Walker (brucewalker)
Date: 2004-06-17 18:25

Message:
Logged In: YES
user_id=296932

I tried a simple "ethtool eth0" on an RH9 SSI rc5 system (on both the init node and a non-init node). I also ran "ethtool eth1". I had no problems.

See if you can reproduce this, and try to provide the bt from the node that panics.

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-07-07 12:27:48
Bugs item #924120, was opened at 2004-03-27 00:37
Message generated for change (Comment added) made by keerthi

----------------------------------------------------------------------

>Comment By: Keerthi Bhushan (keerthi)
Date: 2004-07-07 17:57

Message:
Logged In: YES
user_id=825227

Running "ethtool -t eth0" on a two-node RHEL-based SSI cluster caused the secondary node to attempt a failover, and since I had no failover capability configured, it crashed.

By default, "ethtool -t eth0" runs an offline self-test of the eth0 device. I noticed that "ethtool -t eth0 online" doesn't crash the secondary node. The cause is the momentary loss of connection with the init node during the offline test.

I have the following options:

1. Make ethtool crash the secondary node more gracefully (though I don't know how).
2. Make online the default option on SSI, and warn when the user requests the offline option that the secondary nodes might try to fail over (and potentially crash).

----------------------------------------------------------------------
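Option 2 above could be prototyped in user space before anyone touches ethtool itself. The following is only a minimal sketch under stated assumptions: the function name rewrite_selftest_args and the warning text are hypothetical, and it relies solely on the documented "ethtool -t DEVNAME [offline|online]" form, where a missing mode means offline. It is not a patch anyone proposed in this thread.

```python
# Hypothetical warning text; the wording is an assumption, not from ethtool.
OFFLINE_WARNING = (
    "warning: an offline self-test interrupts the link; on an SSI cluster "
    "the secondary nodes may try to fail over (and potentially crash)"
)


def rewrite_selftest_args(args):
    """Return (new_args, warning) for a list of ethtool arguments.

    Only invocations of the form `-t DEVNAME [MODE]` are changed:
    a missing MODE becomes "online", and an explicit "offline" is kept
    but flagged with a warning. Everything else passes through untouched.
    """
    new_args = list(args)
    warning = None
    if len(new_args) >= 2 and new_args[0] in ("-t", "--test"):
        if len(new_args) == 2:
            new_args.append("online")   # SSI default: non-disruptive test
        elif new_args[2] == "offline":
            warning = OFFLINE_WARNING   # user explicitly asked for offline
    return new_args, warning


for argv in (["-t", "eth0"], ["-t", "eth0", "offline"], ["-i", "eth0"]):
    new_args, warning = rewrite_selftest_args(argv)
    print("ethtool", " ".join(new_args), "| warn:", warning is not None)
```

A real wrapper would print the warning to stderr and then hand new_args to the actual ethtool binary. This only changes the default; it does nothing to stop a user who explicitly requests the offline test.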
From: SourceForge.net <no...@so...> - 2004-07-07 20:53:35
Bugs item #924120, was opened at 2004-03-26 11:07
Message generated for change (Comment added) made by bjbrew

----------------------------------------------------------------------

>Comment By: Brian J. Watson (bjbrew)
Date: 2004-07-07 13:53

Message:
Logged In: YES
user_id=16302

Another option is to make the ICS interface resistant to any meddling from user-mode.

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-10-26 06:43:04
Bugs item #924120, was opened at 2004-03-27 00:37
Message generated for change (Settings changed) made by keerthi

>Assigned to: Nobody/Anonymous (nobody)

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-10-27 19:22:13
|
Bugs item #924120, was opened at 2004-03-26 11:07 Message generated for change (Settings changed) made by bjbrew You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=405834&aid=924120&group_id=32541 Category: Networking Group: None Status: Open Resolution: None >Priority: 1 Submitted By: Aneesh Kumar K.V (kvaneesh) Assigned to: Nobody/Anonymous (nobody) Summary: ethtool crashes system |
From: SourceForge.net <no...@so...> - 2005-08-26 21:16:05
|
Bugs item #924120, was opened at 2004-03-26 14:07 Message generated for change (Comment added) made by rogertsang You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=405834&aid=924120&group_id=32541 Category: Networking Group: None >Status: Closed >Resolution: Works For Me Priority: 1 Submitted By: Aneesh Kumar K.V (kvaneesh) Assigned to: Nobody/Anonymous (nobody) Summary: ethtool crashes system ---------------------------------------------------------------------- >Comment By: Roger Tsang (rogertsang) Date: 2005-08-26 17:16 Message: Logged In: YES user_id=1246761 ethtool works for me on 1.2.2 and 1.9.1 ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=405834&aid=924120&group_id=32541 |
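For reference, the second option from Keerthi Bhushan's 2004-07-07 comment (default the self-test to online, warn on an explicit offline test) could be approximated in user space with a small argument-rewriting wrapper. The following is only a sketch under stated assumptions, not anything that was implemented in SSI: the function name rewrite_selftest_args and the warning text are hypothetical, and the only ethtool syntax relied on is the "-t|--test DEVNAME [offline|online]" form mentioned in the thread.

```python
def rewrite_selftest_args(argv):
    """Rewrite an ethtool argument list so self-tests default to online.

    Hypothetical helper (not part of ethtool or SSI). Returns a tuple
    (new_argv, warning), where warning is None unless an explicit
    offline self-test was requested.
    """
    args = list(argv)
    warning = None
    for i, arg in enumerate(args):
        if arg not in ("-t", "--test"):
            continue
        if i + 1 >= len(args):
            # No device name given; leave the error handling to ethtool.
            break
        mode = args[i + 2] if i + 2 < len(args) else None
        if mode is None:
            # No mode given: ethtool would default to offline, so pick online.
            args.insert(i + 2, "online")
        elif mode == "offline":
            # Keep the user's choice, but flag the risk described in the thread.
            warning = (
                "offline self-test may briefly drop the interconnect; "
                "secondary nodes might try to fail over"
            )
        break
    return args, warning
```

A wrapper script would call this on its own arguments, print the warning (if any) to stderr, and then run the real ethtool binary with the rewritten list. It does not address Brian J. Watson's alternative of hardening the ICS interface in the kernel.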