OpenSSI Clusters for Linux / Bugs / #70 Can't halt initnode

David Zafman - 2004-07-31

Another minor issue: the ramdisk tried to halt a booting initnode that
had failed to mount the root file system. Because of the way we perform
the halt, the node does not halt cleanly; it ends up panicking in
nodedown, because this was a simultaneous boot and other nodes were
present. Looking at the stack, we could fix cfs_nodedown_thread(), but I
believe that fixing the halt code in this bug report eliminates the need
to. Patching cfs_nodedown_thread() alone would not be enough anyway,
since other panics are possible given the bad state of this machine.

Creating root device
mkrootdev: label /1 not found
mount: special device /dev/root does not exist
ERROR: Mounting root file system failed.
Unable to continue. Halting.
nm_add_node: Node 3 added
nm_add_node: Node 2 added
nm_add_node: Node 4 added
RTNL: assertion failed at devinet.c(825)
RTNL: assertion failed at devinet.c(825)
RTNL: assertion failed at igmp.c(556)
RTNL: assertion failed at igmp.c(529)
flushing ide devices: hda
System halted.
Node 2 has gone down!!!
Node 3 h<as1 >gUonnabe led otwno! !h!an
leNo dkee rn4e lha Ns UgLLo npeo idontwenr! !!dreferenceUna abtle
vitort uhaanld laded kreersnse l0 00NU00L4L1 0pon tperri dnetirengfe
eriepnc: i<c40>2 a3t6 a2vdi
etu*apld ae d=d re0s0s00 00000000041O0ps : pr00i0nt0ig teliapn :t
nlicp0 2m3i6ia 2cdpqfc* pdsey m=53 c0800xx0 0s0d0_0od scsi_mod
mCPU: 0
EIP: 0060:[<c0236a2d>] Not tainted
EFLAGS: 00010286
EIP is at cfs_nodedown_thread [kernel] 0x1d (2.4.20sandbox-dzafman)
eax: 00000400 ebx: c32c8000 ecx: 00000000 edx: c3e0d800
esi: 00000000 edi: 00000000 ebp: c32c9fec esp: c32c9fe8
ds: 0068 es: 0068 ss: 0068
Process cfs failover (pid: 65689, stackpage=c32c9000)
Stack: c0236a10 00000000 c010776d 00000002 00000000 00000000
Call Trace:
[<c0236a10>] cfs_nodedown_thread [kernel] 0x0 (0xc32c9fe8)
[<c010776d>] kernel_thread_helper [kernel] 0x5 (0xc32c9ff0)
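
For illustration only, here is a userspace sketch of the kind of defensive
check that "fixing cfs_nodedown_thread()" might involve, assuming the oops
is a NULL pointer dereference on per-node state that was never set up on a
node that halted before mounting root. The struct, the table and
handle_nodedown() are all hypothetical names, not OpenSSI code, and the
comment above prefers fixing the halt code instead.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical per-node failover state; NOT the real OpenSSI CFS structure. */
struct node_state {
    int node_num;
    int mounts;          /* mounts that would need failover handling */
};

#define MAX_NODES 8

/* Hypothetical table. An entry stays NULL for a node whose state was never
 * initialized, e.g. because this node halted before mounting root. */
static struct node_state *node_table[MAX_NODES];

/* Returns 0 if nodedown processing ran, -1 if it was skipped. */
int handle_nodedown(int node_num)
{
    struct node_state *st;

    if (node_num < 0 || node_num >= MAX_NODES)
        return -1;

    st = node_table[node_num];
    if (st == NULL) {
        /* State was never set up: skip rather than dereference NULL. */
        return -1;
    }

    printf("node %d down, %d mounts to fail over\n", st->node_num, st->mounts);
    return 0;
}
```

A guard like this would only hide one symptom; it does not make the halt
itself clean, which is the actual subject of this report.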

Roger Tsang - 2007-10-12

labels: --> Booting / init

priority: 5 --> 3

Roger Tsang - 2007-10-12

Need to validate against SSI-1.9.3.

John Hughes - 2007-10-12

Still present in 1.9.3.

node1:~# clusternode_shutdown -h -N 1 now

Broadcast message from root (1/ttyS0) (Fri Oct 12 10:47:31 2007):

Node 1 is going down for system halt NOW!
[...]
Deactivating swap...done.
Unmounting file systems:
umount2: Device or resource busy
umount: /boot: device is busy
umount2: Device or resource busy
umount: /boot: device is busy
/boot:
Unmounting file systems (retry):
[...]
System halted.
Node 2 has gone down!!!

Debian GNU/Linux 3.1 node1 tty1

Node1 login:

Roger Tsang - 2008-04-20

You must have tested a non-initnode in 1.9.3 because `clusternode_shutdown -h -N {initnode_num}` has been disabled by dzafman.

2.0.0pre3 fixes this bug for `clusternode_shutdown -h -N {potential_initnode|compute_node}`.

Roger Tsang - 2008-04-20

milestone: --> v1.2.0