From: Vlad Y. <vla...@hp...> - 2007-12-19 14:49:30
Nathan Straz wrote:
> On Dec 18 11:35, Vlad Yasevich wrote:
>> Nathan Straz wrote:
>>> Okay, so because I reuse the src port on the rebooted host, I get an
>>> association restart and the connection as far as the server is concerned
>>> is restored. Is that behavior specific to SCTP?
>> Yes.
>>
>>>> If your test doesn't expect this, you need to randomize your ports, but
>>>> even then this situation may happen anyway.
>>> I'm not sure how to set the source port of a connection so I think I'll
>>> teach my program how to handle the association restart. Is there a way
>>> to detect the association restart on the client or server side?
>> To set the source port of the connection, bind to an explicit port. You can
>> either call random() and bind to that random port or apply the kernel patch
>> and that will try to randomize ports picked by the kernel.
>
> Ah, good to know. I decided to just handle the reconnect since my
> protocol includes a join message. Everything was working great until I
> hit a panic on two nodes. I filed a bug on it in the Red Hat Bugzilla.
>
> http://bugzilla.redhat.com/show_bug.cgi?id=426234
>
> I don't know if this backtrace will ring any bells.

See if this fixes your issue:
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commitdiff_plain;h=f26f7c480555812ca7c4037e0a50fa54afe2cb4a

-vlad

>
> BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024
> printing eip:
> f8e12ffb
> *pde = 55add067
> Oops: 0000 [#1]
> SMP
> last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
> Modules linked in: sctp gfs(U) lock_dlm gfs2 dlm configfs autofs4 hidp rfcomm
> l2cap bluetooth sunrpc ipv6 dm_multipath video sbs backlight i2c_ec button
> battery asus_acpi ac lp intel_rng floppy e7xxx_edac e1000 edac_mc i2c_i801
> ide_cd parport_pc pcspkr i2c_core parport cdrom sg dm_snapshot dm_zero dm_mirror
> dm_mod qla2xxx scsi_transport_fc ata_piix libata sd_mod scsi_mod ext3 jbd
> ehci_hcd ohci_hcd uhci_hcd
> CPU: 0
> EIP: 0060:[<f8e12ffb>] Not tainted VLI
> EFLAGS: 00010286 (2.6.18-53.el5 #1)
> EIP is at sctp_get_port+0x12/0x44 [sctp]
> eax: 00000000 ebx: ef935b80 ecx: f8e22ae0 edx: ef935b80
> esi: ef935b80 edi: 00000000 ebp: 00000010 esp: ee797ea0
> ds: 007b es: 007b ss: 0068
> Process d_doio (pid: 9256, ti=ee797000 task=e4db8aa0 task.ti=ee797000)
> Stack: c05a5a4a f133e500 00000040 f7bb7d80 ee797ebc c05a4ad2 ef935b80 ef935b80
> ef935b80 ee797eec c05e6d1f f8e17bc0 c05e75f1 f8e17bc0 f5a8eb00 ee797eec
> 00000010 c05a46d8 00000002 2b810002 62590f0a 00000000 00000000 00000000
> Call Trace:
> [<c05a5a4a>] lock_sock+0x8e/0x96
> [<c05a4ad2>] sys_recvfrom+0x101/0x137
> [<c05e6d1f>] inet_autobind+0x1c/0x51
> [<c05e75f1>] inet_dgram_connect+0x30/0x4e
> [<c05a46d8>] sys_connect+0x7d/0xa9
> [<c0492369>] inotify_d_instantiate+0x3c/0x5f
> [<c0483308>] d_rehash+0x1c/0x2b
> [<c05a3e90>] sock_attach_fd+0x6c/0xcc
> [<c06058c7>] _spin_lock_bh+0x8/0x18
> [<c05a5a4a>] lock_sock+0x8e/0x96
> [<c05a5a4a>] lock_sock+0x8e/0x96
> [<c06058c7>] _spin_lock_bh+0x8/0x18
> [<c05a4ec1>] sys_socketcall+0x8c/0x19e
> [<c0407ef7>] do_syscall_trace+0xab/0xb1
> [<c0404eff>] syscall_call+0x7/0xb
> =======================
> Code: ff ff bd f4 ff ff ff e9 5f fb ff ff 81 c4 24 02 00 00 89 e8 5b 5e 5f 5d c3
> 57 89 d7 56 53 89 c3 83 ec 1c 8b 80 f4 01 00 00 89 da <8b> 48 24 89 e0 ff 51 30
> 0f b7 d7 89 d0 c1 e8 08 c1 e2 08 09 c2
> EIP: [<f8e12ffb>] sctp_get_port+0x12/0x44 [sctp]
> SS:ESP 0068:ee797ea0
> <0>Kernel panic - not syncing: Fatal exception
>
> Nate Straz
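
For reference, the advice quoted above (pick a port with random() and bind to it explicitly before connecting) might look roughly like the sketch below. This is only an illustration, not code from this thread: the helper name bind_random_port, the port range 49152-65535, and the retry-on-EADDRINUSE policy are all assumptions. It works on any AF_INET socket; for SCTP the caller would create the socket with IPPROTO_SCTP.

```c
#define _DEFAULT_SOURCE   /* for random() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the thread): bind fd to INADDR_ANY on a
 * randomly chosen port in the assumed range [49152, 65535].
 * Returns the bound port in host byte order, or -1 on failure.
 */
static int bind_random_port(int fd, int max_tries)
{
    assert(max_tries > 0);

    for (int i = 0; i < max_tries; i++) {
        struct sockaddr_in addr;
        /* 16384 candidate ports starting at 49152 */
        int port = 49152 + (int)(random() % 16384);

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons((uint16_t)port);

        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return port;          /* success: source port is now fixed */
        if (errno != EADDRINUSE)
            return -1;            /* unexpected error: give up */
        /* port already taken: try another random one */
    }
    return -1;
}
```

A caller would seed the generator once with srandom() (for example from the current time), call bind_random_port(fd, 32) on the freshly created socket, and then connect() as usual. Without seeding, random() yields the same sequence on every run, so a rebooted host would pick the same source port again.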