[Tipc-discussion] multicast and large messages

I started testing multicast with large messages, i.e. ones that need to
be fragmented, and noticed that they weren't being delivered.  I put some
debug prints in the driver and found that the messages were being sent
out OK, and that they were also being received and reassembled.  The
problem is in net_route_msg: it ends up calling net_route_named_msg,
which just throws the message away.  I'm not sure if this is the right
place to put this, but I added code to net_route_msg (see attached
patch) that helps.

--- net.c       9 Jun 2004 23:14:47 -0000       1.12
+++ net.c       14 Jun 2004 21:42:29 -0000
@@ -124,6 +124,7 @@
 #include "reg.h"
 #include "msg.h"
 #include "port.h"
+#include "bcast.h"
  
 /*
  * The TIPC locking policy is designed to ensure a very fine locking
@@ -321,7 +322,9 @@
                if (msg_isdata(msg)) {
                        if (msg_destport(msg))
                                port_recv_msg(buf);
-                       else
+                       else if (msg_mcast(msg))
+                               bcast_port_recv(buf);
+                       else
                                net_route_named_msg(buf);
                        return;
                }
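
To make the intended ordering of the checks explicit, here is a minimal
standalone sketch of the dispatch the patch is aiming for.  It is not
TIPC code: struct demo_msg, enum demo_route and demo_route_msg are
hypothetical stand-ins for the message header and for the
port_recv_msg / bcast_port_recv / net_route_named_msg calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a TIPC message header; not the real struct. */
struct demo_msg {
        int is_data;            /* mirrors msg_isdata()   */
        unsigned int destport;  /* mirrors msg_destport() */
        int is_mcast;           /* mirrors msg_mcast()    */
};

/* Hypothetical labels for the handlers a message could be passed to. */
enum demo_route {
        ROUTE_PORT,     /* port_recv_msg()        */
        ROUTE_MCAST,    /* bcast_port_recv()      */
        ROUTE_NAMED,    /* net_route_named_msg()  */
        ROUTE_OTHER     /* not a data message     */
};

/*
 * Same order of checks as the patched branch: a destination port wins,
 * then multicast, and only then the named-message path.
 */
enum demo_route demo_route_msg(const struct demo_msg *m)
{
        assert(m != NULL);
        if (!m->is_data)
                return ROUTE_OTHER;
        if (m->destport)
                return ROUTE_PORT;
        if (m->is_mcast)
                return ROUTE_MCAST;
        return ROUTE_NAMED;
}
```

With this ordering, a reassembled multicast message that has no
destination port ends up in the multicast handler instead of falling
through to the named-message path.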

With this change, the messages generally arrive on my other nodes.  The
bad part is that if I send 100 or so messages quickly, the machine panics
with a NULL pointer dereference (see attached trace).

Unable to handle kernel NULL pointer dereference at virtual address 00000050
 printing eip:
f8e29009
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: tipc
CPU:    0
EIP:    0060:[<f8e29009>]    Not tainted
EFLAGS: 00010206   (2.6.7-rc2)
EIP is at link_recv_fragment+0xe9/0x760 [tipc]
eax: 00000044   ebx: 00000001   ecx: 00000000   edx: 5940057c
esi: d9736c7e   edi: d9736c7e   ebp: c050be30   esp: c050bdc8
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c050a000 task=c046a1c0)
Stack: d9f026bc dd2f1506 00000000 c050be04 00000000 c050be04 d958c01c 00000000
       00000044 000088ca dd2f04e8 f7bc28cc dcfbc804 dd2f0530 00000198 d9f02430
       d9736c7e 00000000 f2ab8e70 dd2f04e4 00000206 00000246 f5f481b0 00000001
Call Trace:
 [<c010614f>] show_stack+0x7f/0xa0
 [<c01062fe>] show_registers+0x15e/0x1c0
 [<c01064aa>] die+0x9a/0x160
 [<c0118946>] do_page_fault+0x2e6/0x5b9
 [<c0105dcd>] error_code+0x2d/0x38
 [<f8e2642e>] tipc_recv_msg+0x57e/0x8d0 [tipc]
 [<f8e45402>] recv_msg+0x42/0x70 [tipc]
 [<c037dca2>] netif_receive_skb+0x172/0x1b0
 [<c037dd64>] process_backlog+0x84/0x120
 [<c037de80>] net_rx_action+0x80/0x120
 [<c0125ff8>] __do_softirq+0xb8/0xc0
 [<c0126035>] do_softirq+0x35/0x40
 [<c0107d45>] do_IRQ+0x175/0x230
 [<c0105cd0>] common_interrupt+0x18/0x20
 [<c0103106>] cpu_idle+0x46/0x50
 [<c050c984>] start_kernel+0x184/0x1d0
 [<c01001e0>] 0xc01001e0
                         
Code: 8b 58 0c 89 d7 8b 4e 10 0f c9 8b 43 08 c1 e9 10 81 e2 ff ff
 <0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing

One other thing:  buf_safe_discard exits its while loop on the first
busy buffer.  Is it intended not to go through the whole list?
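
To illustrate what I mean, here is a small standalone sketch of the two
behaviours.  It is not the real buf_safe_discard: struct demo_buf and
both functions are hypothetical, and I am only assuming that the real
list looks roughly like a singly linked list with a "busy" state per
buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer list node; not the real TIPC/sk_buff layout. */
struct demo_buf {
        int busy;               /* still in use, must not be released */
        int released;           /* set once the buffer is released    */
        struct demo_buf *next;
};

/* Stops at the first busy buffer, leaving everything after it alone. */
int demo_discard_until_busy(struct demo_buf *head)
{
        struct demo_buf *b;
        int n = 0;

        for (b = head; b != NULL && !b->busy; b = b->next) {
                b->released = 1;
                n++;
        }
        return n;
}

/* Walks the whole list and skips over busy buffers instead. */
int demo_discard_all_idle(struct demo_buf *head)
{
        struct demo_buf *b;
        int n = 0;

        for (b = head; b != NULL; b = b->next) {
                if (b->busy)
                        continue;
                b->released = 1;
                n++;
        }
        return n;
}

/* Usage example: three buffers, the middle one busy. */
void demo_check(void)
{
        struct demo_buf c = {0, 0, NULL};
        struct demo_buf b = {1, 0, &c};
        struct demo_buf a = {0, 0, &b};

        assert(demo_discard_until_busy(&a) == 1);  /* only a is released */
        assert(c.released == 0);                   /* c is never reached */
        assert(demo_discard_all_idle(&a) == 2);    /* a and c are idle   */
        assert(c.released == 1 && b.released == 0);
}
```

If the first variant is what the real loop does, idle buffers queued
behind a busy one stay on the list until a later call.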

Mark.

-- 
Mark Haverkamp <ma...@os...>




