From: Erik H. <eri...@er...> - 2012-10-30 16:38:22
|
So, after the problems with packet loss due to full receive buffers where out of the way, I believe we have discovered a bug in the TIPC link layer. Using the testcase from: http://thread.gmane.org/gmane.network.tipc.general/3354 and inserting a piece of code dumping the first few octets from the packet payload after finished reassembly at link layer: > diff --git a/net/tipc/link.c b/net/tipc/link.c > index a79c755..917e3a3 100644 > --- a/net/tipc/link.c > +++ b/net/tipc/link.c > @@ -1741,6 +1741,10 @@ deliver: > &buf, &msg); > if (ret == 1) { > l_ptr->stats.recv_fragmented++; > + unsigned int *test = (unsigned int*)msg_data(msg); > + u32 onode = msg_orignode(msg); > + pr_debug("[link] from %d.%d.%d recv id:\t %u\n", > + tipc_zone(onode), tipc_cluster(onode), tipc_node(onode), *test); > goto deliver; > } > if (ret == -1) I found that reassembly sometimes fails (never finishes) due to lost fragments, reproduced on a 12 node cluster. The test server reports after a while: > received msg 1511 but expected 1510 fd 7 > ^C The link level trace confirms that it's lost on the link layer: .... > Oct 30 15:55:42 Megatron user.warn kernel: [ 1968.657400] tipc: [link] from 1.1.5 recv id: 1506 > Oct 30 15:55:42 Megatron user.warn kernel: [ 1968.789924] tipc: [link] from 1.1.5 recv id: 1507 > Oct 30 15:55:42 Megatron user.warn kernel: [ 1968.962455] tipc: [link] from 1.1.5 recv id: 1508 > Oct 30 15:55:43 Megatron user.warn kernel: [ 1969.117469] tipc: [link] from 1.1.5 recv id: 1509 > Oct 30 15:55:45 Megatron user.warn kernel: [ 1971.023229] tipc: [link] from 1.1.5 recv id: 1511 > Oct 30 15:55:45 Megatron user.warn kernel: [ 1971.107994] tipc: [link] from 1.1.5 recv id: 1512 > Oct 30 15:55:45 Megatron user.warn kernel: [ 1971.211754] tipc: [link] from 1.1.5 recv id: 1513 .... Pcap can be found here: https://t.co/PYqNQDor Node 5 reports: Link <1.1.5:eth0-1.1.1:eth0> ACTIVE MTU:1500 Priority:10 Tolerance:1500 ms Window:50 packets RX packets:428 fragments:0/0 bundles:0/0 TX packets:151296 fragments:151293/3219 bundles:0/0 TX profile sample:831 packets average:4859 octets 0-64:0% -256:0% -1024:2% -4096:92% -16384:0% -32768:0% -66000:5% RX states:56134 probes:2959 naks:41862 defs:0 dups:0 TX states:9016 probes:2914 naks:0 acks:3143 dups:156035 Congestion bearer:0 link:3215 Send queue max:96 avg:6 Node 1: Link <1.1.1:eth0-1.1.5:eth0> ACTIVE MTU:1500 Priority:10 Tolerance:1500 ms Window:50 packets RX packets:151296 fragments:151293/3218 bundles:0/0 TX packets:428 fragments:0/0 bundles:0/0 TX profile sample:83 packets average:82 octets 0-64:96% -256:0% -1024:0% -4096:4% -16384:0% -32768:0% -66000:0% RX states:8270 probes:2893 naks:0 defs:290484 dups:90125 TX states:56106 probes:2945 naks:41862 acks:25868 dups:0 Congestion bearer:0 link:0 Send queue max:2 avg:0 Node 1 reports to have reassembled 1 less buffer from node5 than it has transmitted (3219 vs 3218), but the amount of transmitted/received fragments match. Pcap shows that from fragment message buffer id 3194, fragment number 7 is never sent out by node 5.. Probably, the reassembly buffer is still held by node1, waiting for fragment #7 to arrive, which it never does.. I thought i'd give you a heads up on this, i'll continue troubleshooting tomorrow. //E |