
TIPC: Cluster Communication Protocol

Tracker: Bugs

3 Link misorders messages with multi-threaded drivers - ID: 2844381
Last Update: Tracker Item Submitted ( ajstephens )

A TIPC 1.7 link endpoint does not always process incoming messages in
sequential order if its bearer utilizes a driver with multiple receiving
threads. This can be illustrated by the following sequence of events on a
multi-core node with CPUs A and B:

cpu A
1) invokes tipc_recv_msg() with packet N
2) performs read_lock_bh(&tipc_net_lock) [this ensures TIPC's view of the
network doesn't change in mid-packet]
3) performs tipc_node_lock() on the receiving link endpoint structure [to
ensure exclusive use]
4) does what processing is needed to ensure the packet is in the proper
order
5) performs tipc_node_unlock() on the receiving link endpoint structure
6) invokes tipc_port_recv_msg() to pass packet N to the receiving port

cpu B
a) invokes tipc_recv_msg() with packet N+1
b) performs read_lock_bh(&tipc_net_lock)
c) performs tipc_node_lock() on the receiving link endpoint structure
d) does what processing is needed to ensure the packet is in the proper
order
e) performs tipc_node_unlock() on the receiving link endpoint structure
f) invokes tipc_port_recv_msg() to pass packet N+1 to the receiving port

It is possible for cpu A to suspend processing temporarily after step 5),
once the node lock has been released, allowing cpu B to perform steps a)
through f) before cpu A does step 6). This will result in packet N+1
reaching the receiving port before packet N.
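
The interleaving above can be replayed deterministically in user space. The
following is only a minimal sketch, not TIPC code: struct link, struct port,
link_accept(), port_deliver() and replay_race() are hypothetical stand-ins
for the link endpoint, the receiving port, steps 3)-5), step 6) and the
schedule described above.

```c
#include <assert.h>
#include <stddef.h>

#define PORT_CAP 8                      /* hypothetical demo capacity */

/* Hypothetical stand-ins; these are not TIPC structures. */
struct link { unsigned next_seq; };     /* next in-order sequence number */
struct port { unsigned delivered[PORT_CAP]; size_t count; };

/* Steps 3)-5) / c)-e): done while the node lock is held.
 * Returns 1 if the packet is the next one in sequence. */
static int link_accept(struct link *l, unsigned seq)
{
    if (seq != l->next_seq)
        return 0;
    l->next_seq++;
    return 1;
}

/* Step 6) / f): done after the node lock has been released. */
static void port_deliver(struct port *p, unsigned seq)
{
    assert(p->count < PORT_CAP);
    p->delivered[p->count++] = seq;
}

/* Replays the problematic schedule for packets n and n+1. */
static void replay_race(struct port *p, unsigned n)
{
    struct link l = { n };

    int a_ok = link_accept(&l, n);      /* cpu A: steps 1) to 5) */
    /* cpu A is suspended here, holding no node lock */
    int b_ok = link_accept(&l, n + 1);  /* cpu B: steps a) to e) */
    if (b_ok)
        port_deliver(p, n + 1);         /* cpu B: step f) */
    if (a_ok)
        port_deliver(p, n);             /* cpu A: step 6), too late */
}
```

Both packets pass the link's in-order check, yet the port records N+1 ahead
of N, because the check and the delivery are not covered by the same
exclusive section.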

An interim workaround for this problem is to modify tipc_recv_msg() to do a
write lock on tipc_net_lock, rather than a read lock. This will prevent the
scenario described above from occurring, and hopefully will allow the link
to operate successfully.

Note: The suggested workaround is probably not an acceptable long-term
solution for having TIPC function on systems with multi-threaded drivers.
Firstly, it forces TIPC to single-thread all incoming TIPC packets
(regardless of their source) which will result in a performance hit which
may or may not be significant. Secondly, the workaround does nothing to
address the issue of TIPC unnecessarily requesting the retransmission of
packets that arrive out of order, which will result in another performance
hit.


Allan Stephens ( ajstephens ) - 2009-08-25 17:20

3

Open

None

Nobody/Anonymous

Bug

None

Public


Comments





No follow-up comments have been posted.
