I am seeing unexpected behaviour with the Linux bonding driver (LACP mode). There are two channels/links (link1 and link2) in an LACP bond, and both links have collecting and distributing enabled. Now link1 receives a LACPDU with collecting disabled from the remote node.
At this point the expectation (as per the LACP standard) is that the bonding driver should (a) immediately send a LACPDU to the remote node (over link1) with distributing disabled, and (b) shift the traffic/packets to link2.
But what I see is that the Linux bonding driver ignores this LACPDU from the remote node on link1 and does not disable distributing. It also does not shift the traffic to link2 and continues to send traffic over link1 to the remote host. In my opinion this does not conform to the LACP standard and seems to be a bug.
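To make the expectation above concrete, here is a minimal Python sketch of the reaction being described. It is only an illustration under stated assumptions, not the bonding driver's code: the function names `on_lacpdu` and `tx_links` and the example state values are hypothetical, and only the bit positions of the LACPDU state octet are taken from the standard.

```python
# Bit positions in the LACPDU actor/partner state octet.
ACTIVITY = 0x01
TIMEOUT = 0x02
AGGREGATION = 0x04
SYNCHRONIZATION = 0x08
COLLECTING = 0x10
DISTRIBUTING = 0x20


def on_lacpdu(actor_state, partner_state):
    """Hypothetical handler: return the actor state after a received LACPDU.

    If the partner reports that it is no longer collecting, stop
    distributing on this link (the behaviour expected in this report).
    """
    if not partner_state & COLLECTING:
        return actor_state & ~DISTRIBUTING & 0xFF
    return actor_state


def tx_links(links):
    """Hypothetical helper: names of links still allowed to carry traffic."""
    return [name for name, state in links.items() if state & DISTRIBUTING]


# Both links start with activity, aggregation, sync, collecting, distributing.
links = {"link1": 0x3D, "link2": 0x3D}

# Assumed example: the remote node clears only the collecting bit on link1.
partner_state_link1 = 0x3D & ~COLLECTING

links["link1"] = on_lacpdu(links["link1"], partner_state_link1)
eligible = tx_links(links)
```

With these assumed values, link1 ends up with distributing cleared and only link2 remains eligible for transmit, which is the outcome expected in (a) and (b).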
Sandesh,
What kernel did you test with, and how did you induce the problem?
Hi Jay,
Thanks for responding! Please find the answers to your queries below.
The testing was done on the 3.10 version of the kernel (3.10.0-229.el7.x86_64). It is a Red Hat Enterprise Linux Server 7.1 (Maipo).
We had a blade switch connected to the server. The blade switch is running a proprietary version of the LACP protocol. We sent LACPDUs from the LA module on the blade switch with the collecting bit disabled.
We are primarily doing this "experiment" to bring down traffic gracefully and avoid loss of "in-flight" packets.
An additional point to note is that (a) the LA module implementation running on the Cisco switch and (b) the LA module implementation on the above-mentioned blade switch react differently from the Linux bonding driver upon receiving a LACPDU with the collecting bit disabled. They immediately send a LACPDU to the peer with the distributing bit disabled and switch traffic to the other link of the LAG. This behaviour conforms to the LACP standard.
Please let me know if you need any additional information.
Regards
Sandesh M
Please let me know if there are any updates for this issue.