LACP doesn't work when one port out of team

Help
Anonymous
2011-04-08
2013-06-06

  • Anonymous
    2011-04-08

    Redhat 5.6
    Ethernet Channel Bonding Driver: v3.4.0-1 (October 7, 2008)

    We have 2xCisco 3750 switches connected together to form a single switch. Pairs of ports (one port on each physical switch) are configured as passive LACP on the switches.
    The machines are using 2xIntel e1000e controllers.
    Previously the boxes ran FreeBSD 7 and, when configured for LACP, worked without issue. Pulling ports at random and even connecting them incorrectly caused no issues as long as at least one port was correctly connected.
    During the install of RH 5.6 we had to reconnect port 0 to a 'normal' non-lacp configured switch port to get a network without setting up bonding during the install.
    Once up I configured LACP bonding over both ports, even though one port was left on a non-LACP configured switch port.
    On FreeBSD this functioned as expected. LACP negotiated that only one port was in the LACP bond and ignored the other.
    However, with RH a ping shows that the network appears and disappears at regular intervals. This is only fixed by an ifconfig down on the port that is connected to a non-LACP port.
    Why does this occur? I would've expected the same behaviour as FreeBSD. Connecting one port incorrectly should not break the network. After all, bonding is done for reliability. Surely the most reliable and logical outcome is to continue to function on the one good port?
    Any ideas why this occurs? Let me know if you need any more information.
    Thanks.

     
  • Jay Vosburgh
    2011-04-08

    This precise scenario is a new one to me, but I suspect it may have the same root cause as a similar problem that I've been working on fixing.  The bug is a problem in the Linux 802.3ad implementation that sometimes manifests when multiple aggregators are connected simultaneously.

    Can you reverse the order that the slaves are added to bonding?  Right now, my guess is that your non-LACP switch port is added to the bond first, and then the other bonding slave becomes the active aggregator.

     

  • Anonymous
    2011-04-13

    Swapping the interface order in the ifenslave (putting the LACP configured switch port 1st) got a working setup.
    Is this bug fixed in a later version of the bonding code than v3.4.0-1 ?

    BTW is the latest bonding code really over 2yrs old?

     
  • Jay Vosburgh
    2011-04-13

    The bug is not fixed anywhere, and exists in all versions of bonding.  I've been working on a fix over the last couple of weeks, but it's one of those things that's simple in theory but not so simple to implement.  The problem, basically, is that bonding uses the same MAC address for all aggregators; trouble happens if the slave that owns that MAC address ends up in an inactive aggregator.  By switching the order of enslavement, you get the bond's MAC address from a slave that ends up in the active aggregator, so things work ok.
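To make the workaround concrete, here is a minimal Python sketch of the rule described above: the bond takes its MAC address from the first slave enslaved, and trouble arises when that slave is not in the active aggregator. The function names and the interface names `eth0`/`eth1` are hypothetical and chosen only for illustration; this is not the bonding driver's code.

```python
def bond_mac(enslave_order, permanent_macs):
    """Return the MAC the bond would use: that of the first slave enslaved."""
    return permanent_macs[enslave_order[0]]


def is_affected(enslave_order, active_aggregator_slaves):
    """True if the slave that supplies the bond's MAC is outside the active aggregator."""
    return enslave_order[0] not in active_aggregator_slaves


# Hypothetical setup: eth0 is on the non-LACP switch port, eth1 on the LACP port,
# so only eth1 ends up in the active aggregator.
permanent_macs = {"eth0": "02:00:00:00:00:0a", "eth1": "02:00:00:00:00:0b"}
active = {"eth1"}

original_order = ["eth0", "eth1"]   # non-LACP port enslaved first
swapped_order = ["eth1", "eth0"]    # LACP port enslaved first (the workaround)

print(bond_mac(original_order, permanent_macs), is_affected(original_order, active))
print(bond_mac(swapped_order, permanent_macs), is_affected(swapped_order, active))
```

With the swapped order the bond's MAC comes from the slave in the active aggregator, which matches the working setup reported above.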

    And, no, the most recent bonding code isn't two years old (the current version has had changes just a few days ago).  The base version that Red Hat used is 3.4.0, from 2008.  They've added various patches to it (hence the "-1" in the version), but didn't update the date.  Even on the current version the date is in 2010; having a date really isn't all that useful, but we're kind of stuck with it.  The version numbers are a little bit useful; on a distro kernel, they at least specify which base version of bonding was used, but it's usually necessary to go look at the source code anyway, because the version skews between distros (so a "3.4.0-1" on distro A isn't the same as "3.4.0-1" on distro B), and sometimes patches are added without the version number being changed or patches are taken piecemeal and so on.

     

  • Anonymous
    2011-04-15

    Thanks for the detailed reply.
    I wondered if looking at how the *BSDs handle this might help? Please don't think I'm trying to teach you to suck eggs.
    FreeBSD 7.3 on the same hardware with the same network config (port 0 on a non-LACP port and port 1 correctly on an LACP port) uses the MAC address of port 0 for both interfaces, but still works correctly:

    # grep ^em /var/run/dmesg.boot
    em0: <Intel(R) PRO/1000 Network Connection 6.9.6> port 0xec00-0xec1f mem 0xdefe0000-0xdeffffff,0xdefc0000-0xdefdffff irq 33 at device 0.0 on pci8
    em0: Using MSI interrupt
    em0:
    em0: Ethernet address: 00:1f:29:61:b1:bc
    em1: <Intel(R) PRO/1000 Network Connection 6.9.6> port 0xe880-0xe89f mem 0xdef80000-0xdef9ffff,0xdef60000-0xdef7ffff irq 33 at device 0.1 on pci8
    em1: Using MSI interrupt
    em1:
    em1: Ethernet address: 00:1f:29:61:b1:bd
    # ifconfig em0
    em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
            ether 00:1f:29:61:b1:bc
            media: Ethernet autoselect (1000baseTX <full-duplex>)
            status: active
            lagg: laggdev lagg0
    # ifconfig em1
    em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
            ether 00:1f:29:61:b1:bc
            media: Ethernet autoselect (1000baseTX <full-duplex>)
            status: active
            lagg: laggdev lagg0
    # ifconfig lagg0
    lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
            ether 00:1f:29:61:b1:bc
            inet 146.x.x.70 netmask 0xffffff80 broadcast 146.x.x.127
            media: Ethernet autoselect
            status: active
            laggproto lacp
            laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: em0 flags=18<COLLECTING,DISTRIBUTING>

    Their latest stable code is here:

    http://svn.freebsd.org/viewvc/base/stable/8/sys/net/if_lagg.c?view=markup&pathrev=216730

    Perhaps the authors of that could shed some light on how they tackled it?
    Thanks again for your help and for all the good work you've done on this. I hope you nail this issue.
    Cheers.

     
  • Jay Vosburgh
    2011-04-15

    The 802.3ad standard (802.1AX nowadays) requires that each aggregator have a unique MAC address.  Linux doesn't do this, so when the slaves are split across two aggregators (e.g., if some are connected to one switch and some to another, where the switches are connected) the same MAC address may be sent as the source MAC on both aggregators.  This confuses the switch.

    Your case is similar; the bond gets its MAC from slave A (in your case, the "non-LACP port" slave).  It assigns that MAC to all slaves.  The active aggregator ends up being on slave B, using slave A's MAC for the aggregator.  LACPDUs are periodically sent on slave A, also using its permanent MAC address (LACPDUs use the interface's permanent MAC, not the MAC used for the aggregator).  When the LACPDU goes out, the switch updates its MAC address table and sends traffic destined for that MAC to slave A, where it is dropped (non-control traffic inbound to inactive slaves is dropped to suppress duplicates).  Once the active aggregator sends something (on slave B), the switch's MAC address table updates again, and voila, everything works again.
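The flapping described above can be sketched with a toy learning switch in Python. This is only an illustration under simplifying assumptions: the switch remembers the last port on which it saw each source MAC, and inbound traffic counts as delivered only if it is forwarded to the active slave's port. The `Switch` class, `simulate` function, port labels, and MAC value are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy learning switch: remembers the last port each source MAC was seen on."""
    mac_table: dict = field(default_factory=dict)

    def receive(self, port, src_mac):
        # Learning step: a frame from src_mac arrived on this port.
        self.mac_table[src_mac] = port

    def port_for(self, dst_mac):
        # Forwarding step: where would a frame for dst_mac be sent?
        return self.mac_table.get(dst_mac)


BOND_MAC = "02:00:00:00:00:0a"  # slave A's permanent MAC, also used by the bond
PORT_A = "port A (non-LACP, inactive slave)"
PORT_B = "port B (LACP, active aggregator)"


def simulate(events):
    """Return, for each 'inbound' event, whether the traffic reaches the active slave."""
    switch = Switch()
    delivered = []
    for event in events:
        if event == "lacpdu_on_A":
            switch.receive(PORT_A, BOND_MAC)
        elif event == "data_from_B":
            switch.receive(PORT_B, BOND_MAC)
        elif event == "inbound":
            delivered.append(switch.port_for(BOND_MAC) == PORT_B)
    return delivered


events = ["data_from_B", "inbound", "lacpdu_on_A", "inbound", "data_from_B", "inbound"]
print(simulate(events))
```

Each LACPDU on slave A moves the table entry to the wrong port until the active aggregator transmits again, which is consistent with the ping appearing and disappearing at regular intervals.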

    You can induce the behavior on a single switch if the ports form into multiple aggregators, either because they're grouped separately on the switch or because they're different speeds.

    FreeBSD might be running both aggregators as active simultaneously, it might accept traffic inbound to the non-active aggregator, it might not send LACPDUs on the non-active aggregator, it might be using a separate MAC address under the covers and not showing it in ifconfig, or it might be something else.

    The solution for Linux is to have each aggregator select a MAC for itself from one of its slaves.

    This is tricky for older network cards, because some drivers do not permit altering the MAC address while the device is up.  The recent (last three or four years, I'd say) drivers can all do this, so it's really an issue only for older devices.
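As a rough sketch of the proposed fix, the following Python snippet gives each aggregator a MAC taken from one of its own slaves, so that no two aggregators share a source MAC. It is not the actual bonding patch: the function name, the data layout, and the choice of "first slave in the aggregator" are assumptions made for illustration, as are the interface names and MAC values.

```python
def pick_aggregator_macs(aggregators):
    """Map each aggregator ID to the permanent MAC of one of its own slaves.

    aggregators: dict of aggregator ID -> list of (slave_name, permanent_mac).
    Assumption: the first slave listed supplies the aggregator's MAC.
    """
    macs = {}
    for agg_id, slaves in aggregators.items():
        if slaves:  # an aggregator with no slaves needs no MAC
            macs[agg_id] = slaves[0][1]
    return macs


# Hypothetical layout matching the thread: one slave per aggregator.
aggregators = {
    1: [("eth0", "02:00:00:00:00:0a")],  # non-LACP port, inactive aggregator
    2: [("eth1", "02:00:00:00:00:0b")],  # LACP port, active aggregator
}

macs = pick_aggregator_macs(aggregators)
print(macs)
```

Because each slave belongs to exactly one aggregator and permanent MACs are distinct, the selected MACs are distinct as well; applying them on a live system is the part that depends on the driver allowing a MAC change while the device is up.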