
(no subject)

Rajani
2013-06-24
2013-07-04
  • Rajani

    Rajani - 2013-06-24

    Hello Vitalii,

    I am facing an issue when changing the switch priority of an MSTP instance from a lower value to a higher value.

    Steps to reproduce the issue:

    1) Connect two switches in a loop, with one switch also having a looped-back link, as below.

           +++++                     +++++          
     _____| sw1 |___________________| sw2 |
    |_____| (f3)|___________________| (f2)|
           +++++                     +++++
    

    2) The following commands were used for configuration on both switches:
    mstpd
    mstpctl addbridge br1
    mstpctl createtree br1 1
    mstpctl setmstconfid br1 0 reg1
    mstpctl setvid2fid br1 100:10
    mstpctl setfid2mstid br1 1:100

    3) With default bridge priority values and the lower MAC address (f2), SW2 is the CIST Root Bridge:
    mstpctl showtree br1 0

    4) Make SW1 the MSTI Regional Root for instance 1 by lowering the priority of
    instance 1 on SW1:
    mstpctl settreeprio br1 1 4
    mstpctl showtree br1 1 --> for MST Instance 1

    5) Set the priority of instance 1 on SW2 to 14:
    mstpctl settreeprio br1 1 14

    6) Change the priority of instance 1 on SW1 to a value greater than its previous
    value but still lower than the instance 1 priority of SW2:
    mstpctl settreeprio br1 1 9

    7) After step 6, SW1 should remain the Regional Root of MSTI 1, now with
    bridge priority 9 instead of 4. Instead, we observe SW1 as the root of MSTI 1
    with 4 as the bridge priority value, which is wrong (refer to the
    attachment).
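
The expectation in step 7 follows from how bridge identifiers are ordered: the numerically lower priority wins, and the MAC address only breaks ties. Below is a minimal sketch of that ordering in C. The struct and function names are made up for illustration and are not mstpd's own, and the `prio` field is assumed to hold the small value passed to `settreeprio` (4, 9, 14).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified bridge identifier; not mstpd's real layout. */
struct bridge_id {
    uint8_t prio;    /* value given to "settreeprio" (assumption) */
    uint8_t mac[6];  /* bridge MAC address */
};

/* Returns <0 if a is better (lower) than b, >0 if worse, 0 if equal. */
int bridge_id_cmp(const struct bridge_id *a, const struct bridge_id *b)
{
    if (a->prio != b->prio)
        return (int)a->prio - (int)b->prio;
    /* Equal priority: the lower MAC address wins. */
    return memcmp(a->mac, b->mac, sizeof a->mac);
}

/* Index of the best identifier in ids[0..n-1]; n must be > 0. */
size_t elect_root(const struct bridge_id *ids, size_t n)
{
    size_t best = 0;
    assert(n > 0);
    for (size_t i = 1; i < n; i++)
        if (bridge_id_cmp(&ids[i], &ids[best]) < 0)
            best = i;
    return best;
}
```

With SW1 at 9 and SW2 at 14, `elect_root` picks SW1, which matches the expected result. A leftover identifier with priority 4 and SW1's MAC still compares as better than SW1's current one, which is why stale information can win if it is never discarded.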

     
  • Rajani

    Rajani - 2013-06-24

    attachment for the above issue.

     
    • Vitalii Demianets

      Yep, there is an issue here, I can reproduce it. The topology doesn't converge for some reason; you can see that the designated internal path cost is constantly increasing (by 400000 every 2 seconds).
      Thanks for reporting, I'm investigating it.

      Vitalii

       
    • Vitalii Demianets

      Hello Rajani!
      I think you have encountered the classic "count-to-infinity" problem (see, for example, this excellent explanation), where the old info ("Regional Root is SW1 with priority 4") circulates in the loop indefinitely. It should be aged out by means of the remainingHops counter, but it isn't. I think this is a bug in the standard.
      Please try the following patch and tell me if it solves the issue:

      Index: mstp.c
      ===================================================================
      --- mstp.c  (revision 49)
      +++ mstp.c  (working copy)
      @@ -2430,8 +2430,32 @@
           unsigned int Max_Age = cist->portTimes.Max_Age;
           unsigned int Hello_Time = cist->portTimes.Hello_Time;
      
      +    /* NOTE: 802.1Q-2005(-2011) says that we should use
      +     *  "remainingHops ... from the CIST’s portTimes parameter"
      +     *  As for me this is clear oversight in the standard,
      +     *  the remainingHops should be taken form the port's own portTimes,
      +     *  not from CIST's. After all, if we don't use port's own
      +     *  remainingHops here, they aren't used anywhere at all.
      +     *  Besides, there is a scenario which breaks if we use CIST's
      +     *  remainingHops here:
      +     *   1) Connect two switches (SW1,SW2) with two ports, thus forming a loop
      +     *   2) Configure them to be in the same region, with two trees:
      +     *      0 (CIST) and 1.
      +     *   3) at SW1# mstpctl settreeprio br0 1 4
      +     *      SW1 becomes regional root in tree 1
      +     *   4) at SW2# mstpctl settreeprio br0 1 14
      +     *   5) at SW1# mstpctl settreeprio br0 1 9
      +     *
      +     *  And now we have the classic "count-to-infinity" problem when the old
      +     *  info ("Regional Root is SW1 with priority 4") circulates in the loop,
      +     *  because it is better than current info ("Regional Root is SW1 with
      +     *  priority 9"). The only way to get rid of that old info is
      +     *  to age it out by the means of remainingHops counter.
      +     *  In this situation we certainly must use counter from tree 1,
      +     *  not CIST's.
      +     */
           if((!prt->rcvdInternal && ((Message_Age + 1) <= Max_Age))
      -       || (prt->rcvdInternal && (cist->portTimes.remainingHops > 1))
      +       || (prt->rcvdInternal && (ptp->portTimes.remainingHops > 1))
             )
               ptp->rcvdInfoWhile = 3 * Hello_Time;
           else
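
To make the one-line change in the hunk above easier to see, here is a small self-contained sketch of the condition before and after the patch. It is only an illustration: the `rcvd_times` struct and both function names are hypothetical, and only the two counters named in the comments come from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the values the patched condition reads. */
struct rcvd_times {
    unsigned message_age;
    unsigned max_age;
    unsigned cist_remaining_hops;  /* cist->portTimes.remainingHops */
    unsigned tree_remaining_hops;  /* ptp->portTimes.remainingHops  */
};

/* Condition as written before the patch: internal info is judged by
 * the CIST's hop counter, whatever tree the info belongs to. */
bool info_kept_before_patch(bool rcvd_internal, const struct rcvd_times *t)
{
    assert(t);
    if (!rcvd_internal)
        return t->message_age + 1 <= t->max_age;
    return t->cist_remaining_hops > 1;
}

/* Condition after the patch: internal info is judged by the hop
 * counter of the tree it was received for. */
bool info_kept_after_patch(bool rcvd_internal, const struct rcvd_times *t)
{
    assert(t);
    if (!rcvd_internal)
        return t->message_age + 1 <= t->max_age;
    return t->tree_remaining_hops > 1;
}
```

For info received from outside the region both versions behave the same. They differ only when the tree's own counter has run down to 1 while the CIST's has not: the old condition keeps the info alive, the new one lets it expire.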
      

      Vitalii

       

      Last edit: Vitalii Demianets 2013-06-26
  • Rajani

    Rajani - 2013-06-27

    Still facing the same issue with the attached patch.
    We also see one more issue: if we connect a backup loop on SW1, the port states are inconsistent.

    I observed that, when calculating the port priority vector and message priority vector on SW1, it treats 4 as better than 9. So it copies the value 4 to the designated priority vector and sends it as the bridge priority in the BPDU.

    Rajani

     
    • Vitalii Demianets

      Still facing the same issue with the attached patch.

      Hmm, strange, I was positively sure that I had found the source of the problem.

      We also see one more issue: if we connect a backup loop on SW1, the port states are inconsistent.

      That's because of the old info ("SW1 has prio 4") circulating in the loops.

      I observed that, when calculating the port priority vector and message priority vector on SW1, it treats 4 as better than 9. So it copies the value 4 to the designated priority vector and sends it as the bridge priority in the BPDU.

      Exactly! That's how the old outdated information ("SW1 has prio 4") keeps living in the network. And the only reliable method to get rid of it is to age it out by decrementing remainingHops counter.
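
The aging argument can be illustrated with a toy model. This is a sketch under simplifying assumptions, not mstpd code: the stale info is relayed around the loop, its own hop counter drops by one per relay, and the CIST counter stays constant because the CIST info itself is stable. The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how many times stale MSTI info is accepted before it is
 * dropped. Returns max_relays if it is still alive after that many
 * relays, i.e. it would keep circulating. */
unsigned relays_until_aged_out(unsigned start_hops, bool check_own_counter,
                               unsigned max_relays)
{
    unsigned tree_hops = start_hops;        /* travels with the stale info */
    const unsigned cist_hops = start_hops;  /* assumed to stay constant */
    unsigned accepted = 0;

    assert(max_relays > 0);
    while (accepted < max_relays) {
        unsigned checked = check_own_counter ? tree_hops : cist_hops;
        if (checked <= 1)
            break;                          /* info is aged out here */
        accepted++;
        if (tree_hops > 0)
            tree_hops--;                    /* one hop used up per relay */
    }
    return accepted;
}
```

In this model, starting from 20 hops (assumed here as the initial counter value), checking the info's own counter ends the circulation after 19 relays, while checking the constant CIST counter never ends it within any `max_relays` limit.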

      Are you sure you have applied "count-to-infinity.patch" to both switches, SW1 and SW2?

      Also, please try the following logging and tell me what you get in logs:

      Index: mstp.c
      ===================================================================
      --- mstp.c  (revision 53)
      +++ mstp.c  (working copy)
      @@ -1919,7 +1919,21 @@
                         )
                     )
                 )
      +        {
      +            if((0 != ptp->MSTID) && msg_Better_port
      +               && (0 == memcmp(mPri->RRootID.s.mac_address,
      +                           ptp->tree->BridgePriority.RRootID.s.mac_address,
      +                           ETH_ALEN)
      +                  )
      +              )
      +                INFO_MSTINAME(prt->bridge, prt, ptp,
      +                              "Received Superior Designated Info (pri = %u) "
      +                              "with the same MAC, remaining hops = %hhu\n",
      +                              (unsigned)GET_PRIORITY_FROM_IDENTIFIER(mPri->RRootID) >> 4,
      +                              mTimes->remainingHops
      +                             );
                   return SuperiorDesignatedInfo;
      +        }
      
               /* a).2) */
               /* We already know that msgPriority _IS_NOT_BETTER_than portPriority.
      

      Vitalii

       
  • Rajani

    Rajani - 2013-07-02

    Yes, I applied the patch on both switches, SW1 and SW2, and tested.
    log.patch was also applied and tested along with count_to_infinity.patch.
    I observed the remaining hops counter being decremented; the captured logs are below:

     
    • Vitalii Demianets

      Hello Rajani!
      Thanks for the testing!

      Here are some issues in your logs:

      1) Please make sure that you are using the latest revision. For example, I've noticed that you do not have this patch applied (you have a lot of "Attempt to send from port with Disabled role" messages);
      2) Please set debuglevel to 3 ("mstpctl debuglevel 3") before starting the experiments; we'll get more debug info in the logs.
      3) Please send me updated logs (with the latest revision [r54] and the two patches above applied, "count-to-infinity.patch" and "log.patch")
      4) Also, I've noticed that you have a lot of up/down transitions on ports. Is it intentional? Did you really disconnect and reconnect the physical ports? If you didn't unplug the cables, the ports shouldn't go up and down on their own. Maybe that's the root issue?

      Vitalii

       

      Related

      Commit: [r49]
      Commit: [r54]

  • Rajani

    Rajani - 2013-07-03

    Hello Vitalii,

    Thanks for the count-to-infinity patch.
    With the latest mstpd changes, the patch works fine.
    We are continuing with whole system testing.

    Thanks again,
    Rajani

     
    • Vitalii Demianets

      Applied as revision [r55]. Thanks for all the testing and finding the bug!

      Vitalii

      PS: Can you give me your full name and (if you want) email address so that I could mention them properly in "Tested-by" and other tags?

       

      Related

      Commit: [r55]

      • Rajani

        Rajani - 2013-07-04

        It is Rajani Ankaiah.
        My Email ID is rajania@tataelxsi.co.in

         
