2) The following commands were used for configuration on both switches:
mstpd
mstpctl addbridge br1
mstpctl createtree br1 1
mstpctl setmstconfid br1 0 reg1
mstpctl setvid2fid br1 100:10
mstpctl setfid2mstid br1 1:100
3) With default Bridge Priority values and the lower MAC address (f2), SW2 is the CIST Root Bridge:
mstpctl showtree br1 0
4) Then make SW1 the MSTI Root for Instance 1 by reducing the priority of
Instance 1 on SW1:
mstpctl settreeprio br1 1 4
mstpctl showtree br1 1 --> for MST Instance 1
5) Change the switch priority of Instance 1 on SW2 to 14:
mstpctl settreeprio br1 1 14
6) Change the bridge priority of Instance 1 on SW1 to a value greater than its previous
value but lower than the Instance 1 priority of SW2:
mstpctl settreeprio br1 1 9
7) After step 6, SW1 should remain the Regional Root of MSTI Instance 1, with
bridge priority 9 instead of 4. However, we observe SW1 as the root of MSTI
Instance 1 with 4 as the Bridge Priority value, which is wrong (refer to the
attachment).
Yep, there is an issue here, I can reproduce it. The topology doesn't converge for some reason; you can see that the designated internal path cost is constantly increasing (by 400000 every 2 seconds).
Thanks for reporting, I'm investigating it.
Vitalii
Hello Rajani!
I think you have encountered the classic "count-to-infinity" problem (see, for example, this excellent explanation), in which the old info ("Regional Root is SW1 with priority 4") circulates in the loop indefinitely. It should be aged out by means of the remainingHops counter, but it isn't. I think this is a bug in the standard.
Please try the following patch and tell me if it solves the issue:
Index: mstp.c
===================================================================
--- mstp.c (revision 49)
+++ mstp.c (working copy)
@@ -2430,8 +2430,32 @@
     unsigned int Max_Age = cist->portTimes.Max_Age;
     unsigned int Hello_Time = cist->portTimes.Hello_Time;
 
+    /* NOTE: 802.1Q-2005(-2011) says that we should use
+     * "remainingHops ... from the CIST’s portTimes parameter"
+     * As for me this is clear oversight in the standard,
+     * the remainingHops should be taken form the port's own portTimes,
+     * not from CIST's. After all, if we don't use port's own
+     * remainingHops here, they aren't used anywhere at all.
+     * Besides, there is a scenario which breaks if we use CIST's
+     * remainingHops here:
+     * 1) Connect two switches (SW1,SW2) with two ports, thus forming a loop
+     * 2) Configure them to be in the same region, with two trees:
+     *    0 (CIST) and 1.
+     * 3) at SW1# mstpctl settreeprio br0 1 4
+     *    SW1 becomes regional root in tree 1
+     * 4) at SW2# mstpctl settreeprio br0 1 14
+     * 5) at SW1# mstpctl settreeprio br0 1 9
+     *
+     * And now we have the classic "count-to-infinity" problem when the old
+     * info ("Regional Root is SW1 with priority 4") circulates in the loop,
+     * because it is better than current info ("Regional Root is SW1 with
+     * priority 9"). The only way to get rid of that old info is
+     * to age it out by the means of remainingHops counter.
+     * In this situation we certainly must use counter from tree 1,
+     * not CIST's.
+     */
     if((!prt->rcvdInternal && ((Message_Age + 1) <= Max_Age))
-       || (prt->rcvdInternal && (cist->portTimes.remainingHops > 1))
+       || (prt->rcvdInternal && (ptp->portTimes.remainingHops > 1))
       )
         ptp->rcvdInfoWhile = 3 * Hello_Time;
     else
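To illustrate the aging argument behind the patch, here is a minimal C sketch of stale MSTI info being relayed around a loop. It is a simplified model, not mstpd code: the names `MAX_HOPS`, `info_stays_alive` and `relays_until_aged_out` are hypothetical, the MaxHops value of 20 is an assumption, and the CIST counter is modelled as a constant because the CIST topology is stable in this scenario.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; none of these names come from mstpd. */
enum { MAX_HOPS = 20 };  /* assumed MaxHops value */

/* Mirrors the shape of the patched check: internal info is refreshed
 * only while the hop counter that is consulted is greater than 1. */
static bool info_stays_alive(unsigned remaining_hops)
{
    return remaining_hops > 1;
}

/*
 * Relay a stale MSTI priority vector around the loop at most max_relays
 * times. Returns the relay count at which the info is aged out, or -1
 * if it is still alive after max_relays relays.
 *
 * use_cist_counter == true  models the pre-patch check (CIST's counter),
 * use_cist_counter == false models the patched check (the tree's own).
 */
static int relays_until_aged_out(bool use_cist_counter, int max_relays)
{
    unsigned msti_hops = MAX_HOPS;            /* travels with the stale info */
    const unsigned cist_hops = MAX_HOPS - 1;  /* stable CIST: never decays   */

    for (int relay = 1; relay <= max_relays; relay++) {
        if (msti_hops > 0)
            msti_hops--;  /* each relaying bridge decrements the counter */

        unsigned checked = use_cist_counter ? cist_hops : msti_hops;
        if (!info_stays_alive(checked))
            return relay;
    }
    return -1;
}
```

Under these assumptions, `relays_until_aged_out(false, 1000)` stops after a bounded number of relays, while `relays_until_aged_out(true, 1000)` returns -1, which matches the "circulates indefinitely" symptom described above.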
I am still facing the same issue with the attached patch.
We also see one more issue: if we connect a back-up loop for SW1, the port states are inconsistent.
I observed that, when calculating the port priority vector and the message priority vector on SW1, it treats 4 as better than 9. It therefore copies the value 4 to the designated priority vector and sends it as the bridge priority in BPDUs.
Rajani
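The observation that 4 is treated as better than 9 follows from how bridge identifiers are compared: numerically lower wins, and the priority occupies the most significant bits. A minimal C sketch, assuming the 802.1Q layout (4-bit priority, 12-bit system ID extension, 6-byte MAC) and assuming the value given to `mstpctl settreeprio` maps to that 4-bit field; the type and function names are hypothetical, not mstpd's:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative 8-byte bridge identifier (hypothetical type):
 * byte 0 high nibble = priority, next 12 bits = system ID extension
 * (the MSTID), bytes 2..7 = MAC address. */
typedef struct {
    uint8_t bytes[8];
} bridge_id_t;

static bridge_id_t make_bridge_id(unsigned prio4, unsigned mstid,
                                  const uint8_t mac[6])
{
    bridge_id_t id;
    id.bytes[0] = (uint8_t)(((prio4 & 0x0Fu) << 4) | ((mstid >> 8) & 0x0Fu));
    id.bytes[1] = (uint8_t)(mstid & 0xFFu);
    memcpy(&id.bytes[2], mac, 6);
    return id;
}

/* Extract the 4-bit priority from the identifier. */
static unsigned bridge_id_priority(const bridge_id_t *id)
{
    return (unsigned)(id->bytes[0] >> 4);
}

/* Negative result: a is better (numerically lower) than b.
 * memcmp compares bytes as unsigned values, most significant first. */
static int bridge_id_cmp(const bridge_id_t *a, const bridge_id_t *b)
{
    return memcmp(a->bytes, b->bytes, sizeof a->bytes);
}
```

With the same MAC address, an identifier built with priority 4 compares as better than one built with priority 9, so a stale "priority 4" vector keeps beating the current "priority 9" one until it is aged out.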
> I am still facing the same issue with the attached patch.
Hmm, strange, I was quite sure that I had found the source of the problem.
> We also see one more issue: if we connect a back-up loop for SW1, the port states are inconsistent.
That's because of the old info ("SW1 has prio 4") circulating in the loops.
> I observed that, when calculating the port priority vector and the message priority vector on SW1, it treats 4 as better than 9. It therefore copies the value 4 to the designated priority vector and sends it as the bridge priority in BPDUs.
Exactly! That's how the old, outdated information ("SW1 has prio 4") keeps living in the network. And the only reliable method to get rid of it is to age it out by decrementing the remainingHops counter.
Are you sure you have applied "count-to-infinity.patch" to both switches, SW1 and SW2?
Also, please try the following logging and tell me what you get in the logs:
Index: mstp.c
===================================================================
--- mstp.c (revision 53)
+++ mstp.c (working copy)
@@ -1919,7 +1919,21 @@
                )
            )
       )
+    {
+        if((0 != ptp->MSTID) && msg_Better_port
+           && (0 == memcmp(mPri->RRootID.s.mac_address,
+                           ptp->tree->BridgePriority.RRootID.s.mac_address,
+                           ETH_ALEN)
+              )
+          )
+            INFO_MSTINAME(prt->bridge, prt, ptp,
+                          "Received Superior Designated Info (pri = %u) "
+                          "with the same MAC, remaining hops = %hhu\n",
+                          (unsigned)GET_PRIORITY_FROM_IDENTIFIER(mPri->RRootID) >> 4,
+                          mTimes->remainingHops
+                         );
         return SuperiorDesignatedInfo;
+    }
 
     /* a).2) */
     /* We already know that msgPriority _IS_NOT_BETTER_than portPriority.
Yes, I have applied the patch on both switches, SW1 and SW2, and tested.
log.patch was also applied and tested along with count_to_infinity.patch.
I observed the remaining hops counter being decremented; the captured logs are below:
attachment for the above issue.
Last edit: Vitalii Demianets 2013-06-26
Hello Rajani!
Thanks for the testing!
Here are some issues in your logs:
1) Please make sure that you are using the latest revision. For example, I've noticed that you do not have this patch applied (your logs contain a lot of "Attempt to send from port with Disabled role" messages).
2) Please set the debug level to 3 ("mstpctl debuglevel 3") before starting the experiments, so that we get more debug info in the logs.
3) Please send me updated logs (with the latest revision [r54] and the two patches above, "count-to-infinity.patch" and "log.patch", applied).
4) Also, I've noticed that you have a lot of up/down transitions on ports. Is that intentional? Did you really disconnect and reconnect the physical ports? If you didn't unplug the cables, the ports shouldn't go up and down on their own. Maybe that's the root issue?
Vitalii
Related
Commit: [r49]
Commit: [r54]
screen shot of the issue.
Hello Vitalii,
Thanks for the count-to-infinity patch.
With the latest mstpd changes, the patch works fine.
We are continuing with whole system testing.
Thanks again,
Rajani
Applied as revision [r55]. Thanks for all the testing and finding the bug!
Vitalii
PS: Can you give me your full name and (if you want) email address so that I can mention them properly in "Tested-by" and other tags?
Related
Commit: [r55]
It is Rajani Ankaiah.
My Email ID is rajania@tataelxsi.co.in