RSTP Role Change

Stu_F
Here to help


Hello, we have reported this issue to Meraki Support however we are not getting anywhere fast with them!

 

We are seeing an issue where a port that links to another switch is being disabled. In the event log we can see the port undergo an RSTP role change, going from designated to disabled. This in turn stops all of the downstream switches from communicating.

 

The issue happens a couple of times a week and tends to happen during the night.

 

The port is configured as a trunk port, with Root Guard enabled.

 

Anyone got any ideas what could be causing this?

 

TIA

 

Stuart

Screenshot 2023-09-04 113955.jpg

14 Replies
Brash
Kind of a big deal

It's more than likely the STP event is a symptom of the actual problem which is the port going down.

What is the port connected to? Do you see anything in the logs on that connected device?

Stu_F
Here to help

Hi @Brash, the port is connected to another switch, so when the port disables it affects that switch and then the other 3 below it (they are daisy chained). When I look at the second switch and check the event log, there is nothing relating to the port it is uplinking to.

 

Thanks

 

Stuart

DarrenOC
Kind of a big deal

@Stu_F, whilst you have Root Guard configured on your Core (I assume), what do you have configured on the upstream switches?

 

Also, have you configured your Bridge Priorities correctly with the Core being the lowest priority?

 

Switching > Configure > Switch Settings > STP Configuration
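
To illustrate why the core needs the lowest priority value: STP elects as root the bridge with the numerically lowest bridge ID, comparing priority first and using the MAC address as the tie-breaker. Here is a minimal Python sketch of that comparison. The switch names, priorities and MAC addresses are made up for illustration and are not taken from the network in this thread.

```python
# Hypothetical bridges: (name, bridge priority, MAC address).
# All values are illustrative assumptions, not real configuration.
bridges = [
    ("Cab1-Switch9", 4096, "00:18:0a:00:00:09"),
    ("Cab1-Switch1", 32768, "00:18:0a:00:00:01"),
    ("Cab2-Switch2", 32768, "00:18:0a:00:00:02"),
]


def elect_root(bridges):
    """Return the name of the bridge with the lowest (priority, MAC) pair."""
    def bridge_id(bridge):
        _, priority, mac = bridge
        # Compare priority first, then the MAC address as an integer.
        return (priority, int(mac.replace(":", ""), 16))

    return min(bridges, key=bridge_id)[0]


print(elect_root(bridges))
```

If every switch were left at the same priority, the lowest MAC address would win, which is rarely the switch you actually want as root.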

Darren OConnor | doconnor@resalire.co.uk
https://www.linkedin.com/in/darrenoconnor/

I'm not an employee of Cisco/Meraki. My posts are based on Meraki best practice and what has worked for me in the field.
Stu_F
Here to help

@DarrenOC  - I thought a drawing might help here.  I am thinking from your comments we are maybe missing some settings for the other switches?

Screenshot 2023-09-04 140254.jpg

DarrenOC
Kind of a big deal

Thanks @Stu_F , a drawing always helps.

 

So from your notes, you're stating that Switch 9 is the STP root because you've configured it with the lowest bridge priority? Are all your SVIs (Layer 3 interfaces) configured on this switch as well? If so, I would configure those uplink ports on Switch 9 as Root Guard and all others as Loop Guard. I would consider Switch 9 as your core?

Stu_F
Here to help

Thanks @DarrenOC 

 

Yes our Cisco router is connected to Cab 1 Switch 9.

 

So to confirm you would set the uplink port that the router is connected to as Root Guard and every other uplink port (across all switches) as Loop Guard?

JacekJ
Building a reputation

I would say no; from what I can tell, you misunderstood @DarrenOC.
Below you will find an explanation, but first let's clear something else up.

 

I would start by saying that the log you showed is what you get if you simply unplug a cable: you will see STP role changes on the ports, which is expected.

Maybe there is an issue with the connection itself. I've seen bad SFPs and even fibers misbehave from time to time, and occasionally switches too, which you should bear in mind.

 

Now the Root/Loop Guard logic:

  1. As a rule of thumb, on the root switch you set all ports facing other switches to Root Guard, so it always remains root. If anything downstream tries to take over as root, the port will not allow it and cuts it off.
  2. The port facing the Cisco router depends on the setup: if it is a simple access port with no VLANs or STP involved, set BPDU Guard; if not, Root Guard will be OK.
  3. Loop Guard goes on the other side of the link, on the downstream switches' ports that face towards the root switch. So on one end of each link you set Root Guard and on the other end Loop Guard.
  4. This also applies to daisy-chained switches, so if the cabinet 0 switches are connected to cabinet 2 then:
    on the switch in cabinet 2, set Root Guard on the port connecting to the switch in cabinet 0, and on the other end (from cabinet 0's perspective) set Loop Guard (as per the logic at the end of point 3 above).
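
The rule in points 1, 3 and 4 can be sketched as a short Python walk outwards from the root: for each inter-switch link, the end nearer the root gets Root Guard and the far end gets Loop Guard. This is only an illustration and assumes a loop-free, daisy-chained topology. The switch names are placeholders rather than the real ones from the diagram, and the port towards the router (point 2) is not covered.

```python
from collections import deque

# Hypothetical topology: each pair is one inter-switch link.
# Names are placeholders, not the actual switches in this thread.
links = [
    ("root", "cab2-sw1"),
    ("cab2-sw1", "cab0-sw1"),
    ("cab0-sw1", "cab0-sw2"),
]


def assign_guards(root, links):
    """Walk outwards from the root; return {(switch, neighbour): guard}."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    guards = {}
    seen = {root}
    queue = deque([root])
    while queue:
        upstream = queue.popleft()
        for downstream in neighbours.get(upstream, []):
            if downstream in seen:
                continue  # already reached via another link
            seen.add(downstream)
            # Port on the switch nearer the root, facing away from it.
            guards[(upstream, downstream)] = "Root Guard"
            # Port on the far switch, facing back towards the root.
            guards[(downstream, upstream)] = "Loop Guard"
            queue.append(downstream)
    return guards


for (switch, neighbour), guard in assign_guards("root", links).items():
    print(f"{switch} port to {neighbour}: {guard}")
```

Each link ends up with exactly one Root Guard end and one Loop Guard end, which matches the "one side Root Guard, other side Loop Guard" logic above.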

https://documentation.meraki.com/MS/Port_and_VLAN_Configuration/Configuring_Spanning_Tree_on_Meraki_...

DarrenOC
Kind of a big deal

bang on @JacekJ 👍 - perfect explanation.

Stu_F
Here to help

Thanks @DarrenOC @JacekJ  - apologies for the delay in coming back to you...been on Jury Duty the past 2 days!

Thanks for your explanation....I think this is how we have everything set up APART from the port connecting to the Cisco router.  It is set to Loop Guard so I am thinking this should be changed to Root Guard as we do have VLANs running over the connection.

Here is a pic to show how we have the various ports setup - from what I can see this matches your explanation I think!

Stu_F_0-1694087672469.png

 

If you agree this is set up correctly, then I think the next steps would be to change the port that connects to the Cisco router and then potentially replace the cable between the two switches where the port is disabling. Would you agree?

 

Thanks, Stuart

JacekJ
Building a reputation

If the STP root is meant to be the Cab1/Switch9 (as you stated previously) then this is set correctly.

Sure, you can change the uplink port towards the Cisco router to Root Guard. If the router doesn't send any BPDUs, or its bridge priority is left at a default value well above 4096, you will be fine; it should do no harm.

 

But to be honest, I'm still not convinced this is an STP issue, because in the initial post you wrote that the issue was between switches, not between the switch and the Cisco router.

 

Go one step back and look in the logs at what happened on the other side of that connection at the same time, to get the whole picture. Looking at the log you showed, it wasn't STP (on that side): it says disabled, which means the connection is simply gone (in the case of a loop you would see alternate/blocking):

https://documentation.meraki.com/MS/Monitoring_and_Reporting/MS_Event_Log_Entries_and_Definitions#Po...
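
As a rough way of expressing that triage, here is a small Python sketch that maps the new role in a role-change event to a likely meaning. The role names are plain strings chosen for illustration; this is not the dashboard's log format or any Meraki API. Treating "backup" like "alternate" is an addition of mine, since both are discarding roles in RSTP.

```python
def classify_role_change(old_role, new_role):
    """Rough triage of an RSTP role change, following the reasoning above."""
    new_role = new_role.lower()
    if new_role == "disabled":
        # The link itself went away, so STP is only reporting the symptom.
        return "link down: check cable, SFP, or the far-end port"
    if new_role in ("alternate", "backup"):
        # STP deliberately put the port into a discarding role.
        return "STP is blocking the port: look for a loop or redundant path"
    return "normal STP role: no action implied by this event alone"


print(classify_role_change("designated", "disabled"))
```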

Stu_F
Here to help

Hi @JacekJ 

 

We did check the logs on Switch 2, specifically around the time (about 2am) that the port linking Switch 1 and Switch 2 was disabled on Switch 1, and there were no entries.

Here is the log from Switch 2. The events starting at 6.50am are from when we rebooted the switch to get the port back online.

Stu_F_0-1694098524475.png

Thanks, Stuart

JacekJ
Building a reputation

Right, my bad: when the switch loses connectivity, you will have no logs from it in the dashboard!

Your STP config looks OK, so I would check the cabling; maybe swap the SFPs and patch cords to see if the issue moves along with them.

Brash
Kind of a big deal

In that case, I'd start at layer 1 and look at swapping out the patch cables/SFPs on that link.

Stu_F
Here to help

Thanks - will give that a bash...just seemed strange it would happen in the middle of the night with little to no traffic.
