Hello, we have reported this issue to Meraki Support; however, we are not getting anywhere fast with them!
We are seeing an issue where a port which links to another switch is disabling. In the event log we can see the port undergo an RSTP role change, going from designated to disabled. This in turn stops all of the downstream switches from being able to communicate.
The issue happens a couple of times a week and tends to happen during the night.
The port is configured as a Trunk port, with Root Guard enabled.
Anyone got any ideas what could be causing this?
TIA
Stuart
It's more than likely the STP event is a symptom of the actual problem which is the port going down.
What is the port connected to? Do you see anything in the logs on that connected device?
Hi @Brash, the port is connected to another switch, so when the port disables it affects that switch and then the other 3 below it (they are daisy chained). When I look at the second switch and check the event log, there is nothing relating to the port it is uplinking to.
Thanks
Stuart
@Stu_F , whilst you have Root Guard configured on your Core (I assume), what do you have configured on the upstream switches?
Also, have you configured your Bridge Priorities correctly with the Core being the lowest priority?
Switching > Configure > Switch Settings > STP Configuration
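To illustrate why the core needs the numerically lowest priority, here is a minimal sketch of how the root bridge election works. The switch names, priorities and MAC addresses are made up for illustration and are not taken from this network.

```python
def bridge_id(priority, mac):
    """Return a sortable bridge ID: priority first, MAC address as tie-breaker."""
    return (priority, int(mac.replace(":", ""), 16))


def elect_root(switches):
    """Return the name of the switch with the numerically lowest bridge ID."""
    return min(switches, key=lambda name: bridge_id(*switches[name]))


# Hypothetical values: the core is set to 4096, the rest are left at the 32768 default.
switches = {
    "core": (4096, "00:00:00:00:00:09"),
    "access-1": (32768, "00:00:00:00:00:01"),
    "access-2": (32768, "00:00:00:00:00:02"),
}
print(elect_root(switches))
```

If every switch is left on the default priority, the lowest MAC address wins instead, which is rarely the switch you actually want as root.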
@DarrenOC - I thought a drawing might help here. I am thinking from your comments we are maybe missing some settings for the other switches?
Thanks @Stu_F , a drawing always helps.
So from your notes you're stating that Switch 9 is the STP Root, as you've configured it with the lowest bridge priority? Are all your SVIs (Layer 3 interfaces) configured on this switch as well? If so, I would configure those uplink ports on Switch 9 as Root Guard and all others as Loop Guard. I would consider Switch 9 as your core?
Thanks @DarrenOC
Yes our Cisco router is connected to Cab 1 Switch 9.
So to confirm you would set the uplink port that the router is connected to as Root Guard and every other uplink port (across all switches) as Loop Guard?
I would say no, you misunderstood @DarrenOC from what I can tell.
Below you will find an explanation, but first let's clear something else up.
I would start by saying that the log you showed will occur if you simply unplug any cable: you will see the STP changes on the ports, which is expected.
Maybe there is an issue with the connection. I've seen bad SFPs or even fibers misbehaving from time to time, and sometimes even switches, which you should bear in mind.
Now the Root/Loop Guard logic:
bang on @JacekJ 👍 - perfect explanation.
Thanks @DarrenOC @JacekJ - apologies for the delay in coming back to you...been on Jury Duty the past 2 days!
Thanks for your explanation....I think this is how we have everything set up APART from the port connecting to the Cisco router. It is set to Loop Guard so I am thinking this should be changed to Root Guard as we do have VLANs running over the connection.
Here is a pic to show how we have the various ports setup - from what I can see this matches your explanation I think!
If you agree this is set up correctly then I think the next steps would be to change the port which connects to the Cisco and then potentially change the cable connecting the two switches where the port is disabling??? Would you agree?
Thanks, Stuart
If the STP root is meant to be the Cab1/Switch9 (as you stated previously) then this is set correctly.
Sure, you can change that internet uplink port towards the Cisco router to Root Guard. If the router doesn't send any STP BPDUs, or is sitting on default priority values way higher than 4096, you will be good; this should do no harm.
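As a rough sketch of that Root Guard logic: the port keeps forwarding unless it receives a BPDU that is better than the current root. This is simplified on purpose (it only compares priority and MAC, and ignores path cost and port IDs), and the function name, state labels and addresses are my own for illustration.

```python
def mac_to_int(mac):
    """Convert a colon-separated MAC address into an integer for comparison."""
    return int(mac.replace(":", ""), 16)


def root_guard_state(current_root, received_bpdu=None):
    """Simplified state of a Root Guard port.

    Both arguments are (priority, mac) tuples.
    received_bpdu=None means the neighbour sends no BPDUs at all.
    """
    if received_bpdu is None:
        return "forwarding"
    root_id = (current_root[0], mac_to_int(current_root[1]))
    rx_id = (received_bpdu[0], mac_to_int(received_bpdu[1]))
    # A lower bridge ID than the current root is a superior BPDU.
    return "blocked" if rx_id < root_id else "forwarding"


# Hypothetical values: root at priority 4096, neighbour at the 32768 default or at 0.
root = (4096, "00:00:00:00:00:09")
print(root_guard_state(root))
print(root_guard_state(root, (32768, "00:00:00:00:00:aa")))
print(root_guard_state(root, (0, "00:00:00:00:00:aa")))
```

Only the last case, a neighbour advertising a lower priority than the root, would get the port blocked.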
But to be honest, I'm still not convinced this is an STP issue, because in the initial post you wrote that the issue was between switches, not between the switch and the Cisco router.
Take a step back and check the logs for what happened on the other side of that connection at the same time, to get the whole picture. Looking at the log you showed, it wasn't STP (on that side). It says disabled, which means that the connection is simply gone (in the case of a loop you would see alternate/blocking):
Hi @JacekJ
We did check the logs on Switch 2 for the same time (around 2am) that the port linking Switch 1 and Switch 2 was disabled on Switch 1, and there were no entries.
Here is the log from Switch 2 - the events starting at 6.50am are from when we rebooted the switch to get the port to come back online.
Thanks, Stuart
Right, my bad, when the switch loses connectivity, you will have no logs in the dashboard!
Your STP config looks OK, so I would check the cabling and maybe swap SFPs and patch cords to see if the issue moves along with them.
In that case, I'd start at layer 1 and look at swapping out patch cables/SFPs on that link.
Thanks - will give that a bash...just seemed strange it would happen in the middle of the night with little to no traffic.