High STP Errors and Switch DNS Misconfiguration

GIB_NW
Here to help

Hello All

We are in the process of migrating our site entirely to Meraki switches. We started about a month ago and the main areas are mostly done: about 40 of roughly 60 switches are already deployed.
We are upgrading from old HP/Aruba switches of various models and ages.

 

The roughly 40 deployed switches have been running for about a month, but over the last week and a half we have been seeing "Very high" STP warnings on ports, and then during the day switches start alerting that DNS is misconfigured.
This is when the network crashes hard and we lose connectivity across the whole site, i.e. no internet, no access to servers, etc.

 

We have an MS425-32 set as the core switch with L3 configured. It connects to other MS425s acting as distribution switches, which have MS225s connected to them. We also have some MS125-24/48s and MS120-8FPs connected.

We can see various ports on various switches turn orange and show "Very high" or "High" STP errors.

 

The only way we have found to clear the issue is to reboot the core, the main distribution switch, and a switch stack (which connects to our Cisco blade servers and the storage devices that house all our servers).
Once we do this and clear the STP errors, the network comes back and the switches return to a normal green connection status.
The switches have static IPs on VLAN 99 (our management VLAN), with DNS set to 8.8.8.8 and the IP of our firewall.

 

Has anyone seen this before? And shouldn't the Meraki switches shut down ports instead of taking the network down? It seems they will not recover without a reboot.
FYI, RSTP is enabled globally and some ports have Root Guard enabled depending on the connected endpoint, with a mix of trunk and access ports.

Sorry for the long-winded explanation 🙂

5 REPLIES
Bruce
Kind of a big deal

If you're seeing a number of STP changes then something is occurring in the network that is causing STP to reconverge. If this is happening again and again, then this will cause the switches to be unable to contact their DNS servers (either due to high-CPU, or pure network traffic), so the DNS misconfiguration is likely a secondary symptom.

 

I'd be checking your network for Layer 2 loops and ensuring that STP is appropriately blocking them (ideally you'd remove all the Layer 2 loops and use LACP for redundancy), and finding out where the changes in the network are occurring that are causing STP to reconverge. You mention blade servers, and that would be one of my starting points, especially if they have a switch in the chassis, which will be non-Meraki (I've seen plenty of issues here in the past, generally devices that don't run STP and thus create loops in the network if you're not careful).

 

I'd take the time to review the network against the Meraki MS best practice guide, https://documentation.meraki.com/Architectures_and_Best_Practices/Cisco_Meraki_Best_Practice_Design/.... And I'd pay special attention to the Spanning Tree elements of it:

  • Keep the STP diameter under 7 hops, such that packets should not ever have to travel across more than 7 switches to travel from one point of the network to the other
  • BPDU Guard should be enabled on all end-user/server access ports to avoid rogue switch introduction in network
  • Loop Guard should be enabled on trunk ports that are connecting switches
  • Root Guard should be enabled on ports connecting to switches outside of administrative control
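If it helps, here is a rough Python sketch of how you could check an exported inventory against those points. To be clear, this is my own illustration and not a Meraki tool: the port roles, field names, and sample data are all made up, and you would need to fill them in from your own documentation.

```python
from collections import deque

# Guard expected for each port role, following the best-practice points above.
# The role names are hypothetical labels, not a Meraki API schema.
EXPECTED_GUARD = {
    "access": "bpdu guard",
    "inter-switch trunk": "loop guard",
    "external": "root guard",
}


def audit_guards(ports):
    """Return (switch, port, expected, actual) for each port with the wrong guard."""
    findings = []
    for p in ports:
        expected = EXPECTED_GUARD.get(p["role"])
        if expected is not None and p["guard"] != expected:
            findings.append((p["switch"], p["port"], expected, p["guard"]))
    return findings


def stp_diameter(links):
    """Longest shortest path in the topology, counted in switches traversed."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    longest = 0
    for start in adj:
        dist = {start: 1}  # the starting switch counts as the first switch
        queue = deque([start])
        while queue:  # breadth-first search from this switch
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        longest = max(longest, max(dist.values()))
    return longest


# Made-up sample data for illustration only.
sample_ports = [
    {"switch": "acc-1", "port": 1, "role": "access", "guard": "bpdu guard"},
    {"switch": "acc-1", "port": 2, "role": "access", "guard": "root guard"},
    {"switch": "dist-1", "port": 25, "role": "inter-switch trunk", "guard": "disabled"},
]
sample_links = [
    ("core", "dist-1"), ("core", "dist-2"),
    ("dist-1", "acc-1"), ("dist-2", "acc-2"),
]

for finding in audit_guards(sample_ports):
    print("guard mismatch:", finding)
print("switches on longest path:", stp_diameter(sample_links))
```

The diameter check counts switches on the longest path between any two switches, so a result above 7 would be worth a closer look.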

Tell us how you go.

cmr
Kind of a big deal

@GIB_NW, adding to what @Bruce has said, I'd draw out your connectivity on a piece of paper (you might need a big piece) to make sure you haven't inadvertently connected switches around in a loop.
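If paper gets unwieldy with 60 switches, the same check can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the switch names and link list are made up, and you would type in your real cabling by hand. It treats each physical link as a separate edge, so an LACP bundle should be entered once as a single logical link or it will show up as a loop.

```python
def find_loop_links(links):
    """Return the links that close a loop in an undirected switch topology.

    Uses union-find: a link whose two ends are already connected through
    earlier links must be closing a loop.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    closing = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            closing.append((a, b))
        else:
            parent[root_a] = root_b
    return closing


# Made-up cabling list for illustration only.
sample_links = [
    ("core", "dist-1"),
    ("dist-1", "stack"),
    ("stack", "core"),   # closes a core / dist-1 / stack triangle
    ("dist-1", "acc-1"),
]

print(find_loop_links(sample_links))
```

Any link it reports is one that STP has to block (or that you should remove or bundle), so those are the places to look first.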

GIB_NW
Here to help

Thanks for the suggestions and the link to the best practice

 

We are going through our network and checking ports etc

 

Will keep you posted on the outcome

Update

We did some investigating last night while the network was in the error state.

We have a separate network that uses Cisco switches, and the two networks are joined by a link between an MS225 and a Cisco Catalyst switch.
According to Meraki, if an MS is connected to a Cisco Catalyst switch, the MS cannot be the root switch.
This doesn't work for our setup, as the Catalyst cannot be the root for all of our VLANs and the main network.

It seems the two spanning tree protocols are not compatible, and a lot of STP traffic is coming from that other network.

Why won't Cisco implement compatibility between Cisco and Meraki? They are the same company 😞

Bruce
Kind of a big deal

Why not use MSTP in the Cisco Catalyst environment (instead of PVST/RPVST)? That is compatible with the Meraki switches (which only run STP/RSTP).
