Hello All
We are in the process of migrating our whole site over to Meraki switches. We started about a month ago and our main areas are mostly done already - close to 60 switches in total, with about 40 already deployed.
We are upgrading from old HP/Aruba switches of various models and ages.
The roughly 40 deployed switches have been running for about a month, but in the last week and a half we have been seeing "Very high" STP alerts on ports, and then during the day switches start to alert that DNS is misconfigured.
This is when the network crashes hard and we lose connectivity across our whole site, i.e. no internet, no access to servers, etc.
We have an MS425-32 set as the core switch with L3 configured. This connects to other MS425s acting as distribution switches, which have MS225s connected to them. We also have some MS125-24/48s and MS120-8FPs connected.
We can see various ports on various switches turn orange and show "Very high" or "High" STP errors.
The only way we have found to clear the issue is to reboot the core, the main distribution switch, and also a stacked switch (which connects to our Cisco blade servers and the storage devices that house all our servers).
Once we do this and "clear" the STP errors, the network comes back and the switches return to their normal green status.
We have the switches set to static IPs on VLAN 99 (our management VLAN), with DNS set to 8.8.8.8 and the IP of our firewall.
Has anyone seen this before? And shouldn't the Meraki switches shut down the affected ports instead of taking the whole network down? It's as if the Merakis will not recover by themselves without a reboot.
FYI, RSTP is enabled globally and some ports have Root Guard enabled, depending on the connected endpoint, with a mix of trunk and access ports.
Sorry for the long-winded explanation 🙂
If you're seeing a number of STP changes then something is occurring in the network that is causing STP to reconverge. If this is happening again and again, then this will cause the switches to be unable to contact their DNS servers (either due to high CPU, or pure network traffic), so the DNS misconfiguration is likely a secondary symptom.
I'd be checking your network for Layer 2 loops and ensuring that STP is appropriately blocking them (ideally you'd remove all the Layer 2 loops and use LACP for redundancy), and finding out where the changes in the network are occurring that are causing an STP reconvergence. You mention blade servers, and that would be one of my starting points, especially if they have a switch in the chassis which will be non-Meraki (I've seen plenty of issues here in the past - generally devices that don't run STP and thus create loops in the network if you're not careful).
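As a rough aid for the loop check, here is a minimal Python sketch that takes a hand-written list of switch-to-switch links and reports which links close a loop. The switch names and the topology are made up for illustration, and it assumes each LACP bundle is entered as a single link; it only inspects a diagram you type in, it does not query any switch.

```python
def find_loop_links(links):
    """Return the links that close a loop in an undirected switch topology.

    links: iterable of (switch_a, switch_b) pairs, one per logical link
    (enter an LACP bundle once, not once per member port).
    A link closes a loop when both ends are already connected by earlier links.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    loop_links = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            loop_links.append((a, b))  # redundant path: STP must block somewhere
        else:
            parent[root_a] = root_b
    return loop_links


# Hypothetical topology; names are illustrative only.
links = [
    ("core", "dist1"),
    ("core", "dist2"),
    ("dist1", "access1"),
    ("dist2", "access1"),        # second uplink closes a loop via the core
    ("dist1", "blade-chassis"),
    ("dist2", "blade-chassis"),  # dual-homed chassis closes another loop
]
print(find_loop_links(links))
```

Every link it reports is a place where STP has to hold a port in blocking, so those are the connections worth checking first, particularly where the far end may not run STP at all.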
I'd take the time to review the network against the Meraki MS best practice guide, https://documentation.meraki.com/Architectures_and_Best_Practices/Cisco_Meraki_Best_Practice_Design/.... And I'd pay special attention to its Spanning Tree sections.
Tell us how you go.
Thanks for the suggestions and the link to the best practice
We are going through our network and checking ports etc
Will keep you posted on the outcome
Update
We did some investigations last night while the network was in error
We have a separate network using Cisco switches, and the two networks are joined by a link between an MS225 and a Cisco Catalyst switch.
According to Meraki, if there is a connection to a Cisco Catalyst switch then you cannot have an MS as the root switch.
This doesn't work for our setup, as the Catalyst cannot be the root for all of our VLANs and main network.
It seems the two spanning tree protocols are not compatible, and there is a lot of STP activity coming from that other network.
Why won't Cisco make Catalyst and Meraki compatible? They are the same company 😞
Why not use MSTP in the Cisco Catalyst environment (instead of PVST/RPVST)? That is compatible with the Meraki switches (which only run STP/RSTP).
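For context on the root switch question raised in this thread: in STP and RSTP the root is the bridge with the lowest bridge ID, compared by priority first and MAC address as the tiebreaker. A minimal Python sketch of that comparison follows. The switch names, priorities and MAC addresses are hypothetical, and it models a single spanning tree instance only, not the per-VLAN instances that PVST+ runs.

```python
def bridge_id(priority, mac):
    """Comparable bridge ID: priority first, then MAC address as tiebreaker."""
    return (priority, int(mac.replace(":", ""), 16))


def elect_root(bridges):
    """bridges: dict of name -> (priority, mac). The lowest bridge ID wins."""
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


# Hypothetical values for illustration; 32768 is the usual default priority.
bridges = {
    "ms425-core": (4096, "00:18:0a:00:00:01"),
    "ms225-edge": (32768, "00:18:0a:00:00:02"),
    "catalyst": (32768, "00:1b:54:00:00:01"),
}
print(elect_root(bridges))
```

With these made-up values the core wins because it has the lowest priority. Any bridge that advertises a lower priority, or the same priority with a lower MAC address, would take over as root and trigger a reconvergence, which is why it is worth setting the intended root's priority explicitly.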