Campus network: switches randomly turning orange and going offline, or becoming intermittent
Hi, hope everyone is healthy and working!
My company is currently in the final stages of our Meraki deployment.
We are currently using SFP modules of different brands and flavors between the MS hardware (mostly 10 Gbps; Cisco, HP, and other not-so-hot brands). A lot of our topology is currently daisy-chained. The final stage of this project is to deploy newly installed fiber so that our topology will look like so:
The problem we are experiencing at the moment is that switches randomly go offline, and almost every time this happens we have to physically power cycle them (which makes me very nervous, as this is very expensive equipment). I understand that non-Meraki SFP modules can result in CRC errors or other "issues", per the Meraki engineers I have spoken to. However, I have a feeling it could be related to VLANs or other configuration that we may have missed or performed incorrectly. The following is my Addressing & VLANs configuration:
We are currently using VLAN 1 as the native VLAN on all fiber ports. I understand this is not a best practice, but could it be causing the switches to randomly turn "orange" and go offline, or become intermittent? Also, sometimes a switch may go down while the downstream switches and APs are still "green". Is this a common occurrence?
This is a real-time snapshot of our topology (we are not using L3 interfaces or static routes as of now):
Please let me know if there is any other context, relevant to troubleshooting this, that I might have missed.
I appreciate any feedback. Thank you !!
EDIT: Including the Meraki dashboard "Switch Settings" page displaying the root bridge configuration. Only the MX250 is providing DHCP service; none of the switches are serving DHCP or using static or L3 routes.
I've had issues like this when using non-Meraki switches in the network when loops are present. You may be facing an issue with spanning tree rather than SFPs. On downed links, on the upstream side, see if there are any spanning-tree events.
Some critical tips:
Make sure the root of the network (typically the core switch) is configured as the spanning-tree root.
HP switches don't use "standard" spanning-tree weights. So when you have HP switches in a network alongside any other vendor (such as Meraki) and there is a loop anywhere (even if it is not around the HP switch), spanning tree can calculate an inconsistent state from different places in the network. Consequently, it is CRITICAL to configure any non-Meraki switch to use "mstp". On Cisco Enterprise switches you need this single line: "spanning-tree mode mst".
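To make the point about spanning-tree weights concrete, here is a minimal Python sketch. It assumes some switches on a path use the 16-bit "short" default port costs from IEEE 802.1D-1998 while others use the 32-bit "long" defaults from IEEE 802.1t; which table a particular HP, Cisco, or Meraki model uses depends on the model and STP mode, so check your own hardware. The two paths are hypothetical and not taken from the topology in this thread.

```python
# Default STP port costs keyed by link speed in Mbps.
SHORT_COST = {10: 100, 100: 19, 1000: 4, 10000: 2}  # IEEE 802.1D-1998 (16-bit)
LONG_COST = {10: 2_000_000, 100: 200_000, 1000: 20_000, 10000: 2_000}  # IEEE 802.1t (32-bit)


def root_path_cost(hops):
    """Add up the root path cost along a path.

    Each hop is (link_speed_mbps, cost_table), where cost_table is the
    table used by the switch that adds its root-port cost at that hop.
    """
    return sum(table[speed] for speed, table in hops)


# Hypothetical path A: two 10 Gbps hops, one of them on a long-cost switch.
path_a = [(10000, LONG_COST), (10000, SHORT_COST)]

# Hypothetical path B: three 100 Mbps hops, all on short-cost switches.
path_b = [(100, SHORT_COST), (100, SHORT_COST), (100, SHORT_COST)]

print("path A cost:", root_path_cost(path_a))  # 2002
print("path B cost:", root_path_cost(path_b))  # 57
```

Under these assumptions the much slower path B looks cheaper than the 10 Gbps path A, simply because one switch on path A counts in a different scale. Running every switch in the same mode is one way to keep the cost scale consistent.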
@PhilipDAth Thank you for your answer. The vast majority of the switches are Meraki. There is a stack of three MS350-48s in our MDF, which is configured as the STP root bridge. I have edited the original post to display the configuration.
I believe we have around 4 or 5 HP switches that are configured as bridges; none of these are routing or providing DHCP. However, I will check whether your suggestion for the spanning-tree mode can be applied to the HP hardware in question.
I agree with @PhilipDAth, but I would also try switching all the uplink ports to trunk ports that allow all VLANs and see if that makes a difference. Also, for the switches that keep having issues, have you tried adding a second path? It should "block" initially, but if the main uplink fails (for whatever reason) the port should then switch to forwarding. This should eliminate hardware issues as a cause.
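The block-then-forward behaviour of a second path can be sketched in Python. This is a simplified model, not real STP: it builds a loop-free tree with a breadth-first search and equal link costs, ignoring bridge IDs, port priorities, and timers. The switch names (core, idf1, idf2) are hypothetical.

```python
from collections import deque


def forwarding_links(links, root):
    """Return the links kept in a loop-free tree rooted at `root`.

    Breadth-first search with equal link costs stands in for the real
    spanning-tree election.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    seen = {root}
    tree = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbors.get(node, [])):
            if peer not in seen:
                seen.add(peer)
                tree.add(frozenset((node, peer)))
                queue.append(peer)
    return tree


def blocked_links(links, root):
    """Return the links left out of the tree, i.e. the blocking ones."""
    return {frozenset(link) for link in links} - forwarding_links(links, root)


# Hypothetical triangle: idf2 has a main uplink to core and a second path via idf1.
links = [("core", "idf1"), ("core", "idf2"), ("idf1", "idf2")]
print("blocked with all links up:", blocked_links(links, "core"))

# Simulate the main uplink core-idf2 failing and recompute the tree.
after_failure = [link for link in links if set(link) != {"core", "idf2"}]
print("blocked after failure:", blocked_links(after_failure, "core"))
```

With all links up, the idf1-idf2 link is the one left blocking. Once the core-idf2 uplink is removed, that link joins the tree and idf2 stays reachable through idf1.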