Campus Network, Switches randomly turning orange and going down, or intermittent.

gonzalezgjaime
Conversationalist


Hi, hope everyone is healthy and working!

 

My company is currently in the final stages of our Meraki deployment.

 

We are currently using different brands and types of SFP modules (mostly 10 Gbps; Cisco, HP, and other, less reputable brands) between the MS hardware. A lot of our topology is currently daisy-chained. The final stage of this project is to deploy newly installed fiber so that our topology will look like this:

 

ABC Campus Meraki Network.png

The problem we are experiencing at the moment is that switches randomly go offline, and almost every time this happens we have to physically power cycle them (which makes me very nervous, as this is very expensive equipment). I understand from the Meraki engineers I have spoken to that non-Meraki SFP modules can result in CRC errors or other "issues". However, I have a feeling it could be related to VLAN or other configuration that we may have missed or performed incorrectly. The following is my Addressing & VLANs configuration:

 

Screen Shot 2020-05-09 at 5.19.43 PM.png

We are currently using VLAN 1 as the native VLAN on all fiber ports. I understand this is not a best practice; however, could this be causing the switches to randomly turn "orange" and go offline or become intermittent? Also, sometimes a switch may go down while the downstream switches and APs are still "green". Is this a common occurrence?

 

This is a real-time snapshot of our topology (we are not using L3 interfaces or static routes as of now):

 

Screen Shot 2020-05-09 at 5.23.39 PM.png

 

Please let me know if there is any other context, relevant to troubleshooting this, that I might have missed.

 

I appreciate any feedback. Thank you !!

 

Jaime.

 

EDIT: Including the Meraki dashboard "Switch Settings" page displaying the root bridge configuration. Only the MX250 is providing DHCP service; none of the switches are serving DHCP or using static or L3 routes.

 

gonzalezgjaime_0-1589117822811.png

 

5 REPLIES
IT_Magician
Building a reputation

Go to Network-wide > Clients and sort by usage. Do you see large amounts of UDP traffic that could indicate a broadcast storm somewhere?
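
If you would rather do that check against exported client data than by eye, here is a minimal Python sketch of the idea: rank clients by total usage and flag any that account for an outsized share. The record layout (`description`, `sent_kb`, `recv_kb`), the sample numbers, and the 50% threshold are all made up for illustration; this is not the dashboard's export format or API.

```python
from typing import Dict, List


def top_talkers(clients: List[Dict], share_threshold: float = 0.5) -> List[Dict]:
    """Rank clients by total usage and flag outsized contributors.

    Each client is a dict with hypothetical keys:
    description, sent_kb, recv_kb.
    """
    total = sum(c["sent_kb"] + c["recv_kb"] for c in clients)
    ranked = sorted(
        clients, key=lambda c: c["sent_kb"] + c["recv_kb"], reverse=True
    )
    report = []
    for c in ranked:
        usage = c["sent_kb"] + c["recv_kb"]
        share = usage / total if total else 0.0
        report.append(
            {
                "description": c["description"],
                "usage_kb": usage,
                "share": share,
                # A single client carrying most of the traffic is worth a look.
                "suspect": share >= share_threshold,
            }
        )
    return report


# Invented sample data, only to show the call.
sample_clients = [
    {"description": "laptop-1", "sent_kb": 1200, "recv_kb": 3400},
    {"description": "switch-idf2", "sent_kb": 90000, "recv_kb": 500},
    {"description": "printer", "sent_kb": 10, "recv_kb": 40},
]

for row in top_talkers(sample_clients):
    flag = "  <-- check this one" if row["suspect"] else ""
    print(f"{row['description']}: {row['usage_kb']} KB ({row['share']:.0%}){flag}")
```

A high share on its own does not prove a storm; it only tells you which device to look at first.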

I've had issues like this when using non-Meraki switches in the network when loops are present. You may be facing an issue with spanning tree rather than SFPs. On downed links, check the upstream side for any spanning-tree events.

 

Some critical tips:

  • Make sure the root of the network (typically the core switch) is configured as the spanning-tree root.
  • HP switches don't use "standard" spanning-tree weights. So when a network has HP switches alongside any other vendor (such as Meraki) and there is a loop anywhere (even if it is not around the HP switch), spanning tree can calculate an inconsistent state from different places in the network. Consequently, it is CRITICAL to configure every non-Meraki switch to use "mstp". On Cisco Enterprise switches you need this single line: "spanning-tree mode mst".
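
To make the point about weights concrete, here is a small Python sketch of how mixed path-cost scales can change which path looks best. The two tables are the commonly published default port costs from IEEE 802.1D-1998 (short) and IEEE 802.1t / 802.1D-2004 (long). Which vendor or mode uses which table on your hardware is an assumption you would need to verify; the three-switch topology is invented for the example. A switch adds its own port cost to the root path cost it receives, so each hop below is paired with the table of the switch receiving the BPDU.

```python
# Default STP port path costs, keyed by link speed in Mbps.
SHORT_COSTS = {10: 100, 100: 19, 1000: 4, 10000: 2}  # IEEE 802.1D-1998
LONG_COSTS = {10: 2_000_000, 100: 200_000, 1000: 20_000, 10000: 2_000}  # 802.1t


def root_path_cost(hops):
    """Sum per-hop costs.

    Each hop is (link_speed_mbps, cost_table_of_the_receiving_switch).
    """
    return sum(table[speed] for speed, table in hops)


# Hypothetical switch X uses long costs and has two ways to reach the root:
# 1) a direct 1 Gbps link to the root
direct = [(1000, LONG_COSTS)]
# 2) via switch Y: Y reaches the root over 100 Mbps, X reaches Y over 10 Gbps
via_y_mixed = [(100, SHORT_COSTS), (10000, LONG_COSTS)]       # Y uses short costs
via_y_consistent = [(100, LONG_COSTS), (10000, LONG_COSTS)]   # Y uses long costs

print("direct:", root_path_cost(direct))
print("via Y, mixed tables:", root_path_cost(via_y_mixed))
print("via Y, consistent tables:", root_path_cost(via_y_consistent))
```

With mixed tables, the detour through the 100 Mbps link gets the lower total and would be preferred over the direct 1 Gbps link; with one table everywhere, the direct link wins. Putting every switch in the same mode is one way to get all of them onto the same scale.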

@PhilipDAth Thank you for your answer. The vast majority of the switches are Meraki. There is a stack of three MS350-48s in our MDF, which is configured as the STP root bridge. I have edited the original post to display the configuration.

 

I believe we have around 4 or 5 HP switches that are configured as bridges; none of these are routing or providing DHCP. However, I will check whether your suggestion for the spanning-tree mode can be applied to the HP hardware in question.

 

Thank you

@IT_Magician  Thank you so much for your reply. 

 

Actually, there is!

gonzalezgjaime_0-1589117206025.png

If I dig a bit deeper, I can see that the highest contributor is a Meraki device:

Screen Shot 2020-05-10 at 6.27.26 AM.png

Is this what you were referring to?

 

Thanks again

GaryShainberg
Building a reputation

I agree with @PhilipDAth, but I would also try switching all the uplink ports to trunk ports that allow all VLANs and see if that makes a difference. Also, for the switches that keep having issues, have you tried adding a second path? It should "block" initially, but if the main uplink fails (for whatever reason) the port should then switch to forwarding. This should help eliminate hardware issues as a cause.
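
With this many uplinks, it may be easier to audit the port settings from a list than to click through each switch. Below is a minimal Python sketch of such a check: it flags uplinks that are not trunks, do not allow all VLANs, or whose native VLAN differs from the far end. The field names (`switch`, `port`, `type`, `allowed_vlans`, `native_vlan`, `peer_native_vlan`) and the sample rows are hypothetical; you would fill them in from your own inventory.

```python
from typing import Dict, List


def audit_uplinks(ports: List[Dict]) -> List[str]:
    """Return a list of human-readable problems found on uplink ports."""
    problems = []
    for p in ports:
        label = f"{p['switch']} port {p['port']}"
        if p["type"] != "trunk":
            problems.append(f"{label}: type is {p['type']}, expected trunk")
        if p["allowed_vlans"] != "all":
            problems.append(f"{label}: allowed VLANs are {p['allowed_vlans']}, expected all")
        if p["native_vlan"] != p["peer_native_vlan"]:
            problems.append(
                f"{label}: native VLAN {p['native_vlan']} does not match "
                f"far end ({p['peer_native_vlan']})"
            )
    return problems


# Invented sample rows, only to show the call.
uplinks = [
    {"switch": "MDF-stack", "port": 49, "type": "trunk",
     "allowed_vlans": "all", "native_vlan": 1, "peer_native_vlan": 1},
    {"switch": "IDF-3", "port": 52, "type": "access",
     "allowed_vlans": "1,10", "native_vlan": 1, "peer_native_vlan": 10},
]

for line in audit_uplinks(uplinks):
    print(line)
```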

 

Regards

 

Gary 

CTO & Solutioneer
CMNA, CMNO, ECMS2
SNSA, SNSP