MS Switch weird behavior following abrupt power issue

This2shallpass
Getting noticed


Hi, 

 

We have the below switches in the topology. The access switches connect to the Core switch using Agg ports.

Core switch (stack of 4): 3 x MS225-48 and 1 x MS225-48FP
Access switches (3): MS125-48FP and MS120-8

All switches are running on firmware 15.22 

Core switch STP bridge priority is set at 8192 & other access switches are on default value 32768.
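For reference, here is a rough sketch of how spanning tree picks the root bridge with these priorities: the lowest bridge ID wins, comparing priority first and MAC address second. The MAC addresses and switch names below are made up for illustration, and treating the core stack as a single bridge is an assumption.

```python
def elect_root(bridges):
    """Return the bridge with the lowest (priority, MAC) bridge ID.

    bridges: list of (name, priority, mac) tuples.
    """
    return min(bridges, key=lambda b: (b[1], int(b[2].replace(":", ""), 16)))


# Hypothetical MAC addresses and names; priorities are the ones from this post.
bridges = [
    ("core-stack", 8192, "00:18:0a:00:00:10"),
    ("access-1", 32768, "00:18:0a:00:00:01"),
    ("access-2", 32768, "00:18:0a:00:00:02"),
    ("access-3", 32768, "00:18:0a:00:00:03"),
]

root = elect_root(bridges)
print(root[0])
```

With these values the core stack should stay root even though an access switch has a lower MAC, because priority is compared first.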


Recently there was a power outage at the site, and the 3 access switches were on raw power. The switches went down and came back up, but then they started behaving strangely: DNS errors, unable to fetch configuration, devices showing red on the Dashboard, etc.

In the event logs there were only MAC address flapping entries, so we thought there was a loop causing the network to fluctuate. We raised a case with TAC. They initially suspected a loop but later asked us to power-cycle the switches and reset one of them. Once we did that, one switch started functioning properly. The other switches were still in the same state, so TAC raised an RMA.

Now, the strange part: the 2 switches that weren't coming up (now RMA'd) suddenly turned green after almost 5-6 hours and are performing well. We will replace those switches anyway.

My understanding is that the power issue caused a communication problem between the access switches and the Core stack, or maybe there was a glitch that cleared itself, but I am unable to understand why the switches came up after 5-6 hours.

 

Has anyone faced this issue and can help me understand the problem? Do I need to validate any settings?

 

 

Thanks!

4 Replies
PhilipDAth
Kind of a big deal

I would arrange a time and reboot the access switches.

 

The switches might have just failed to communicate with the dashboard, then backed off, and then backed off even more. I have seen 4-hour backoffs. Once they start backing off, they don't try to communicate again until the backoff time has elapsed.
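To show how a device can end up silent for hours, here is a minimal sketch of a capped exponential backoff. The 60-second starting delay, the doubling factor, and the 4-hour cap are assumptions chosen to match what I have seen, not Meraki's documented timers.

```python
def backoff_schedule(base_s=60.0, factor=2.0, cap_s=4 * 3600.0, attempts=10):
    """Return a list of (wait_seconds, total_elapsed_seconds), one per failed attempt.

    All defaults are hypothetical values for illustration only.
    """
    schedule = []
    elapsed = 0.0
    wait = base_s
    for _ in range(attempts):
        wait_now = min(wait, cap_s)  # never wait longer than the cap
        elapsed += wait_now
        schedule.append((wait_now, elapsed))
        wait *= factor  # each failure lengthens the next wait
    return schedule


schedule = backoff_schedule()
for attempt, (wait_s, elapsed_s) in enumerate(schedule, start=1):
    print(f"attempt {attempt}: waited {wait_s / 60:.0f} min, "
          f"total {elapsed_s / 3600:.2f} h")
```

Under these assumptions, a switch that fails its first eight check-ins has already been waiting over 4 hours in total, and the connection coming back in the middle of a long wait makes no difference until that wait expires.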

Is this the expected behavior?

 

"I would arrange a time and reboot the access switches" - Are you suggesting we reboot the access switches again?

 

Hello @This2shallpass, it is expected that there would be a backoff time depending on what was happening with the switches. This can happen, especially with firmware changes. If one (or more) of the devices went through a factory reset, they would have to download their firmware. 

kYutobi
Kind of a big deal

I have had some weird issues like that happen to me, @This2shallpass @PhilipDAth, where it backed off for about 3 hours or so without any explanation whatsoever.
