Major MS390 Issues After Firmware Update

SwitchedAtBirth
New here

Three weeks ago, Meraki pushed firmware updates to our switches, upgrading from MS 14.33.1 to MS 15.22 on our MS210s and to CS 15.21.1 on our MS390s. The nightmare began the second the update was pushed and is still ongoing.

The update failed on 2 of our switches and sent the rest into a non-stop reboot cycle. After working with Meraki, we rolled back the update, but the issues persisted, and we now also had PoE overload issues on multiple ports. Mind you, none of these issues existed before the firmware push. We attempted the update again; the same 2 switches would not update, and the rest continued in a perpetual reboot cycle.

After over a week of troubleshooting, they RMA'ed the 2 that would not update. We replaced the switches and ran the updates... They updated, time to celebrate, right? No, the stacks continued to go offline and online. So we rolled back the update for a 3rd time. We also did some research and found some folks reporting that STP Loop Guard is known to cause issues. We found it enabled on 2 uplinks, disabled it, and finally all was well. Now that things were stable, Meraki suggested running the update again. We did; the switches updated, and 10 minutes later they all started going offline again. This morning we rolled back the update for the 4th time. We are still having PoE overload issues and 1 stack is down, which we can't manually reboot right now because we have an event going on in the office and a manual reboot would take down the lights... again. For 3 weeks we have been getting the stink eye from users because every time these switches go down, the IoT lighting in the office goes down with them.

Over the past 3 weeks our offices have been a horror movie: batches of lights dim, go down, cycle, and come back at random as the switches constantly cycle. Has anyone else experienced these issues with the recent update on the MS390s? This has been a nightmare and I just want to wake up.

DarrenOC
Kind of a big deal

Hi @SwitchedAtBirth - love the username.

 

Sorry to hear that you’ve been suffering with these issues.  It’s not your fault, just remember that especially when you’re getting it in the neck from the higher ups.  You’re doing what you can within the constraints of the tech involved.

 

The nightmare will end at some point.

Darren OConnor | doconnor@resalire.co.uk
https://www.linkedin.com/in/darrenoconnor/

I'm not an employee of Cisco/Meraki. My posts are based on Meraki best practice and what has worked for me in the field.
PhilipDAth
Kind of a big deal

That sucks!  I feel for you.

 

Was it only the MS390s having the issue?

 

You probably don't want to entertain this, but you could try breaking things apart to make the issue simpler.  For example, if you disconnect every other switch from the MS390s, apart from their upstream, does the upgrade work?

Or you could try breaking a single MS390 out of the stack - does it upgrade ok in standalone mode?

 

A horrible situation.

SwitchedAtBirth
New here

Yeah, only the MS390s; the MS210s have been fine. The update now applies to all of them, since the 2 switches that wouldn't update have been replaced. But right after the update applies, all the 390s go down and up over and over. Not a sight I want to see any more:

[Image: SwitchedAtBirth_0-1694033255582.png]

 

cmr
Kind of a big deal

@SwitchedAtBirth personally I'd try the current CS16.3 beta as the release trains were separated to allow Catalyst hardware to get more fixes.

JacekJ
Building a reputation

First of all, it really sounds like a nightmare, I really feel bad for you!

 

You can read my lengthy complaint and story about the upgrade from MS14 to MS15 here:

https://community.meraki.com/t5/Switching/Has-anyone-had-issues-with-Stacks-and-LACP-after-going-fro... 

 

There we also had issues with switches being unable to upgrade because the ports were blocked. I handled that by disabling all redundant connections (which in my case simply meant disabling almost all ports on one of the core switches, thanks to a rather clean cable setup); then everything regained connectivity and upgraded.

Your comment below got me thinking, and I will keep it in mind during the next upgrades, since I have Loop Guard set on all ports pointing towards the core switches (STP root), and maybe that was the reason for my issues.

Thanks!


@SwitchedAtBirth wrote:

We also did some research and found some folks reporting that STP Loop Guard is known to cause issues. We found it enabled on 2 uplinks, disabled it, and finally all was well.
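
For anyone who wants to check for Loop Guard before the next upgrade, here is a minimal Python sketch of the idea. It assumes you already have the port list for a switch as a list of dictionaries shaped like the Dashboard API's switch port objects, with `portId`, `enabled`, and `stpGuard` fields, where `stpGuard` is one of `disabled`, `root guard`, `bpdu guard`, or `loop guard`. The helper name `ports_with_loop_guard` and the sample data are made up for illustration; nothing here comes from the posts above, and it only reports ports, it changes nothing.

```python
from typing import Iterable, List


def ports_with_loop_guard(ports: Iterable[dict]) -> List[str]:
    """Return the portId of every enabled port whose stpGuard is 'loop guard'."""
    return [
        str(port.get("portId"))
        for port in ports
        # Disabled ports are skipped: they cannot block traffic during an upgrade.
        if port.get("enabled", True)
        and str(port.get("stpGuard", "")).lower() == "loop guard"
    ]


# Hypothetical sample data standing in for one switch's port list.
sample_ports = [
    {"portId": "1", "enabled": True, "stpGuard": "disabled"},
    {"portId": "47", "enabled": True, "stpGuard": "loop guard"},
    {"portId": "48", "enabled": True, "stpGuard": "loop guard"},
    {"portId": "12", "enabled": False, "stpGuard": "loop guard"},
]

print(ports_with_loop_guard(sample_ports))  # prints ['47', '48']
```

Run against each switch in a stack, this gives a short list of ports to review (uplinks first) before scheduling the firmware push.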
