I have a fairly big setup and the upgrade went terribly wrong, which is nothing I'm not used to when it comes to Meraki switches (sad, but true). This will be a long story 😉

I have 4x MS425-32 switches in a stack plus a number of MS225-24/48 in different combinations: some are single, some are 2-switch stacks and some are stacks of 3 or more (which is very important in the Meraki world). Overall that is ~40 Meraki switches + ~10 WS2960X. All "remote" switches are connected to 2 of the core switches using LACP, so there is no SPOF.

I set up a staged upgrade where I first wanted to upgrade the core switches (they handle all L3, as in your setup), then 2 pure management switches in the server rooms, and then the rest of the remote switches.

Here is what happened: the cores upgraded nicely, everything green, all ports up, BUT as in your story I had somewhere between 10 and 20 switches behind the cores offline and not handling traffic, despite all ports being up and showing no issues. So the upgrade got stuck and I needed to react.

First I rebooted all cores from the dashboard. That didn't change a thing.

Then, since I have a long history with Meraki switches, I started disabling all redundant connections towards the remote switches (so basically all ports on core 02 and 04 were disabled, only stacking was up). After that almost all switches started regaining connectivity and started upgrading.

I say "almost" because there is a catch: if you have more than 2 switches in a stack and a cross-stack LACP connection to the cores, the stacks tend to "break", with some of the switches in a stack offline and some online. The best resolution is to remove all redundant connections (I disabled them from the core side) and then reboot one of the switches that is online. If this doesn't help, you pick the next one, and the next one, and eventually this will kick in. If you are onsite, just remove the redundant connections and reboot the whole stack.
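For anyone who wants to script it, the "reboot one online member at a time" routine can be sketched roughly like this. It is only an illustration of the order of steps, not something I ran: `recover_stack`, `is_online` and `reboot` are made-up names, the two callables are placeholders for whatever you use to check status and trigger a reboot (dashboard, API, someone onsite), and it assumes the redundant uplinks are already disabled and that `reboot` only returns once the stack has settled.

```python
from typing import Callable, List, Sequence


def recover_stack(
    members: Sequence[str],
    is_online: Callable[[str], bool],
    reboot: Callable[[str], None],
) -> List[str]:
    """Reboot online stack members one by one until every member is online.

    Assumption: redundant uplinks to the stack are already disabled.
    `is_online` and `reboot` are hypothetical hooks supplied by the caller.
    Returns the members that were rebooted, in order.
    """
    rebooted: List[str] = []
    for serial in members:
        # Stop as soon as the whole stack has come back.
        if all(is_online(m) for m in members):
            break
        # Remotely we can only reboot a member that is still reachable.
        if not is_online(serial):
            continue
        reboot(serial)
        rebooted.append(serial)
    return rebooted
```

If the loop finishes and some members are still offline, that is the point where rebooting the whole stack onsite is the fallback.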
I assume that rebooting the master switch would help right away, but at the time I didn't know that the dashboard shows that info (at least on the latest firmware).

I waited some time and watched the current firmware version on the dashboard (it will show you if it's "not current" or MS 15.21.1). After everything had upgraded I rebooted the cores once again, and then I started enabling the redundant ports. Everything has been working as expected since then (two weeks ago).

So this was a journey which took me over 2 hours to fix remotely. I didn't even bother to raise a case, because I knew that playing with ports and rebooting would work, and I had planned to do a full reboot of the cores anyway because I have low trust in the firmware upgrade process (bad experience).

If you have any questions - go ahead 😉