I have two MS435s in a stack in my production network and I need to upgrade the firmware. Unfortunately, if I do the upgrade I have to reboot both at the same time, and that can't happen without shutting everything down. So I think the best solution would be to break the stack and let them run as separate switches. Is this the recommended way to go, and if so, is there documentation (official or not) on how to do this?
@CMDRTucker you can schedule the upgrade for both and then use the beta staged upgrades feature. I have mostly had success, with the occasional failure. It should work for just a stacked pair, and if it does fail, all you need to do is pull the power on the switch that failed and plug it back in. Much less downtime than splitting a stack!
Cool. I thought I had done this years ago and it worked, but all the documentation says it cannot be done. I have a DR site similar to this that I will test it on. Thanks for your response.
I tested this workaround today and I must say, it works. I upgraded the first switch and scheduled the other upgrade for 50 minutes later.
I tested it with a host connected to this stack via LACP. The host was always available.
I have 2 stacks of 2 MS425s each for redundancy. All other switches are connected via STP to these two stacks.
Now I think I can create a single stack of 4 MS425s.
@the-iot I would be less confident of success with larger stacks. I will be doing further testing with a stack of 4, hopefully next week, and will post the results here.
So it worked but it was disruptive. I lost about 50 packets in a continuous ping to a server that was running on that DR Storage device. I am concerned that it might be too long of a disruption on the production side if the VMs lose connectivity to the storage for that long. I may just look at removing a member from the stack and separating. The devices plugged into the switch are mirrors of each other.
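For anyone wanting to put a number on that, here is a rough sketch of turning a lost-ping count into an outage estimate and comparing it with what the storage can tolerate. The 50 lost packets is the figure from the DR test above; the 1 second ping interval and the 30 second tolerance are assumptions for illustration only, so substitute your own values.

```python
def outage_seconds(lost_packets: int, ping_interval_s: float = 1.0) -> float:
    """Rough outage length, assuming the lost pings were consecutive."""
    return lost_packets * ping_interval_s


def within_tolerance(lost_packets: int, tolerance_s: float,
                     ping_interval_s: float = 1.0) -> bool:
    """True if the estimated outage fits inside the given timeout."""
    return outage_seconds(lost_packets, ping_interval_s) <= tolerance_s


if __name__ == "__main__":
    # 50 lost packets comes from the DR test; the 1 s interval and the
    # 30 s tolerance are assumed values, not measurements.
    print(outage_seconds(50))
    print(within_tolerance(50, tolerance_s=30.0))
```

If the pings were not lost in one continuous run, this overestimates the longest single gap, so treat it as an upper bound.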
How far apart did you stage them? I usually upgrade the (notionally) secondary switch first and then the other one half an hour later.
I was wondering that as well. I did it 30 minutes apart and I saw in the documentation that it takes 20 minutes to kick off the reboot. I am wondering if I should have waited an hour.
Actually, that could explain the time I had an issue (with a stack of 355Xs). That was definitely scheduled half an hour apart, and the reboots ended up being more like 10 minutes apart at most. I agree an hour would be worth a try, because if this can work reliably then it is a game changer for data centres.
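A quick sketch of the arithmetic being discussed here: if the first switch can take up to 20 minutes to kick off its reboot (the figure quoted from the documentation above) and the second starts right on schedule, the real gap shrinks by that much. The 15 minute reboot duration below is purely an assumption for illustration; measure it on your own stack.

```python
from datetime import timedelta

# Kick-off delay quoted in the thread; treated here as a worst case.
KICKOFF_DELAY = timedelta(minutes=20)


def worst_case_gap(scheduled_gap: timedelta,
                   kickoff_delay: timedelta = KICKOFF_DELAY) -> timedelta:
    """Smallest real gap between reboots: the first switch starts as late
    as possible and the second starts exactly on schedule."""
    return max(scheduled_gap - kickoff_delay, timedelta(0))


def gap_is_safe(scheduled_gap: timedelta, reboot_time: timedelta,
                kickoff_delay: timedelta = KICKOFF_DELAY) -> bool:
    """True if the first switch should be back before the second reboots."""
    return worst_case_gap(scheduled_gap, kickoff_delay) >= reboot_time


if __name__ == "__main__":
    reboot_time = timedelta(minutes=15)  # assumed, not a documented value
    for minutes in (30, 60):
        gap = timedelta(minutes=minutes)
        print(minutes, worst_case_gap(gap), gap_is_safe(gap, reboot_time))
```

With a 30 minute stagger the worst-case gap comes out at 10 minutes, which is in line with what was seen on the 355X stack, while a 60 minute stagger leaves 40 minutes.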
What I do for customers that require 24x7 HA (such as those running storage on it) is not stack the switches. Instead, I put each switch into a separate network. It ends up looking like Fibre Channel: two separate fabrics. You can't use EtherChannel in this configuration, so your devices have to support NIC failover.
I also set each of these two networks to have different maintenance windows, so that the automatic Meraki upgrades happen on different days.
@PhilipDAth for Meraki this is probably the "best" solution from a technical point of view, but which functionality do you lose here, and how easy is the troubleshooting? I think that Meraki equipment in such a setup is not the cleanest solution...
What I am wondering: is there a recommended procedure for splitting an existing MS425 stack?
The main thing you lose is EtherChannel capability. In my experience, storage and compute platforms support failover that is not switch-assisted, so this is usually not a big deal.