I have a scenario where the access layer consists of 25 MS125 switches. Each switch has two fiber uplinks, one to each switch in the SW-CORE stack, bundled into a 20G port channel.
The SW-CORE stack is composed of two MS425-32 switches running MS version 16.8.
The problem: when either switch in the stack is powered off, whether it is the active switch or the member switch, the failure is transparent to the endpoints. I tested this with ICMP, pinging a machine on the same VLAN but connected to a different access switch, so the traffic passes through the SW-CORE stack. However, when the switch that was powered off is powered on again, regardless of whether it is the active or the member switch, we see about 5 minutes of downtime. The downtime occurs only when the switch comes back up, not when it goes down.
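To report the outage as a measured figure rather than an estimate, the ping results can be post-processed. The following is a minimal sketch in Python, assuming the results have already been collected as (timestamp in seconds, reply received) pairs, one per probe. The function name `outage_windows` and the `min_losses` threshold are hypothetical and only for illustration; they are not part of any Meraki tooling.

```python
from typing import Iterable, List, Tuple


def outage_windows(
    samples: Iterable[Tuple[float, bool]],
    min_losses: int = 2,
) -> List[Tuple[float, float]]:
    """Return (start, end) windows during which probes were lost.

    samples    -- (timestamp_seconds, reply_received) pairs, sorted by time
    min_losses -- ignore runs with fewer consecutive lost probes than this
                  (hypothetical threshold to filter out single stray drops)

    A window starts at the first lost probe and ends at the first
    successful reply after it. If the data ends during an outage, the
    window ends at the last lost probe.
    """
    windows: List[Tuple[float, float]] = []
    start = None      # timestamp of the first lost probe in the current run
    last_loss = None  # timestamp of the most recent lost probe
    losses = 0        # number of consecutive lost probes in the current run

    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
                losses = 0
            losses += 1
            last_loss = ts
        elif start is not None:
            # First reply after a run of losses closes the window.
            if losses >= min_losses:
                windows.append((start, ts))
            start = None

    # Data ended while probes were still being lost.
    if start is not None and losses >= min_losses:
        windows.append((start, last_loss))
    return windows


if __name__ == "__main__":
    # Made-up sample data: probes at 1-second intervals, three lost in a row.
    demo = [(0, True), (1, True), (2, False), (3, False), (4, False), (5, True)]
    for begin, end in outage_windows(demo):
        print(f"outage from t={begin}s to t={end}s ({end - begin}s)")
```

Running the same analysis once for the power-off event and once for the power-on event makes it easy to show support the difference between the two cases.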
I opened a case with Meraki, but it hasn't been helpful. We escalated the case, and the engineer said that Meraki does not support Stateful Switchover (SSO) the way Catalyst switches do, and that downtime is therefore expected when a switch is powered back on.
In my research, I found two bugs that are listed as fixed:
1. Cross-stack LACP bundles experiencing a switch reboot will cause the remaining online port to experience an outage of up to 30 seconds. The same is seen again when the switch comes back online (present since MS 10; fixed in MS 16.1).
2. A stack containing more than 10 LACP bundles may encounter a brief network loop when a stack member is rebooted (present since MS 15; fixed in MS 16.8).
To recap, my switch infrastructure consists of two MS425-32 switches stacked in the core layer and 25 MS125 access switches, all running MS version 16.8, so both fixes should already be included.
Has anyone faced a similar issue?