Hi Team,
I have two MXs configured as an HA pair in our network, and we are receiving multiple failover events for MX02 roughly every hour. The backup MX goes back to spare status within a few seconds of being elected master. Checking the MX WAN and LAN connections, I don't see any issues. I don't see any VRRP logs from MX01 for MX02 becoming master, as the priority values are the same. Can anyone check and let me know if I'm missing something here?
VRRP | VRRP transition | if_up: 0, old_if_up: 1, mode: detect |
VRRP | VRRP transition | if_up: 1, old_if_up: 0, mode: detect |
Meraki VPN | VPN registry connectivity change | vpn_type: site-to-site, connectivity: false |
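For anyone trying to work out how often this is happening, here is a rough Python sketch that counts flaps from event log lines pasted in the pipe-separated form shown above. The function names (parse_event, count_flaps) are made up for illustration, and the thread does not document what if_up actually means; the sketch simply assumes that a VRRP transition with if_up: 0 followed by one with if_up: 1 counts as a single flap.

```python
# Sample lines copied from the event log excerpt in this thread.
SAMPLE_EVENTS = [
    "VRRP | VRRP transition | if_up: 0, old_if_up: 1, mode: detect |",
    "VRRP | VRRP transition | if_up: 1, old_if_up: 0, mode: detect |",
    "Meraki VPN | VPN registry connectivity change | vpn_type: site-to-site, connectivity: false |",
]


def parse_event(line):
    """Split one 'source | type | key: value, ...' line into a dict.

    Returns None for lines that do not have all three columns.
    """
    parts = [p.strip() for p in line.strip().strip("|").split("|")]
    if len(parts) < 3:
        return None
    fields = {}
    for item in parts[2].split(","):
        key, sep, value = item.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {"source": parts[0], "type": parts[1], "fields": fields}


def count_flaps(lines):
    """Count if_up 0 -> 1 pairs among VRRP transition events.

    Assumption: a 0 followed by a 1 is treated as one flap.
    """
    flaps = 0
    went_down = False
    for line in lines:
        event = parse_event(line)
        if event is None or event["type"] != "VRRP transition":
            continue
        if_up = event["fields"].get("if_up")
        if if_up == "0":
            went_down = True
        elif if_up == "1" and went_down:
            flaps += 1
            went_down = False
    return flaps


if __name__ == "__main__":
    print(count_flaps(SAMPLE_EVENTS))
```

Feeding it a longer export and grouping by timestamp (not shown, since the excerpt has no timestamps) would show whether the flaps really line up on an hourly cadence.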
We've had quite a lot of this with an HA pair of MX250s in VPN concentrator mode since going to v18. We logged a ticket with TAC, but they only suspected the switches (a stack of MS355-24Xs) and couldn't find any actual problem. In the end we closed the case. I've recently changed the uplinks from 1Gb to 10Gb DACs and hope that might fix it...
I have a client running a failover pair of MX250s in routed mode using TwinAx with no issues.
Being so reproducible, this smells like a firmware bug. Make sure you are running "stable" or better.
This sounds a lot like an issue we were seeing in 2022. While not as frequent as you are reporting here, we would see our pair of MX250 appliances perform a failover multiple times a day. At the time, Meraki support had to downgrade the version of SNORT we were running from v3 to v2, and that stopped the VRRP transitions from happening.
Meraki released firmware last year (version 18) that addressed this issue, and we were eventually able to go back to SNORT v3 in October 2023. Ironically, I am currently browsing the community forums due to another issue that I think SNORT is causing us (killing network communications every 12ish hours), but that's another story...lol.
Hi All,
Thank you, everyone, for taking the time to respond to my query. This has been fixed. It was identified that a few clients on the LAN kept sending IPv6 requests to the MX. MX02 got overwhelmed by this and was unable to see VRRP heartbeats from MX01, so it assumed MX01 wasn't available and became master. After a few seconds, it received VRRP from MX01 again and returned to standby.
The event log doesn't show any IPv6 DHCP logs; Meraki TAC informed us that these are visible only in the backend, and they suggested blocking IPv6 with a switch ACL until we remove IPv6 from those clients.
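To illustrate the failure mode described above, here is a small Python sketch of a spare that promotes itself when it stops hearing VRRP heartbeats and steps back down when they resume. It is a toy model, not Meraki's implementation: the function name simulate_spare is invented, the 3-second master-down interval is an assumed value, and the rule that the spare yields immediately on the next heartbeat is also an assumption.

```python
def simulate_spare(heartbeat_arrivals, master_down_interval=3.0):
    """Return (time, new_state) transitions for the spare unit.

    heartbeat_arrivals: times (seconds) at which the spare actually
    processes a heartbeat from the primary. Heartbeats dropped because
    the spare is overloaded are simply absent from the list.
    master_down_interval: assumed timeout before the spare takes over.
    """
    transitions = []
    state = "spare"
    last_heard = 0.0
    for t in sorted(heartbeat_arrivals):
        deadline = last_heard + master_down_interval
        if state == "spare" and t > deadline:
            # No heartbeat arrived in time: the spare assumes the
            # primary is gone and promotes itself at the deadline.
            transitions.append((deadline, "master"))
            state = "master"
        if state == "master":
            # A heartbeat from the primary is seen again, so the
            # spare yields (assumed behaviour).
            transitions.append((t, "spare"))
            state = "spare"
        last_heard = t
    return transitions


if __name__ == "__main__":
    # Heartbeats at 1 s, 2 s and 3 s, then a gap while the spare is
    # busy with other traffic, then heartbeats again at 8 s and 9 s.
    for when, state in simulate_spare([1.0, 2.0, 3.0, 8.0, 9.0]):
        print(f"t={when:.1f}s -> {state}")
```

With those inputs the spare becomes master at 6 s and returns to spare at 8 s, which matches the "master for a few seconds, then back to standby" pattern. It also suggests why MX01 logged nothing: in this model the primary never stops sending, only the spare stops listening.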