Meraki MX Failover Event

Basha1996
Conversationalist


Hi Team,

 

I have two MXs configured as an HA pair in our network, and we are receiving multiple failover events for MX02 roughly every hour. The backup MX goes back to spare status within a few seconds of being elected master. While checking the MX WAN and LAN connections, I don't see any issues. I also don't see any VRRP logs on MX01 for MX02 becoming master, and the priority values are the same. Can anyone check and let me know if I'm missing something here?

 

VRRP: VRRP transition
if_up: 0, old_if_up: 1, mode: detect
old_mode: detect
prio: 235
old_prio: 235
elector_state: backup
last_state_change_reason: in_high_prio

VRRP: VRRP transition
if_up: 1, old_if_up: 0, mode: detect
old_mode: detect
prio: 235
old_prio: 235
elector_state: master
last_state_change_reason: master_down_timer

Meraki VPN: VPN registry connectivity change
vpn_type: site-to-site, connectivity: false

cmr
Kind of a big deal

We've had quite a lot of this with an HA pair of MX250s in VPN concentrator mode since going to v18. We logged a ticket with TAC, but they only suspected the switches (a stack of MS355-24Xs) and couldn't find any actual problem. In the end we closed the case. I've recently changed the uplinks from 1Gb to 10Gb DACs and hope that might fix it...

PhilipDAth
Kind of a big deal

I have a client running a failover pair of MX250s in routed mode using TwinAx with no issues.

PhilipDAth
Kind of a big deal

Being so reproducible, this smells like a firmware bug. Make sure you are running "stable" or better.

DonaldB
Here to help

This sounds a lot like an issue we were seeing in 2022. While not as frequent as you are reporting here, we would see our pair of MX250 appliances perform a failover multiple times a day. At the time, Meraki support had to downgrade the version of SNORT we were running from v3 to v2, and that stopped the VRRP transitions from happening.

 

Meraki released firmware last year (version 18) that addressed this issue, and we were eventually able to go back to SNORT v3 in October 2023. Ironically, I am currently browsing the community forums due to another issue that I think SNORT is causing us (killing network communications every 12ish hours), but that's another story...lol.

Basha1996
Conversationalist

Hi All, 

Thank you everyone for taking the time to respond to my query. This has been fixed. It turned out that a few clients on the LAN kept sending IPv6 requests to the MX. MX02 was overwhelmed by this and could no longer see the VRRP advertisements from MX01, so it assumed MX01 was unavailable and became master. A few seconds later it received VRRP from MX01 again and went back to standby.
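For readers trying to picture the flap described above, here is a minimal Python sketch of a backup's master-down timer. It uses the standard VRRP formula (Master_Down_Interval = 3 x advertisement interval + skew time, with skew time = (256 - priority) / 256 x advertisement interval). Whether the MX warm-spare implementation uses exactly these timers is an assumption, as are the 1-second advertisement interval and the advertisement arrival times, which are made up for illustration. Only the priority of 235 and the `master_down_timer` / `in_high_prio` reasons come from the event log in this thread.

```python
def master_down_interval(priority: int, advert_interval: float = 1.0) -> float:
    """Standard VRRP master-down interval in seconds.

    advert_interval defaults to 1.0 s, which is an assumption and may
    not match what the MX pair actually uses.
    """
    skew_time = (256 - priority) / 256 * advert_interval
    return 3 * advert_interval + skew_time


def backup_transitions(advert_times, priority, advert_interval=1.0, start=0.0):
    """Replay advertisement arrival times as seen by the spare.

    Returns a list of (time, new_state) tuples. Assumes (hypothetically)
    that any advertisement heard from the primary wins the election, so a
    spare that promoted itself drops back to backup as soon as it hears one.
    """
    timeout = master_down_interval(priority, advert_interval)
    state = "backup"
    deadline = start + timeout
    events = []
    for t in sorted(advert_times):
        if state == "backup" and t > deadline:
            # No advertisement arrived in time: the timer fired first.
            events.append((deadline, "master"))
            state = "master"
        if state == "master":
            # Advertisement from the primary is heard again.
            events.append((t, "backup"))
            state = "backup"
        deadline = t + timeout
    return events


# Hypothetical timeline: advertisements at 0-3 s are received, those between
# 4 s and 7 s are lost (for example because the spare is busy), then they resume.
received = [0.0, 1.0, 2.0, 3.0, 8.0, 9.0]
for when, new_state in backup_transitions(received, priority=235):
    print(f"t={when:.3f}s -> {new_state}")
```

The point of the sketch is that the spare does not need the primary to fail. Missing a few seconds of advertisements is enough to promote it, and the first advertisement it hears afterwards sends it back to backup, which would look like the short master periods in the event log.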

The event log doesn't show any IPv6 DHCP logs. Meraki TAC informed us that these are visible only on the backend, and they suggested blocking IPv6 with a switch ACL until we remove IPv6 from those clients.
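As a rough illustration of the workaround TAC suggested, the Python sketch below builds a deny-all-IPv6 rule in the shape the Meraki Dashboard API uses for switch ACLs, as far as I know (`PUT /networks/{networkId}/switch/accessControlLists` with a `rules` list). The helper names, the comment text and the choice to apply the rule to every VLAN are assumptions for illustration. The sketch only builds and prints the payload and does not call the API, so check the field names against the current API documentation before using it. The same rule can also be entered by hand on the switch ACL page in the dashboard.

```python
import json


def build_ipv6_deny_rule(vlan="any", comment="Temporary: block IPv6 from LAN clients"):
    """Return one switch ACL rule that denies all IPv6 traffic.

    Field names follow the Dashboard API switch ACL schema as I understand
    it; treat them as an assumption to verify.
    """
    return {
        "comment": comment,
        "policy": "deny",
        "ipVersion": "ipv6",
        "protocol": "any",
        "srcCidr": "any",
        "srcPort": "any",
        "dstCidr": "any",
        "dstPort": "any",
        "vlan": vlan,
    }


def build_acl_payload(existing_rules, new_rule):
    """Put the new rule ahead of the existing ones.

    The existing rules are carried over on the assumption that an update
    replaces the whole user-defined rule list.
    """
    return {"rules": [new_rule, *existing_rules]}


# existing_rules would normally be read from the network first; empty here.
payload = build_acl_payload(existing_rules=[], new_rule=build_ipv6_deny_rule())
print(json.dumps(payload, indent=2))
```

Blocking all IPv6 at the switch is a blunt measure, so it only makes sense as a stopgap on networks that do not rely on IPv6, which matches how TAC framed it here.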
