TWO MX IN HA MODE - backup mx fails to sync heartbeats

Solved
Captain
Getting noticed


Dear Forum,

 

The two MXs are connected as in the attached network diagram (Pic. 1).

However, the 2nd (backup) MX fails to sync: its HA light blinks orange constantly, while its power light is steady white.
Both MXs are online in the dashboard.

As a result, both MXs appear as Master on the dashboard.

 

If I power off SW02 and move the 2nd MX's LAN trunk port to SW01 (Pic. 2), it still fails.

There are no ports disabled due to spanning tree.

Tried rebooting all devices but it didn't help.

 

The last time I had this issue, I opened a ticket with support.

They said that the switch failed to pass VRRP traffic due to a spanning tree topology change.
I have checked a couple of times, and there are no ports blocked by spanning tree.

 

Any idea what it can be?

 

Thanks in advance!

 

Picture 1

 

Picture 2

 

1 Accepted Solution
Captain
Getting noticed

At one point I suspected faulty hardware because of the bad VRRP packet checksums.
So I decided to connect the MX LAN trunks to a different switch.
Once the MXs were connected there, the problem was resolved.
I then went back to the original switch to see what was different, and noticed that STP guard (BPDU guard) was active on these ports!
After disabling STP guard on the original switch, everything went back to normal.

 

While the documentation recommends enabling RSTP globally, it doesn't mention that STP guard should not be used on those ports because it will cause issues.
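For anyone who prefers to script this change rather than click through the dashboard, here is a minimal sketch. It assumes the switches are Meraki MS models managed through the Dashboard API v1, and it only builds the request without sending it. The API key and serial number are placeholders, and port 47 is taken from the MX-facing ports described in this thread; check the endpoint and the `stpGuard` values against the current API reference before using it.

```python
import json
import urllib.request

# Assumption: Meraki Dashboard API v1 base URL.
API_BASE = "https://api.meraki.com/api/v1"

# Values the switch-port endpoint is expected to accept for "stpGuard".
ALLOWED_STP_GUARD = {"disabled", "root guard", "bpdu guard", "loop guard"}


def build_stp_guard_request(api_key, serial, port_id, stp_guard="disabled"):
    """Build (but do not send) a PUT request that sets STP guard on one switch port."""
    if stp_guard not in ALLOWED_STP_GUARD:
        raise ValueError(f"unsupported stpGuard value: {stp_guard!r}")
    url = f"{API_BASE}/devices/{serial}/switch/ports/{port_id}"
    body = json.dumps({"stpGuard": stp_guard}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder key and serial; "47" is the MX-facing port from this thread.
req = build_stp_guard_request("YOUR_API_KEY", "QXXX-XXXX-XXXX", "47")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply the change for real: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to review the URL and body for each MX-facing port before anything changes on the switch.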

 


8 Replies
ww
Kind of a big deal

Do all MX and switch trunk ports use the same native VLAN?

Do all switch-to-switch ports have STP enabled?

Is the ----- line not a real physical link?

Captain
Getting noticed

All MX and switch trunk ports have RSTP enabled: YES

The ----- line is not a real link; it represents the virtual IP for the LAN (there is another one for the WAN, but since that is not the problem right now I did not draw it).

Ryan_Miles
Meraki Employee

Can you post a screenshot of the MX LAN config? Meaning this stuff:

 

Screenshot 2024-12-17 at 07.58.45.png

Ryan

If you found this post helpful, please give it Kudos. If my answer solves your problem please click Accept as Solution so others can benefit from it.
Captain
Getting noticed

Hi Ryan,

 

It is very simple...

Each MX has only one LAN port enabled.

1st MX LAN port 5 to SW01 port 47

2nd MX LAN port 5 to SW02 port 47

SW01 port 48 <> SW02 port 48

 

Regards,

 

Captain_0-1734452663076.png

 

 

 

ww
Kind of a big deal

And do all switch-to-switch connections use native VLAN 1?

Are switches 3 and 4 also Meraki?

Which switch is the root in your network?

 

Captain
Getting noticed

All switch uplinks are the same: VLAN 1 as native, with all VLANs allowed.

All the other switches are Meraki. They are in secondary cabinets, and each switch gets one uplink from SW01 and another from SW02, for redundancy in case one core switch dies.

The root switch is SW01 (bridge priority 0), then SW02 (8192); all the rest use the default bridge priority of 32768.
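To show why these priorities make SW01 the root and SW02 the fallback, here is a small sketch of the spanning tree root election rule: the bridge with the lowest bridge ID wins, where the ID is the priority followed by the MAC address as a tie-breaker. The MAC addresses and the names SW03/SW04 are made up for illustration; only the priorities come from this thread.

```python
from typing import NamedTuple


class Bridge(NamedTuple):
    name: str
    priority: int  # STP bridge priority, a multiple of 4096
    mac: str       # hypothetical MAC address, used only as a tie-breaker


def bridge_id(bridge):
    """Return the comparable bridge ID: priority first, then MAC as an integer."""
    return (bridge.priority, int(bridge.mac.replace(":", ""), 16))


def elect_root(bridges):
    """The bridge with the numerically lowest bridge ID becomes the root."""
    return min(bridges, key=bridge_id)


bridges = [
    Bridge("SW01", 0, "00:00:5e:00:53:01"),
    Bridge("SW02", 8192, "00:00:5e:00:53:02"),
    Bridge("SW03", 32768, "00:00:5e:00:53:03"),  # default priority
    Bridge("SW04", 32768, "00:00:5e:00:53:04"),  # default priority
]

print(elect_root(bridges).name)      # SW01
print(elect_root(bridges[1:]).name)  # SW02 takes over if SW01 is gone
```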

I have opened a case with Meraki support. At the moment they can't find anything odd with the configuration; they only confirm that the VRRP packets are behaving strangely.

My comment (not support's): a packet capture shows that all VRRP packets have a bad checksum.
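For reference, this is roughly how a VRRP checksum can be checked by hand. The sketch assumes VRRPv2 as described in RFC 3768, where the checksum is the standard 16-bit Internet checksum over the VRRP message alone; VRRPv3 (RFC 5798) also covers a pseudo-header, so the same function would not apply unchanged. The sample advertisement below is made up and is not taken from the capture in this thread.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def vrrp_v2_checksum_ok(message: bytes) -> bool:
    """A message with a correct embedded checksum sums to zero."""
    return internet_checksum(message) == 0


# Hypothetical VRRPv2 advertisement: version 2 / type 1, VRID 1, priority 100,
# one IP address, no authentication, 1 s interval, checksum field zeroed.
header = bytes([0x21, 1, 100, 1, 0, 1, 0, 0])
body = bytes([192, 168, 1, 1]) + bytes(8)  # virtual IP + authentication data
message = bytearray(header + body)

# Fill in the checksum field (bytes 6 and 7 of the VRRP message).
message[6:8] = internet_checksum(bytes(message)).to_bytes(2, "big")
print(vrrp_v2_checksum_ok(bytes(message)))  # True

message[2] = 101  # corrupt the priority byte
print(vrrp_v2_checksum_ok(bytes(message)))  # False
```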

 

alemabrahao
Kind of a big deal

Is it possible to do a test by changing the ports from trunk mode to access mode? Just to validate one thing.

I am not a Cisco Meraki employee. My suggestions are based on documentation of Meraki best practices and day-to-day experience.

Please, if this post was useful, leave your kudos and mark it as solved.
