What *IS* the functional way for HA warm spare + L3 switch stack

Solved
Aaron_Wilson
A model citizen

Ok, having a mental block for correct wiring. Here is the lay of the land:

 

-Dual MX67 for VPN back to mothership

-Cisco 9300 switch stack running layer 3 for the branch location (/24)

-Primary MX connects to SWX 1

-Warm spare MX connects to SWX 2

 

I know the Meraki design preference has shifted over the years away from a direct link between the MXs for VRRP, but I'm seeing issues since the MXs do not participate in spanning tree.

 

Here is the diagram I modified from Willette's picture for what I believe to be correct.

 

If I have layer 3 hand-off from MX to switch stack, and the switch stack is running layer 3 for local LANs, is it OK to do direct MX to MX VRRP connection?

 

If so, is it OK for the uplinks between the MXs and the switch stack to be access ports? One MX going to SWX 1, while the warm spare goes to SWX 2?

 

If SWX 1 fails will warm spare MX pass the traffic? Or do I need to trunk VLAN 100 between the two MXs with VLAN 1111 being default?

 

MXHA.png

rhbirkelund
Kind of a big deal

If you're doing L3 handoff on the Switches, rather than the MX, in principle, you'll only have 1 local VLAN on the MX which will serve as a form of transit VLAN. 

 

As long as you trunk the VLAN between both MXs and the switch, you should be fine VRRP-wise. One of the requirements of Warm Spare is to not prune VLANs.

VRRP heartbeats will flow on all VLANs created on the MX. So if you only have 1 VLAN on the MX (serving as the transit VLAN to the switch), the heartbeats will flow only on that VLAN.

 

Regarding failover in case of a switch failure: if MX1 (Pri) is connected to SW1 and MX2 (Sec) is connected to SW2, without the back-to-back link, then failover from MX1 to MX2 will occur if SW1 fails.

 

What issues are you seeing?

Regarding spanning tree between Cisco Classic and Meraki, you might want to consider running MSTP (I think? You may want to check the documentation on that), even though MXs don't participate in spanning tree.

LinkedIn ::: https://blog.rhbirkelund.dk/

Like what you see? - Give a Kudo ## Did it answer your question? - Mark it as a Solution 🙂

All code examples are provided as is. Responsibility for Code execution lies solely your own.
Bruce
Kind of a big deal

I’d do one link from SWX1 to primary MX, and a second link to the primary MX from SWX2. Likewise with the standby MX, one link to SWX1 and another to SWX2. Have all links configured the same, and let STP take care of blocking one of the links to each MX. You don’t need the link between the MXs and support will likely recommend you remove it if you need their assistance.

 

In your proposed configuration, if SWX1 fails, traffic won’t flow to the active MX. Since heartbeats are still being received by the standby MX (across the direct link between them), it won’t take control, and so the LAN IP addresses (e.g. 10.1.1.1) will remain isolated on the primary MX with no path via the direct link (since the VLAN isn’t on that link).
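To make that failure case concrete, here is a toy reachability sketch in Python. It is only an illustration, not how the MX actually implements warm spare: the node names are hypothetical, and it assumes the direct MX-to-MX cable carries heartbeats but not the LAN VLAN, as in the proposed diagram.

```python
from collections import deque

def reachable(links, start, failed=frozenset()):
    """Return the set of nodes reachable from start, skipping failed nodes."""
    if start in failed:
        return set()
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for a, b in links:
            if node in (a, b):
                other = b if node == a else a
                if other not in failed and other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen

# Hypothetical node names; SWX1 and SWX2 are members of one stack.
lan_links = [("MX1", "SWX1"), ("MX2", "SWX2"), ("SWX1", "SWX2")]
# Assumption: the direct cable carries heartbeats only, not the LAN VLAN.
heartbeat_links = lan_links + [("MX1", "MX2")]

failed = frozenset({"SWX1"})
spare_hears_primary = "MX1" in reachable(heartbeat_links, "MX2", failed)
lan_reaches_primary = "MX1" in reachable(lan_links, "SWX2", failed)

print(spare_hears_primary)  # True: the spare stays passive
print(lan_reaches_primary)  # False: the LAN has no path to the active MX
```

With SWX1 down, the spare still hears the primary over the direct cable, so it never takes over, while the surviving stack member has no LAN path to the primary.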

rhbirkelund
Kind of a big deal


@Bruce wrote:

I’d do one link from SWX1 to primary MX, and a second link to the primary MX from SWX2. Likewise with the standby MX, one link to SWX1 and another to SWX2. Have all links configured the same, and let STP take care of blocking one of the links to each MX.

 

Why would you run redundant links from the Primary MX, as well as from the Secondary, when running Warm Spare? Is it really that important to keep using the Primary MX?

 

If MX-Pri is connected to SW1 in a stack, and MX-Sec is connected to SW2 in the same stack, then if SW1 fails, failover to MX-Sec will occur. Failover time is less than 30 seconds from failure detection to processing VPN packets again.

 

Redundant links between MX and switch would, imho, only make sense if you were aggregating links, but since the MX does not support LACP, that's out of the picture.

 

Depending on your Spanning-Tree domain, by the time the network converges you might already have exceeded the time it takes for Warm Spare failover.

Bruce
Kind of a big deal

@rhbirkelund it's to cover a link failure between the MX and switch (or indeed a failure of SWX1 itself). If you lose the link from the Primary MX to SWX1 (or SWX1 itself) then both MX devices will go active, both will try to build tunnels to the hub (and so advertise themselves as the path to their subnets), and if you are running a VIP on the WAN both MXs will claim the VIP. All of this can cause issues of different magnitudes depending on your design.

 

This occurs because the heartbeats between the MXs stop (due to the link or switch failure), but the MXs are still both alive and so believe the other MX is dead. Having both the links from each MX to both switches adds a layer of protection you wouldn't get otherwise.

 

It would be great to hear that an MX with no active links on the LAN side won't go 'active' and will assume a standby role instead, but so far as I'm aware this isn't the case.
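The dual-active scenario described above can be sketched with a small Python model. This is a simplification under stated assumptions, not Meraki's actual logic: the primary is treated as always active, the spare goes active only when it can no longer reach the primary over the LAN, and spanning tree is assumed to have already converged onto whichever links are still up. Node names are hypothetical.

```python
def connected(links, a, b, down_nodes=(), down_links=()):
    """True if a path exists from a to b over links that are still up."""
    live = [l for l in links
            if l not in down_links and not (set(l) & set(down_nodes))]
    seen, frontier = {a}, [a]
    while frontier:
        node = frontier.pop()
        for x, y in live:
            if node in (x, y):
                other = y if node == x else x
                if other not in seen:
                    seen.add(other)
                    frontier.append(other)
    return b in seen

def active_mxs(links, down_nodes=(), down_links=()):
    """Assumed rule: the primary stays active; the spare goes active only
    when it no longer hears the primary's heartbeats over the LAN."""
    hears = connected(links, "MX2", "MX1", down_nodes, down_links)
    return ["MX1"] if hears else ["MX1", "MX2"]

stack = ("SWX1", "SWX2")
single_homed = [("MX1", "SWX1"), ("MX2", "SWX2"), stack]
dual_homed = single_homed + [("MX1", "SWX2"), ("MX2", "SWX1")]

print(active_mxs(single_homed, down_nodes=["SWX1"]))         # ['MX1', 'MX2']
print(active_mxs(dual_homed, down_nodes=["SWX1"]))           # ['MX1']
print(active_mxs(dual_homed, down_links=[("MX1", "SWX1")]))  # ['MX1']
```

In the single-homed layout, losing SWX1 cuts the heartbeat path and both MXs end up active. With each MX homed to both stack members, the same failure leaves a heartbeat path through SWX2, so only the primary stays active.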

Aaron_Wilson
A model citizen

Thanks both. It's all making sense now. The mental block I was having was in the Meraki examples of a switch stack, but that was a layer 2 stack, not a layer 3 stack.

 

Ok, so if I run MX 1 to swx1 and swx2, spanning tree on the Cisco switch will shut the swx2 port. And if swx1 fails swx2 will go to forwarding?

 

 

Bruce
Kind of a big deal

Yep, that's it. Just be aware that the MXs don't run STP, so any BPDU that the switch sends is just forwarded back to itself on the other path. So the switch will learn about the Layer 2 loop and put the inferior port into blocking state. From this point of view it's ideal if the Cisco switch ports facing the MX don't run Portfast, as you want STP to converge before you forward traffic.
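As a rough illustration of how the "inferior port" gets picked, here is a minimal Python sketch. It assumes the standard spanning-tree tie-break on port ID (priority, then port number) when a bridge hears its own BPDU on a second port. The port names and numbers are made up; the real port IDs on a 9300 stack depend on the configuration, so check `show spanning-tree` on the actual switch.

```python
def forwarding_port(ports, down=()):
    """Return the name of the port that keeps forwarding, or None.

    ports maps a port name to its (priority, port_number) ID. All ports are
    assumed to sit on the same logical bridge (the stack) and to hear each
    other's BPDUs through the MX, so the lowest port ID wins the tie-break.
    """
    live = {name: pid for name, pid in ports.items() if name not in down}
    return min(live, key=live.get) if live else None

# Hypothetical names and port numbers for the two stack ports facing MX 1.
ports_to_mx1 = {"swx1-port": (128, 1), "swx2-port": (128, 49)}

print(forwarding_port(ports_to_mx1))                      # swx1-port
print(forwarding_port(ports_to_mx1, down=["swx1-port"]))  # swx2-port
```

While both ports are up, the one with the higher port ID blocks. If the forwarding port goes away (for example because SWX1 fails), the remaining port no longer sees a competing BPDU and moves to forwarding once STP reconverges.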
