How to cable MX & MS for HA

SOLVED
MoBrad
Conversationalist

What's the recommended way to cable and configure 2x MX250 operating in HA NAT and connecting to 2x MS350-24 switches, which are stacked?

 

My thinking is that this requires 4 GbE connections. Would aggregation be required for the 4 ports?

Primary MX

GbE 3 to MS350-1, port 1

GbE 4 to MS350-2, port 2

 

Spare MX

GbE 3 to MS350-2, port 1

GbE 4 to MS350-1, port 2
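
As a rough illustration only, the cabling above can be written down as data and sanity-checked in a few lines of Python. The device labels and port numbers are the ones listed above; the helper names are made up for this sketch, and nothing here talks to the Meraki dashboard.

```python
from collections import defaultdict

# (MX, MX port, stack member, switch port), copied from the list above.
CABLES = [
    ("Primary MX", "GbE 3", "MS350-1", 1),
    ("Primary MX", "GbE 4", "MS350-2", 2),
    ("Spare MX", "GbE 3", "MS350-2", 1),
    ("Spare MX", "GbE 4", "MS350-1", 2),
]


def switches_per_mx(cables):
    """Map each MX to the set of stack members it is cabled to."""
    reach = defaultdict(set)
    for mx, _mx_port, switch, _sw_port in cables:
        reach[mx].add(switch)
    return dict(reach)


def survives_single_switch_loss(cables):
    """True if every MX keeps at least one link when any one switch fails."""
    all_switches = {switch for _, _, switch, _ in cables}
    reach = switches_per_mx(cables)
    return all(
        bool(reach[mx] - {failed})
        for failed in all_switches
        for mx in reach
    )


def port_conflicts(cables):
    """Return (switch, port) pairs that are used more than once."""
    seen, dupes = set(), set()
    for _, _, switch, sw_port in cables:
        key = (switch, sw_port)
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes
```

Calling `survives_single_switch_loss(CABLES)` checks the intent of the cross-cabling: each MX should still have a link if either stack member is lost. It says nothing about spanning tree behaviour or aggregation.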

1 ACCEPTED SOLUTION
PhilipDAth
Kind of a big deal

The way you have suggested is the way I do it.

 

However, Meraki's recommendation is a cable from each MX to a switch, and then a cable directly between the MXs.

https://documentation.meraki.com/MX-Z/Other_Topics/Troubleshooting_MX_Warm_Spare_in_NAT_Mode_(NAT_HA...

I don't agree with this approach, because spanning tree often chooses to block the link to the active MX, and you then get all the traffic being switched through the spare to reach the active unit.

 

 

MoBrad
Conversationalist

Thanks @PhilipDAth. But would you configure all 4 ports to be in 1 aggregate on the MS stack?

I've actually got a direct connection between the 2 MXs as well, but have set this to be an access port carrying only a non-routable VLAN, which has been pruned from the switch uplinks. The issues that you describe are exactly what we've experienced recently, so I am looking to sort things out.

PhilipDAth
Kind of a big deal

The MX does not support port aggregation - so I would not.

Adam
Kind of a big deal

I'd do what you suggested with twinax cables for high throughput. 

 

Adam R MS | CISSP, CISM, VCP, MCITP, CCNP, ITILv3, CMNO

GreenMan
Meraki Employee

A couple of thoughts to add:

MX doesn't run STP itself, but it will forward BPDUs, so if you create any loops, they'd need to be resolved in the switching.  Probably best not to create them in the first place.

Any heartbeat link directly between the MXs should be in a dedicated VLAN.

 

In addition to the published MX documentation on documentation.meraki.com, this is a useful unofficial resource (though created by a Meraki SE):  https://www.willette.works/mx-warm-spare/ 

jdsilva
Kind of a big deal

@GreenMan

 

There are issues with Willette's topology, and I would not suggest using it. The first is that VRRP heartbeats are sent on all VLANs, so creating a dedicated "heartbeat VLAN" is not actually possible. Second, what you can do is use this dedicated VLAN for DHCP database synchronization... But because there are no actual knobs in the dashboard to configure this the MX will use the link that comes up first. That's great if your dedicated VLAN comes up first, but if not then it could be any of the other links to the switches making this non-deterministic. 

 

IMHO this architecture is incorrect and falsely gives people the impression they are controlling something they cannot actually control. 

 

@PhilipDAth's suggestion is the best topology to use here. 

Hi @jdsilva - I should probably have called it a heartbeat link (although it would need a dedicated VLAN). The idea of that path is for it to be as simple as possible (least likely to fail), avoiding dual-active MX scenarios. There are indeed a number of ways of engineering such setups. I guess testing your preferred approach in your customer's actual network, taking into account likely failure scenarios (perhaps using a free trial), is always the best recommendation, rather than being fixed on any one topology as ‘best’.

jdsilva
Kind of a big deal

@GreenMan Yup, I'm with you on the simple path part. I'm just saying that the dedicated VLAN over a dedicated link for "heartbeats" is flawed thinking as VRRP doesn't work that way, and you can't deterministically predict where the DB sync traffic is going. 

 

What is needed to complete this setup is a way to flag the heartbeat VLAN as the heartbeat VLAN. Right now there is no such control on the MX.

jdsilva
Kind of a big deal

...And, Meraki has officially changed their documentation on this. The heartbeat cable is no longer a recommended configuration.

 

https://documentation.meraki.com/MX-Z/Deployment_Guides/NAT_Mode_Warm_Spare_(NAT_HA)#Recommended_Top...

 

Yay!

My thoughts:

While VRRP packets will flow thru all VLANs, having a dedicated physical link on its own dedicated VLAN that VRRP packets flow thru allows for the shortest path on a VLAN that is exclusively VRRP packets. True, VRRP will go out all VLANs -- but in case of any sort of congestion or link failure in the switch stack, you have a dedicated link and VLAN that will still allow VRRP packets to make it to the warm spare. So, IMO, it is still advisable to use a dedicated link with a dedicated VLAN to ensure timely arrival of VRRP packets to the warm spare without having to worry about the rest of the network.

And Meraki has not changed their documentation, at least not fully, on this.

https://documentation.meraki.com/MX-Z/Other_Topics/Troubleshooting_MX_Warm_Spare_in_NAT_Mode_(NAT_HA)

That page was not updated due to an oversight I believe. They are aware now that it is outstanding and hopefully it'll get changed soon.

 

In my experience the issues caused by creating the loop at L2 on devices that do not participate in STP are far more detrimental than having VRRP frames pass through one switch between MXes. I would agree that you don't want your VRRP to take the scenic route through your switch fabric to get to the other MX. But if you have problems getting VRRP through a single switch before the dead timer expires then you really have much bigger problems that you need to be looking at.

 

 

@JasonCampbell you say:

"While VRRP packets will flow thru all VLANs, having a dedicated physical link on its own dedicated VLAN that VRRP packets flow thru allow for the shortest path on a VLAN that is exclusively VRRP packets. True, VRRP will go out all VLANs -- but in case of any sort of congestion or link failure in the switch stack, you have a dedicated link and VLAN that will still allow VRRP packets to make it to the warm spare. So, IMO, it is still advisable to use a dedicated link with a dedicated VLAN to ensure timely arrival of VRRP packets to the warm spare without having to worry about the rest of the network."

 

I don't think you understand the purpose of VRRP.  VRRP is a protocol to provide protection for the default gateway of a VLAN.  It allows clients configured with that default gateway to remain working during a failure.

 

Having VRRP running on a dedicated VLAN between two MX units is - pointless.



Hi Philip,

 
I understand VRRP as a first-hop redundancy protocol. Not by the standard, but on Meraki equipment, VRRP heartbeats are sent out on all VLANs, and as long as the warm spare receives one of these, it considers the primary still live. Having a VLAN with absolutely no other traffic theoretically benefits the VRRP heartbeat packets. Please feel free to provide evidence if I'm wrong. 

I think this actually is detrimental to your network. You are creating a "shortcut" path that is not representative of the path your clients will use. This dedicated VLAN heartbeat cable leaves you wide open to the scenario where your clients lose connectivity to the active MX, but the active MX does not relinquish control to the secondary thereby taking your entire network down. You want VRRP to move over the path your clients actually use. 

@JasonCampbell the only test that is used to determine if a unit is alive is whether it can talk to the cloud.  VRRP is used strictly as an FHRP.

 

An MX only participates in VRRP if it can talk to the cloud.  If it loses that connection it stops speaking VRRP completely.

MoBrad
Conversationalist

Appreciate the great discussion here. Whilst I'd originally planned to replicate https://willette.works/mx-warm-spare/ I'll now change our approach back to the updated method without the direct-connect cable. Cheers all.

While VRRP does indeed provide a resilient next-hop (either for clients or, for another example, an upstream router) it's also used for the two MXs to monitor each other.   If you have all your inside VLANs running over the same physical infrastructure (switches / fibre etc.) then a failure within that layer could result in both MXs becoming active.  Having as direct a path as possible between the two, separate from that shared infrastructure, to prevent active-active, is the basic aim of such a link.

Not to necro this old thread but does any of this discussion change with the recent addition of the cellular MX models?

Nolan Herring | nolanwifi.com

No.

Whilst it's not related to the cabling, please note that, at this point in time, the new MX67 and MX68 models do not yet support VRRP (i.e. warm standby).  This will be added in a future firmware release.   Whilst writing, the same goes for wired 802.1x

PhilipDAth
Kind of a big deal

>Whilst it's not related to the cabling, please note that, at this point in time, the new MX67 and MX68 models do not yet support VRRP (i.e. warm standby).  This will be added in a future firmware release.   Whilst writing, the same goes for wired 802.1x

 

Wow, I mean wow!  I haven't heard that one.

jdsilva
Kind of a big deal

Yeh that's a nasty little gotcha. I hadn't seen that mentioned anywhere else yet. Good to know.

Seriously? Lovely how that's not mentioned anywhere in the product documentation.

There is this snippet here on this document:

 

https://documentation.meraki.com/MX/Deployment_Guides/NAT_Mode_Warm_Spare_(NAT_HA)

  

 

Cellular Failover Behavior
Meraki does not currently support any cellular failover with a high availability (HA) pair; as we do not perform connection monitoring on cellular uplinks (as of MX 10.X+), which is necessary for HA uplink failover. At this time, if a cellular uplink is used in an HA pair, the following will occur in order:

 

Primary MX WAN 1+2 fails > fails over to Secondary MX
Secondary MX WAN 1+2 fails > fails over to Primary MX Cellular
Primary MX cellular fails > fails over to Secondary MX Cellular


While it is possible to use cellular failover as described above, it is not officially supported by Meraki.

 

Nolan Herring | nolanwifi.com

That has nothing to do with the fact that VRRP isn't supported AT ALL on MX67/MX68 currently. That should have been made more prominent.

VRRP is only used as part of the warm standby mechanism, on MX

I don't understand what that means. MX67/68 is able to do warm standby without VRRP? How does that work?

I have two MX67C right now and I was testing them. I don't have anything plugged into them yet (no LAN etc.) So I had to do the direct-connect cable between them on port 5. Before I connected them, they both showed as 'Current Master'.

 

Once I plugged the cable into the spare, it changed to 'Passive; Ready' status.

 

Running 14.34

 

vrrp.jpg

 

 

 

 

Nolan Herring | nolanwifi.com

Hi Everyone,

 

@JasonCampbell @jdsilva @PhilipDAth @MoBrad

 

I manage our documentation. Hopefully I can clarify a few things here. First:

 


@GreenMan wrote:

Whilst it's not related to the cabling, please note that, at this point in time, the new MX67 and MX68 models do not yet support VRRP (i.e. warm standby).  


This is not correct. MX Cellular models DO support VRRP/warm standby, just not using cellular. You can absolutely set up MX Cellular models in an HA pair configuration, we just recommend doing so without LTE.

 

This is what your recommended failover options look like:

 

MXC-Recommended-Failover_Designs.png

On the left, you can use MX Cellular models just like any other MX model in a standard HA pair. The left-side design is the recommended HA design for ALL MX models. Notice that the topology matches the documentation. The design on willette.works is not an official Meraki design and is not recommended.

 

So you might be thinking: "The design on the left is using cellular MX models but isn't using LTE failover at all. What's the point in using cellular models?" - If your priority is HA failover, you may not want to use an MX Cellular model. If your priority is LTE failover, we recommend the design on the right side. If you happen to have an MX Cellular model on hand and WANT to do HA failover, the design on the left side is officially supported and recommended.

 

The design on the right side is how we officially recommend using LTE failover. On a single MX Cellular device, not in an HA pair.

 

Alright so, the next question is: "Alright, well what would happen if I DID set up LTE in an HA pair?"

This is answered in the HA documentation, as @NolanHerring pointed out.

 

Cellular Failover Behavior
Meraki does not currently support any cellular failover with a high availability (HA) pair; as we do not perform connection monitoring on cellular uplinks (as of MX 10.X+), which is necessary for HA uplink failover. At this time, if a cellular uplink is used in an HA pair, the following will occur in order:

 

Primary MX WAN 1+2 fails > fails over to Secondary MX
Secondary MX WAN 1+2 fails > fails over to Primary MX Cellular
Primary MX cellular fails > fails over to Secondary MX Cellular
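
To make the order concrete, here is a toy Python model of the sequence quoted above. It only encodes the documented order. The names `FAILOVER_ORDER` and `active_path` are hypothetical, and this is neither how the MX implements failover nor a supported design.

```python
# Order in which uplinks end up being used, per the quoted documentation text.
FAILOVER_ORDER = [
    ("Primary MX", "WAN 1+2"),
    ("Secondary MX", "WAN 1+2"),
    ("Primary MX", "Cellular"),
    ("Secondary MX", "Cellular"),
]


def active_path(failed):
    """Return the first (mx, uplink) pair that has not failed, or None."""
    for path in FAILOVER_ORDER:
        if path not in failed:
            return path
    return None
```

For example, passing a set containing both wired entries returns the Primary MX cellular uplink, which matches the second step in the list.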

 

What does that actually look like? Here's a diagram to clarify:

 

MXC-HA-LTE-Failover_Behavior-Not_Recommended.png

 

This diagram shows what happens when you use LTE in an HA pair. Note the RED TEXT at the top saying that this isn't officially supported or recommended. This is just to clarify current device behavior.

 

Cheers.

Cameron Moody | Documentation Manager, Cisco Meraki


@CameronMoody Great post! Thank you very much for the clarification.

 

@CameronMoody wrote:

 

The design on willette.works is not an official Meraki design and is not recommended.

 

 


I'm also very glad to see that. I'm not a fan of that configuration at all. 

Thank you very much for the response Cameron !

Nolan Herring | nolanwifi.com

@CameronMoody

Thanks for the clarification. That makes much more sense.

Many thanks to @CameronMoody for the clarification;  FYI he's also arranged a tweak to some of our internal information, which led to my overly simplistic (OK, OK;  inaccurate) comment regarding VRRP.

Kirk
Conversationalist

@CameronMoody 

 

Is there a plan to eventually support a cellular HA pair? We currently have some Cisco 2921 routers that have an LTE failover setup. Was hoping this would come to our meraki devices as well. I would be willing to test some beta firmware if/when it becomes available.

 

Thanks for the informative thread!

PhilipDAth
Kind of a big deal

You can use cellular in an HA pair - just the cellular links are the last to be used.

 

Note you can't do "HA NAT" as the cellular interfaces cannot share an IP address (the same as your 2900s), but it will still fail over. Any in-progress TCP sessions will be lost and will need to be restarted.

What @PhilipDAth said.

 

Scroll down and you'll see real world fail over with cellular being used.

 

https://nolanwifi.com/2018/10/25/you-down-with-l-t-e-yeah-you-know-me-raki/

 

 

Nolan Herring | nolanwifi.com

@NolanHerring That's an awesome blog page. Super kudos.

 

We did fix the LED lights in the install guide, good catch.

Cameron Moody | Documentation Manager, Cisco Meraki

@Kirk Regarding VRRP over LTE, I don't have that information, sorry!

 

Nolan has some clever suggestions for workarounds for now though.

Cameron Moody | Documentation Manager, Cisco Meraki

I found the gnome!!!

CarolineS
Community Manager

YEAH! Nice work, @jdsilva! I think my hint was a bit too helpful... now I'll return MV gnome to my bag and fasten it tightly!

Caroline S | Community Manager, Cisco Meraki
jdsilva
Kind of a big deal

@CarolineS I actually went the wrong direction again at first. But that road didn't go very far, so I reread the post and got on the right track. After the last one @Adoos found I figured out what you were doing 😉

BrandonS
Kind of a big deal

Doh! Just moments too late for the gnome..

Nice try, @BrandonS! I bet MV will get restless tomorrow at the office too. Just a hunch.

Caroline S | Community Manager, Cisco Meraki


Does LTE failover only work when the MX WAN1 and WAN2 physical links are down?

I am seeing odd instances where the WAN links are still up and can ping their gateway (the cable modem or ISP circuit router, which is still online), but the Internet peering next hop is down. In that case it does not fail over to LTE.

What steps does the MX try before it determines this? Does it do some sort of SLA?

I tested many scenarios, and the best way I found to get this to work is to have a designated VLAN set on the MX for the heartbeat, for example VLAN 1111 with IP 1.1.1.1/30.

Then connect a cable between the MXs, for example MX1-e3 to MX2-e3, and set it as a trunk with native VLAN 1111, allowing all VLANs (do not prune or pick any other VLANs specified on your MX).

Then make your connections from the MXs to the south-facing switches. Also, if you are using non-Meraki switches such as Cisco Catalyst or Arista, I set the spanning-tree cost for the primary port to 100 and for the second link to the MX to 200.

 

MX1-E4 —> SW1-E4 

MX1-E5 —> SW2-E4

MX2-E4 —> SW1-E5

MX2-E5 —> SW2-E5 

 

jdsilva
Kind of a big deal


@LuigiJuve wrote:

I tested many scenarios but the best to get this to work is to have a designated vlan set on the MX for the heartbeat for example vlan 1111 ip 1.1.1.1/30.

 


Can you please show me exactly how you designate a VLAN as heartbeat?

 

 

like this

 

HA-1.png, HA-2.png, HA-3.png

jdsilva
Kind of a big deal

Hi @LuigiJuve ,

 

Nowhere in there did you designate that VLAN as heartbeat. You simply created a VLAN and assigned it a subnet. Can you please clarify what makes this VLAN a heartbeat VLAN, different from all the other VLANs?

How do you control which VLAN is used for heartbeats? 

You can't designate. Heartbeats for VRRP are sent out across all VLANs by default, and there is no way to configure them to do otherwise. At this point that designated VLAN is probably redundant logically, maybe not physically if you separated it out (not sure if that is worth it either).

Nolan Herring | nolanwifi.com

I think @jdsilva understands this entirely, he's just trying to get @LuigiJuve to arrive at that answer on his own.... 😃 

jdsilva
Kind of a big deal


@nickydd9 wrote:

I think @jdsilva understands this entirely, he's just trying to get @LuigiJuve to arrive at that answer on his own.... 😃 


Shhh! 😉

Like I said I use a direct link between the two MX units.

I specified a dedicated VLAN for only that link and set it as the native VLAN, trunking all VLANs, to be used as the heartbeat. Maybe it only works logically, so be it, but it has worked flawlessly for over 15 setups since I did it this way alongside the LAN-based links, including when one of them goes down, and it helps prevent the Master-to-Master HA issues.

Try it. If you don't like it, so be it; figure it out another way.

jdsilva
Kind of a big deal

Hi @LuigiJuve,

 

I'm not saying it won't work. What I'm saying is that it's not possible to configure a "heartbeat VLAN" as described in that document. It's a completely false pretense to think that this supposedly magical VLAN is any different from any other VLAN you have configured. As @NolanHerring pointed out, VRRP hellos (the "heartbeats") are sent out on every VLAN. Further, Meraki's implementation of VRRP has elected to use a single Virtual Router with multiple Virtual IP addresses as opposed to a Virtual Router for each VLAN, meaning that as long as a VRRP heartbeat makes it to the standby unit on ANY VLAN, the current active master will remain the master for all VLANs. It's not possible for only one VLAN to fail over. This is truly an Active/Standby-only implementation.

 

So please forgive me if this sounds harsh as it's not meant to, but your 15 installations are not working because you followed that guide, and they also are not working the way you think they are working.

 

This is actually my objection to this configuration, and Aaron Willette's design. It's based on a falsity, and that falsity has been propagated to many corners of the Internet. Quite often people chime in here in the Community and reference that design, and nearly every time the person making the reference has no idea that it's simply not possible to actually designate a VLAN as a "Heartbeat VLAN".

 

My preferred way to do Meraki Warm Spare, which has come from my own experience deploying Warm Spare for numerous customers, and through the very insightful advice of @PhilipDAth, is to NEVER EVER directly connect the two MX appliances together, and instead singly connect each MX to a single switch or switch stack (the latter preferred for redundancy). See, having the MXs directly connected can actually cause issues with Spanning Tree convergence under certain failures. Further, it's also desirable to have the VRRP heartbeats traverse a path more representative of the path that user data actually takes. By creating a shortcut that heartbeats can take, but user traffic cannot, you are actually exposing yourself to a failure scenario where user traffic is blackholed, but VRRP is operating just fine and not allowing a failover to occur.

 

Failovers are a good thing! You want failover to occur when there's a failure. A direct link cheats the system and can actually prevent a failover when you actually want one.
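
Here is a minimal Python sketch of that failure scenario. It is a toy reachability model, not Meraki's logic. The node names (`MX-active`, `MX-spare`, `switch`, `clients`) and the function names are made up, and it assumes the active MX can still reach the cloud so that it keeps sending heartbeats. Links in `heartbeat_only_links` stand for a direct MX-to-MX cable on a VLAN that user traffic cannot use.

```python
from collections import deque


def reachable(links, src, dst):
    """Breadth-first search over undirected (a, b) links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def outcome(shared_links, heartbeat_only_links=()):
    """Describe the result of a failure, given the links that survive."""
    all_links = list(shared_links) + list(heartbeat_only_links)
    heartbeats_arrive = reachable(all_links, "MX-active", "MX-spare")
    clients_ok = reachable(shared_links, "clients", "MX-active")
    if clients_ok:
        return "no failover needed"
    if heartbeats_arrive:
        return "blackhole: spare stays passive"
    return "spare takes over"


# The link from the active MX to the switch has failed.
after_failure = [("MX-spare", "switch"), ("clients", "switch")]
direct_cable = [("MX-active", "MX-spare")]
```

With only the switch path, `outcome(after_failure)` reports that the spare takes over. Adding the direct cable, `outcome(after_failure, direct_cable)` reports the blackhole case, because heartbeats still arrive over a path the clients cannot use.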

 

I'll close this by pointing out that Meraki has actually changed their recommended topology for Warm Spare by removing the direct connection between MX's. It took me some time to convince them to do it, but they did come around 🙂

 

https://documentation.meraki.com/MX/Deployment_Guides/MX_Warm_Spare_-_High_Availability_Pair#Recomme...

 

 

jdsilva, so what do you mean my 15 installations are not working please explain..

 

In my setups I am not using Meraki switches but either Arista or Cisco Nexus not using Stacking but using MLAG or VTP.

 

Sure, I get it: you do not need to set a VLAN to determine the heartbeat because VRRP is used on all VLANs. But what if you connect the MX pair to the south-facing switches with only certain VLANs, and not the one VLAN you set for the heartbeat, as in my example?

 

I know Meraki removed the peer link between MX HA pair but no one can explain to me why this is bad and why the failover will not occur when it happens..

 

You seem to know all the answers so explain it.

>My preferred way to do Meraki Warm Spare, which has come from my own experience deploying Warm Spare for numerous customers, and through the very insightful advice of @PhilipDAth, is to NEVER EVER directly connect the two MX appliances together, and in stead singly connect each MX to a single switch or switch stack (the latter preferred for redundancy).

 

Not saying it won't work another way, but this approach produces a rock-solid solution.  When you dual connect them (or run a connection between them) you are far more likely to have outages because of the extra redundancy than you would have had otherwise with the simpler design.  Typical issues are inappropriate spanning-tree port blocking, and short-term spanning-tree loops.

nikkydd9.. what exactly are you trying to say here...

 

 

RyanB
Meraki Employee

Nothing to see here.. haha.

jdsilva
Kind of a big deal

@RyanB I dispute its greatness. See my comments above 🙂

RyanB
Meraki Employee

I'll see myself out! Great discussion.

jdsilva
Kind of a big deal

@RyanB Not at all!  Join the conversation and bring forward ideas!  If my assessment of that article is incorrect please by all means call me out on it 🙂

 

 

typeraj
Here to help

Sorry to dig up this old thread, but hoping someone can clarify something for me.

 

I'm looking to replicate the 'Fully Redundant (Multiple Switches)' setup from the HA documentation using dual WAN links, two MX100s and two MS120s. However, the documentation doesn't really detail how WAN1 and WAN2 connect to both MXs.

 

From what I understand the willette.works design is not supported because it recommends a direct connection between the MXs for the VRRP heartbeats. However, earlier in the post, he talks about splitting the two WAN links using a breakout switch so that both MXs have a connection to both WAN links. Is this a good way of doing it? 

 

 

Newbie here - go easy please 🙂

PhilipDAth
Kind of a big deal

Typically you get your ISP to provide at least two ports in the same VLAN, and you plug WAN1 from both MX into that device.  Ditto for the second ISP.

 

You can also use an external switch for this (plug in ISP and WAN1 on both MX).  Repeat for WAN2.

 

You can also use an internal switch.  Create a VLAN and put three ports into it in access mode.  Plug in ISP1 and the WAN1 ports of both MXs.  Repeat for WAN2.
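
As a sketch only, a breakout port plan like this can be written down and checked in Python before touching the switch. The VLAN IDs (901, 902), port numbers and device labels below are hypothetical placeholders, and `breakout_problems` is a made-up helper, not a Meraki tool.

```python
# Hypothetical plan: switch port -> (access VLAN, what is plugged in).
PORT_PLAN = {
    1: (901, "ISP1 handoff"),
    2: (901, "MX-A WAN1"),
    3: (901, "MX-B WAN1"),
    4: (902, "ISP2 handoff"),
    5: (902, "MX-A WAN2"),
    6: (902, "MX-B WAN2"),
}


def breakout_problems(plan):
    """Return human-readable problems found in a WAN breakout plan."""
    by_vlan = {}
    for _port, (vlan, device) in sorted(plan.items()):
        by_vlan.setdefault(vlan, []).append(device)

    problems = []
    for vlan, devices in by_vlan.items():
        # Each breakout VLAN should hold the ISP handoff plus one port per MX.
        if len(devices) != 3:
            problems.append(
                f"VLAN {vlan}: expected 3 access ports, found {len(devices)}"
            )
        isp_count = sum("ISP" in d for d in devices)
        if isp_count != 1:
            problems.append(
                f"VLAN {vlan}: expected exactly 1 ISP handoff, found {isp_count}"
            )
    return problems
```

An empty list from `breakout_problems(PORT_PLAN)` only means the plan is internally consistent. It does not address whether the switch itself becomes a single point of failure.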

Unfortunately, neither ISP offers a second handoff but that would have been my preference. I'm hesitant to use an internal switch since ports are at a premium here. I have a spare MS220-8, which I think I can use as the external switch to do what you said. Thanks @PhilipDAth.

Just keep in mind that while you can create two VLANs and assign 3 ports to each to make this work, by doing so you are leaving the MS220-8 as a single point of failure, which somewhat defeats the purpose of using a second MX, dual ISPs, etc.  It might be better to use the MS220 for one ISP and use its other ports as edge ports, then grab 3 ports on another switch for the other ISP. 

Sorry to re-hash an old thread, but on the topic of WAN breakout switches, I have always used a Cisco 8-port switch to break out an ISP that can only provide a single handoff. This switch is a simple configuration of a VLAN and we don't manage the switch any further and just deploy it. If it breaks we ship a new one with the same config. 

 

If I was to begin using Meraki 8-port switches for WAN breakout is it essentially the same concept? My concern is that having an MS120-8 upstream of my MX's will not work because they will not be able to communicate with dashboard. In addition to cabling the switch into my modem and both MX-A WAN 1 and MX-B WAN 1, would I also patch it in to the MX's or downstream switching on a trunk so the switch can be managed and communicate with dashboard?

 

I understand I can just use spare switchports on the downstream switching to breakout my ISP connection instead of using a separate WAN breakout switch, but again my question is around a WAN breakout switch upstream of the MX's. 

PhilipDAth
Kind of a big deal

Here is a good explanation:

https://documentation.meraki.com/MX/Networks_and_Routing/NAT_HA_Failover_Behavior#VRRP_Mechanics_for... 

 

My understanding is if VRRP fails on any VLAN the entire unit fails over.  Having a dedicated link between the units will have no impact if a common switch (on another port in another VLAN) fails.

Typically, you want the failover to happen.  No point the master continuing on with a VRRP failure on any specific VLAN.

thank you..

 

I was using this guide when setting these up. https://www.willette.works/mx-warm-spare/

 

not all my setups have this dedicated VLAN approach, but most do. 

I now understand it is kind of pointless and will remove this for future deployments.

 

As for the direct connections between MX units I will test more but it seems to work well and always worked well even with the LAN based failover recommended by Meraki..

jdsilva
Kind of a big deal

@LuigiJuve wrote:

jdsilva, so what do you mean my 15 installations are not working please explain..

Sorry, I think I didn't make that statement very well. I was trying to say that yes, your deployments are probably working just fine under normal network conditions. But, based on your comments around the heartbeat VLAN, I do not believe they are working the way you think they are working. If you believe that you have a dedicated VLAN that's being used purely for heartbeats then you are operating under an incorrect assumption, and you don't fully understand how your network is operating. Again, I'm not trying to be harsh, I'm simply trying to point out an incorrect assumption and help you understand the way Warm Spare actually works 🙂

 

 

 


@PhilipDAth wrote:

 

My understanding is if VRRP fails on any VLAN the entire unit fails over. 


I don't think this is true, and the way I read the doc you linked to doesn't support that. 

 

image.png

Because the MX uses a single Virtual Router with multiple Virtual IPs it can't actually detect a failure of a single VLAN. The heartbeat sent on each VLAN is identical to the heartbeat sent on every other VLAN. The only thing that the MX VRRP process is able to detect is whether it received a heartbeat at all within the dead time. As long as any heartbeat reaches the standby it will not assume the Master role. It must have a complete and total loss of all heartbeats on all VLANs to make the transition.
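
A minimal Python sketch of that rule, as I understand it from the description above: the spare takes over only when no VLAN has delivered a heartbeat within the dead time. The function name, the timestamps and the `dead_time` value are all placeholders for illustration, not Meraki's actual timers or code.

```python
def spare_should_take_over(last_heartbeat_by_vlan, now, dead_time):
    """True only if no VLAN delivered a heartbeat within dead_time seconds.

    last_heartbeat_by_vlan maps a VLAN ID to the time (in seconds) the
    spare last saw a heartbeat on that VLAN. An empty mapping means no
    heartbeat was ever seen, so the spare takes over.
    """
    return all(
        now - last_seen > dead_time
        for last_seen in last_heartbeat_by_vlan.values()
    )


# Heartbeats on VLAN 10 stopped long ago, but VLAN 20 is still fresh.
one_vlan_alive = {10: 0.0, 20: 9.0}
# Heartbeats stopped on every VLAN.
all_vlans_silent = {10: 0.0, 20: 0.0}
```

With `now=10.0` and a placeholder `dead_time=3.0`, the first case keeps the spare passive because one VLAN is still delivering heartbeats, while the second case triggers the transition.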