Branch MX HA Design

d2
New here


Hi, I am looking to replace 2 ISR 2900s at one of our branches with a pair of MX67s as part of an SD-WAN pilot. We'll provision a pair of MX100s in our DC. We currently have diverse carrier MPLS links at the branch, with 1 link terminating on each of the 2900s. The carrier links use /30s and we use BGP to advertise the site's routes to the carrier. The links from the current WAN routers to the LAN routers also use /30s and run EIGRP on the LAN side. As part of the pilot, we'd like to install a new business-grade internet link for offloading DIA (as well as providing the secondary SD-WAN path) and, after the pilot, decommission one of the carrier MPLS links.

 

When looking at the MX HA docs, I believe that I would need to make the following changes. I want to double check that there aren't any other things that I need to consider.

 

- Only the primary MX in the HA pair forwards production traffic, so to utilise both the MPLS and the internet link, as well as have automatic failover, I'd need to have the MPLS and internet links terminate on both the primary and redundant MXs. To achieve this, I could either ask the carrier to provide a second uplink from their NTU and bridge the 2 connections on their side (if that is even possible), or add a WAN-side Ethernet switch between the MXs and the carrier NTU to provide the L2 connectivity (an additional point of failure). Alternatively, I could cable the carrier links and the MX WAN links into the existing LAN switch and provide the connectivity that way (seems messy).

 

- Transition the existing carrier MPLS connectivity from a /30 to a /29 to accommodate the connectivity from the second MX. The new internet link would be ordered with a /29. This would ensure that both of the MXs have external connectivity to reach the cloud for mgmt and for uplink connectivity tests. I'd then configure WAN virtual IPs between the HA pair. Is an AutoVPN tunnel established from both the primary and standby MXs?

 

- Change the routing to the carrier from BGP to static routes. The MX doesn't need dynamic routing anymore, as its failover is based on MX HA and path selection is done via policy within the dashboard.

 

- I will need to transition the /30s from the WAN to the LAN routers to /29s, as the MXs use VRRP to talk on the LAN side as part of the HA heartbeat. The routing from the LAN switch would need to change to static routing, with the downstream LAN device pointing to the LAN-side VRRP address. Can I use routed mode on the MX LAN or do I need to use VLANs?

 

- What visibility do I have of the default firewall protection policy on the internet link that will terminate on the MX? Currently all internet traffic is brought back to our DC and exits via a central egress. The security is controlled by Checkpoint firewalls, so our policy is well managed and visible, and the logs from Checkpoint are quite good.

 

Any other points that I am missing?

 

Cheers

Bruce
Kind of a big deal

Hi @d2 , here's some confirmation of your points, and some considerations.

 

  • You are correct, only the Active MX forwards traffic, but the Standby MX does need to have connectivity to the Meraki cloud so that it can report in, receive firmware upgrades, etc.
  • Sometimes it's a struggle (or expensive) to get the MPLS carrier to provide that second port and additional IP addresses on their CPE device. Consider your failure scenarios, as you're only likely protecting against the failure of the MX in this case. Can you make do without the MPLS circuit temporarily if an MX fails? You'll still have connectivity to the data centre via the AutoVPN over the internet.
  • Consider your options for the internet links. Do you need an internet link with a /29, and what's the price? You may be able to get two separate internet links from two separate carriers instead. The circuits on the WAN side don't need to be in the same subnet, they don't have to have a vIP and they can be completely independent; VRRP doesn't run between the WAN ports.
  • On the LAN side of the MX, yes, they do use VRRP, but both MX appliances share a single IP address (VRRP runs at Layer 2 in this instance), so you can keep your /30 between the MX and the switches.
  • Changing the carrier routing to static is a definite. BGP on the MX is for within the SD-WAN (iBGP) and for integrating with the SD-WAN head-end data centre (eBGP). It's not intended for integrating with a carrier running BGP.
  • With regard to visibility, that depends on how you set it up. By default the MX allows all outbound internet traffic and all the return traffic, like a normal stateful firewall. But with AutoVPN/SD-WAN you can still force all traffic to your central site if you'd like. Or, if you purchase the SD-WAN Plus license, you can do application-specific breakout at the branch site and still tunnel the other traffic to the data centre. Or you can use direct internet access for all traffic at the branch, and send only internal traffic across the SD-WAN.

I'm sure others will have more comments and suggestions too. 
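To make the /30 versus /29 question concrete, here is a minimal sketch using Python's standard `ipaddress` module. The prefixes are documentation ranges, not real circuit addressing, and the address counts are assumptions read from this thread: a WAN segment with a vIP would need the carrier gateway, one address per MX and the vIP, while the LAN segment would need only the downstream router and the single shared MX address.

```python
import ipaddress

def usable_hosts(cidr: str) -> int:
    """Count the assignable host addresses in an IPv4 subnet."""
    return len(list(ipaddress.ip_network(cidr).hosts()))

def fits(cidr: str, addresses_needed: int) -> bool:
    """True if the subnet has at least the required number of host addresses."""
    return usable_hosts(cidr) >= addresses_needed

# Assumed address counts, taken from the discussion above (not from Meraki docs).
WAN_WITH_VIP = 4   # carrier gateway + MX1 + MX2 + virtual IP
LAN_SHARED_IP = 2  # downstream router + single shared MX address

# Hypothetical prefixes from the 192.0.2.0/24 documentation range.
print(fits("192.0.2.0/30", WAN_WITH_VIP))   # False: a /30 has 2 usable hosts
print(fits("192.0.2.0/29", WAN_WITH_VIP))   # True: a /29 has 6 usable hosts
print(fits("192.0.2.8/30", LAN_SHARED_IP))  # True: the LAN /30 can stay
```

If the WAN uplinks are kept independent with no vIP, each MX needs only its own address plus the carrier gateway on its circuit, so the same check passes with a /30 per circuit.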

cmr
Kind of a big deal

@d2 I'd pretty much agree with @Bruce with a few alternatives as below:

 

  1. Get a basic L2 switch, we use MX HA pairs with a pair of MPLS WANs and do this, not had one fail yet and even if one does, you have the other link (we run Auto-VPN over both, either load-balanced or with a primary and failover)
  2. AutoVPN establishes from the HA pair rather than the individual unit when using a vIP
  3. As @Bruce said
  4. You don't need VLANs, you can use single LAN mode and in that case the MPLS only needs to know about the /29 link as the LAN IP subnet will all go over the SD-WAN tunnel (if you configure it that way)
  5. The reporting is not nearly as granular, but integrations with Umbrella etc. help. There are three license tiers: Enterprise (essential services), Advanced (you'll need this for decent internet control) and SD-WAN+ (needed for splitting out internet access so that high-bandwidth flows like updates and O365 go local but traffic that is critical to control goes central).

Thank you Bruce and cmr for the detailed responses and suggestions. This pilot is being driven primarily as a cost-saving exercise to reduce WAN links and thus costs, so having an L2 switch in front with the MPLS and internet circuits on both MXs actually drives up cost (but I understand that this is the architecture needed to provide redundancy, so thank you).

 

@Bruce were your points 2 & 3 saying to do away with the L2 switches in front, terminate the MPLS and new primary internet link on MX1, and then terminate just the new second internet link on MX2? This would cover the primary MX failing, and give the site redundancy for AutoVPN and local breakout via the second internet link.

 

@cmr - we want to do local breakout as part of the pilot (again to reduce central internet costs), so it looks like we will need SD-WAN+. I'll look at that license. I noticed on the link below that app-based local breakout is coming soon. Does that mean it's currently available in beta code, or not yet released? Are you using this, or is anyone else reading this?

 

https://meraki.cisco.com/en-au/product/security-sd-wan/license/secure-sd-wan-plus-license/

Bruce
Kind of a big deal

@d2 Exactly that. In normal circumstances you have your primary internet link and your MPLS link on MX1. You run AutoVPN across them both (assuming there is internet access from the MPLS network somewhere - either directly or via the head-end) and you can use the SD-WAN capability. You have a backup internet link on MX2 so that in the event that MX1 fails you get internet access (local breakout) via MX2 and an AutoVPN tunnel is brought up to connect back to your head-end.

 

The other failure scenarios are (assuming primary internet is in WAN 1 on MX1): if the primary internet fails, all traffic goes via the MPLS link. If the MPLS link fails, all traffic goes via the primary internet. If both the primary internet and the MPLS link fail, VRRP will make MX2 active and all traffic will go via the backup internet link.
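The scenarios above can be written down as a small decision function, which is handy for checking that every combination of failures has been thought through. This is only a sketch of the behaviour as described in this thread, not of how the MX implements failover; the `SiteState` fields, link names and the `forwarding_path` function are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple

class SiteState(NamedTuple):
    """Health of each component in the design (hypothetical model)."""
    mx1_up: bool = True
    primary_internet_up: bool = True
    mpls_up: bool = True
    mx2_up: bool = True
    backup_internet_up: bool = True

def forwarding_path(s: SiteState) -> Tuple[Optional[str], Tuple[str, ...]]:
    """Return (active MX, usable uplinks) for the scenarios in this thread."""
    if s.mx1_up:
        # MX1 stays active as long as at least one of its uplinks works.
        uplinks = tuple(
            name
            for name, up in (("primary internet", s.primary_internet_up),
                             ("MPLS", s.mpls_up))
            if up
        )
        if uplinks:
            return ("MX1", uplinks)
    # MX1 is down or has lost both uplinks: MX2 takes over on its own link.
    if s.mx2_up and s.backup_internet_up:
        return ("MX2", ("backup internet",))
    return (None, ())

print(forwarding_path(SiteState()))
print(forwarding_path(SiteState(primary_internet_up=False)))
print(forwarding_path(SiteState(primary_internet_up=False, mpls_up=False)))
print(forwarding_path(SiteState(mx1_up=False)))
```

The last branch returns no path when MX2 or its backup link is also down, which is the one case this design does not cover.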

cmr
Kind of a big deal

@d2 you can have local breakout for all internet traffic on any of the licenses. It is just that if you have the Advanced license then your users are better protected (IDS etc.), and if you have SD-WAN+ you can have selective local breakout; i.e. a query to an external service that needs to come from a corporate IP can go central, while Windows updates can go local.
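As a rough illustration of that difference, the sketch below models the egress decision per application. It is not the Meraki dashboard or API; the `egress_for` function, the tier strings and the application names are hypothetical, and the only rule taken from this thread is that per-application overrides need SD-WAN+ while the other tiers apply one site-wide choice.

```python
from typing import Mapping

def egress_for(app: str, tier: str, site_default: str,
               overrides: Mapping[str, str]) -> str:
    """Pick 'local' or 'central' egress for one application.

    site_default applies to all internet traffic on any tier.
    overrides (app name -> egress) are honoured only on SD-WAN+.
    """
    if tier == "SD-WAN+":
        return overrides.get(app, site_default)
    return site_default

# Hypothetical policy: tunnel everything centrally except Windows updates.
overrides = {"windows-update": "local"}

print(egress_for("windows-update", "SD-WAN+", "central", overrides))  # local
print(egress_for("partner-api", "SD-WAN+", "central", overrides))     # central
print(egress_for("windows-update", "Advanced", "central", overrides)) # central
```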
