SD-WAN over MPLS/ASE - routing and failover concerns

MerkiWaters

We're determining the best way to implement default route and SD-WAN route failover in our network.  I've attached an image of our WAN topology.

 

SD-WAN generalized.png

 

Right now, an AutoVPN tunnel is formed between HQ and the Datacenter over the ASE connection.  The ASE connection is connected to a WAN interface on the HQ MX.  At the Datacenter, the ASE is connected to an MS350 switch pair that has an SVI for the VLAN that traverses the link.

 

We had an outage in which the Datacenter lost Internet connectivity.  The HQ MX failed over to its other WAN port, but the only Hub specified was the Datacenter Hub, and it was set as the IPv4 default route.  Return traffic is directed via a static route on the Datacenter core switch to the SD-WAN Hub there.  The DC Edge firewall, SD-WAN Hub and Core switch were all unable to reach the Internet and the Meraki Registry.  We had no Internet connectivity at HQ via the failover circuit until we unchecked the IPv4 default route box in Site-to-Site VPN there.

 

  1. To make failover automatic, should the Azure vMX Hub be set as a secondary Hub for HQ, with IPv4 default route disabled for it?
  2. Would it be advisable to remove the SD-WAN tunnel over the ASE between HQ and the Datacenter and simply use the ASE as the L2 link it's intended to be?
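
To make the failure mode concrete, here is a toy model in Python of how I understand the behavior we saw. It is only a sketch under my own assumptions, not Meraki's actual route selection logic: the `Hub` class, the `internet_egress` function and the rule "first hub with its tunnel up and the default route box ticked wins" are all hypothetical, and I am assuming the tunnel to the Datacenter Hub still counted as up during the outage.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hub:
    name: str
    tunnel_up: bool      # assumption: the spoke still considers the tunnel usable
    has_internet: bool   # the hub site can actually reach the Internet
    default_route: bool  # "IPv4 default route" box ticked for this hub


def internet_egress(hubs: List[Hub], local_wan_has_internet: bool) -> Tuple[str, bool]:
    """Return (egress point, Internet reachable?) for Internet-bound spoke traffic.

    Hubs are checked in priority order. The first hub that has the default
    route enabled and a live tunnel takes all Internet-bound traffic.
    If no hub qualifies, traffic breaks out locally on the spoke's WAN.
    """
    for hub in hubs:
        if hub.default_route and hub.tunnel_up:
            return hub.name, hub.has_internet
    return "local breakout", local_wan_has_internet


# The outage: DC Hub is the only hub and carries the default route.
outage = [Hub("DC", tunnel_up=True, has_internet=False, default_route=True)]

# The manual workaround: default route box unchecked for the DC Hub.
workaround = [Hub("DC", tunnel_up=True, has_internet=False, default_route=False)]

# Question 1: add the Azure vMX as a secondary hub without the default route.
with_azure = [
    Hub("DC", tunnel_up=True, has_internet=False, default_route=True),
    Hub("Azure vMX", tunnel_up=True, has_internet=True, default_route=False),
]

print(internet_egress(outage, local_wan_has_internet=True))
print(internet_egress(workaround, local_wan_has_internet=True))
print(internet_egress(with_azure, local_wan_has_internet=True))
```

In this model the outage case sends Internet traffic to the DC Hub, where it dies, and only the workaround case falls back to local breakout. The third case behaves like the outage, which is part of why I'm unsure whether a secondary Hub without the default route would help on its own.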

 

 

Here are some notes regarding the sites and equipment:

 

Primary sites:

HQ site
-Connected to the Datacenter site via a private circuit (ASE)
-Secondary public internet connection available for failover
-SD-WAN spoke utilizing private circuit as primary uplink

Datacenter site
-Hosts many VMs; HQ reaches the Internet through this site
-One MX pair is the edge firewall and another pair is a one-armed VPN concentrator
-Edge firewall pair provides Client VPN

Azure tenant
-Hosts SQL databases and some VMs

Small branch sites
-Connect to Datacenter over SD-WAN

 

 

From Upstream to Downstream this is our topology:

 

Datacenter site

Edge MX250 pair:
-WAN 1 is the primary uplink, providing inbound/outbound Internet
-Uses two VLANs, management and DMZ
-Not participating in SD-WAN

Core switch MS350 stack pair:
-Static DFG is the Edge MX250 LAN
-SVI provides the default gateway for HQ's primary uplink and L2 connectivity for both the public Internet circuit and the private circuit back to HQ

One-armed VPN Concentrator MX250 pair:
-All RFC 1918 space is statically routed here from the Core switch
-Hub for all sites
-Hub is also the IPv4 DFG for HQ

Private circuit (ASE) connects HQ site via L2…

HQ site

HQ Core Switch Cisco Catalyst:
-L2 connectivity for the private circuit and for the secondary public circuit
-L3 for HQ site
-Transit VLAN to the site's MX firewall LAN

MX250 Firewall pair:
-SD-WAN spoke
-Datacenter's Hub is specified as the only Hub and is the IPv4 DFG
-Private circuit to DC is the primary uplink
-Public circuit is secondary (1/5th the bandwidth of the private circuit)

GIdenJoe

From what I can gather from your drawing and your detailed explanation, your offices and branches don't use direct Internet access but are fully tunneled to the one-armed concentrator HA pair at the DC, and exit to the Internet there?

The first issue I see with that design is that the branch office only has an Internet circuit to reach the datacenter, so if the datacenter has an Internet outage, that branch has no connectivity.
The main office would still be able to use the AutoVPN tunnel to the DC over the private link, but traffic would stop there because the Internet has failed at the DC.
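
To illustrate that dependency, here is a rough Python sketch that treats each tunnel or exit as a link riding on one underlay circuit and checks what is still reachable when a circuit fails. The node names, the circuit labels (`dc_isp`, `ase`) and the `reachable` helper are made up for the illustration, and the model is simplified: it ignores the branch's own ISP and any HA behavior.

```python
from collections import deque

# (node_a, node_b, underlay circuit the link depends on)
LINKS = [
    ("branch", "dc_hub", "dc_isp"),    # branch AutoVPN tunnel rides the DC's Internet circuit
    ("hq", "dc_hub", "ase"),           # HQ AutoVPN tunnel rides the private circuit
    ("dc_hub", "internet", "dc_isp"),  # full-tunnel traffic exits via the DC's Internet circuit
]


def reachable(links, start, goal, failed=frozenset()):
    """Breadth-first search over the links whose underlay circuit has not failed."""
    adjacency = {}
    for a, b, circuit in links:
        if circuit in failed:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


dc_internet_down = {"dc_isp"}
print(reachable(LINKS, "branch", "dc_hub", dc_internet_down))  # False
print(reachable(LINKS, "hq", "dc_hub", dc_internet_down))      # True
print(reachable(LINKS, "hq", "internet", dc_internet_down))    # False
```

With the DC's Internet circuit marked as failed, the branch loses the hub entirely, while HQ still reaches the hub over the private link but has no path to the Internet from there.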

Then for one of your questions:
You cannot use an Azure vMX to break out to the internet from remote locations.  Azure cloud only allows internet access for actual resources hosted there.

As for design improvements:
- Normally you have a two-DC setup where each DC has a single MX concentrator or an HA pair. With active/active DCs you'll need an L2 circuit in between so hosts can move from one DC to the other while keeping their own IP address.  You can also have some branches select one DC while the others choose the other DC.  Alternatively, you could use VXLAN between the DCs over an L3 circuit.
- If you stay with one DC, you will need two upstream ISP connections.  Since a concentrator-mode MX has only one local IP and one WAN connection, the upstream infrastructure needs to provide the ISP redundancy, preferably with a BGP dual-homed IP range so the tunnels recover quickly when the primary ISP circuit fails.  Two circuits with different IPs will also work, but the tunnels will need some time to recover, since Internet outage detection can take a few minutes and the VPN registries need to adapt.  Your HQ should be less affected, since its primary AutoVPN tunnel runs over the private link and only has to wait for the Internet-down detection, but you will still feel it.
