SD WAN, QoS and traffic shaping

Billy
Getting noticed

I am currently testing Meraki's SD-WAN capabilities, in particular its ability to select the best link based on WAN link performance.

I have 2x low-cost 50/20 Mbit WAN links at a remote site and a high-performance 100/100 Mbit fiber link at the hub site. Both sites are configured to use NAT; the hub site is configured as a Hub and the remote site as a Spoke.

On the remote site, I have defined WAN1 as the primary uplink and have enabled Load balancing.

5x VPN traffic uplink selection policies have been defined, each configured to load balance on the uplinks that are suitable for its performance class. The policies use both application-based and custom definitions. For example, for Voice I have chosen the uplink that is best for VoIP traffic, using the traffic filters Skype, SIP (Voice) and UDP from Any to Any:5060-5061.

5x performance classes have been defined, including the default voice class.

Additionally, 6x traffic shaping rules have been defined, mirroring the above uplink selection policies (same application assignments): Voice (EF) and AF41 are assigned High priority (2/7 of the total bandwidth each), AF31 and AF21 are assigned Normal priority (1/7 each), and the rest of the traffic is assigned Low. No bandwidth limits have been applied.

 

Based on my understanding of the documentation, I would expect outgoing traffic through the VPN to be marked as defined in the traffic shaping rules and, in the event of link congestion, bandwidth to be allocated as defined by the priority classes (for example, for Voice it would be 20 Mbit x 2/7 = 5.7 Mbit).
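The arithmetic behind that figure can be sketched as follows. This only illustrates the expectation described above: the 20 Mbit upstream and the 2/7 and 1/7 shares are taken from this post, while the names `UPLINK_UPSTREAM_MBIT`, `SHARES` and `share_mbit` are made up for the example and are not Meraki settings.

```python
from fractions import Fraction

# Upstream capacity of one 50/20 Mbit WAN link, as given in the post.
UPLINK_UPSTREAM_MBIT = 20

# Per-class shares as described in the post (the poster's reading of the
# documentation, not confirmed appliance behaviour).
SHARES = {
    "Voice (EF)": Fraction(2, 7),
    "AF41": Fraction(2, 7),
    "AF31": Fraction(1, 7),
    "AF21": Fraction(1, 7),
}


def share_mbit(link_mbit, share):
    """Bandwidth in Mbit that a class would get on a congested link."""
    return float(link_mbit * share)


if __name__ == "__main__":
    for name, share in SHARES.items():
        mbit = share_mbit(UPLINK_UPSTREAM_MBIT, share)
        print(f"{name}: {share} of {UPLINK_UPSTREAM_MBIT} Mbit = {mbit:.1f} Mbit")
```

Whether the same share applies independently to each of the two uplinks is exactly what the first question below asks; the sketch does not answer it.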

First question: Given that I have 2x 50/20Mbit WAN links, would that bandwidth be 2 x 5.7Mbit? What is the expected behaviour?

Second question: On the Security appliance/ VPN Status, shouldn’t I see the applied policy per connection, as defined above?

 

 

Billy
Getting noticed

Reference Topology and settings' screenshots:

[Screenshots: Uplink Selection Policy, Performance Class, Uplink Decision ICMP, Uplink Decision DNS]

 

PhilipDAth
Kind of a big deal

With regard to your second question, go to:

Security Appliance/VPN Status

You will see the flow decisions on the bottom.

 

Screenshot from 2018-01-09 12-10-32.png

If you click on an individual VPN link on that same page (anywhere except the description field), you should see something like this:

Screenshot from 2018-01-09 12-11-44.png

Hi @PhilipDAth,

 

My question is why the policy that appears to be applied doesn't match the defined one. As shown in the configuration in my previous post, both ICMP and DNS (any - any UDP/TCP 53) are configured with the uplink selection policy Load balance on uplinks that are suitable for "LOW AF21"; however, only ICMP appears to have that policy applied.

[Screenshots: ICMP, DNS]

Billy
Getting noticed

I can see in your case, @PhilipDAth that the policy for RDP is applied to the VPN connections:

RDP on VPN-Philip.jpg

While in my case it appears that it hasn't been applied for some reason:

[Screenshot: RDP on VPN]

PhilipDAth
Kind of a big deal

Is the traffic you are trying to apply it to going over AutoVPN, or over the Internet?


@PhilipDAth wrote:

Is the traffic you are trying to apply it to going over AutoVPN, or over the Internet?


AutoVPN

PhilipDAth
Kind of a big deal

You need to configure the VPN flow preferences under:

Security Appliance/Traffic Shaping/Flow Preferences/VPN Traffic

 

Like this:

Screenshot from 2018-01-09 15-53-27.png


@PhilipDAth wrote:

You need to configure the VPN flow preferences under:

Security Appliance/Traffic Shaping/Flow Preferences/VPN Traffic

 

Like this:

 


I noticed that you haven't specified a port in your traffic filter rules. If I use a generic definition such as 192.168.0.0/16:any to 192.168.0.0/16:any, it works. However, I am trying to make it work based on either a specific port (TCP/UDP:3389) or a specific protocol (Remote Desktop). In general, I have specified 5 different performance classes based on applications, protocols and expected performance, including SNMP, Citrix, Remote Desktop, DNS, etc.

For example:

[Screenshot: Uplink Selection Policy]

PhilipDAth
Kind of a big deal

You should be able to specify a port, but you will probably have to wait for any existing cached flows to age out first. Try configuring it, and then rebooting the MX to make it take effect immediately.

 

Note that modern RDP clients use UDP/3389 for their main transport, not TCP/3389. So create two rules, one matching UDP/3389 and one matching TCP/3389. This assumes that you don't have an RDS gateway; if you do, it will use TCP/443.

 

In my case, the RDP server is a dedicated RDP server, so matching the whole IP address is more straightforward.
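As a rough illustration of the two-rule suggestion, the matching logic can be sketched like this. It is not Meraki's implementation: `FilterRule`, `RDP_RULES`, `matches` and `is_rdp_flow` are hypothetical names, the 192.168.0.0/16 ranges are borrowed from earlier in the thread, and matching on the destination port is an assumption.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class FilterRule:
    protocol: str            # "tcp" or "udp"
    src_net: str             # source network in CIDR notation
    dst_net: str             # destination network in CIDR notation
    dst_port: Optional[int]  # None means "any port"


# Two rules, one per transport, as suggested above (hypothetical values).
RDP_RULES = [
    FilterRule("udp", "192.168.0.0/16", "192.168.0.0/16", 3389),
    FilterRule("tcp", "192.168.0.0/16", "192.168.0.0/16", 3389),
]


def matches(rule, protocol, src, dst, dst_port):
    """Return True if a flow matches one traffic filter rule."""
    return (
        rule.protocol == protocol.lower()
        and ip_address(src) in ip_network(rule.src_net)
        and ip_address(dst) in ip_network(rule.dst_net)
        and (rule.dst_port is None or rule.dst_port == dst_port)
    )


def is_rdp_flow(protocol, src, dst, dst_port):
    """Return True if the flow matches any of the RDP rules."""
    return any(matches(r, protocol, src, dst, dst_port) for r in RDP_RULES)
```

With only the TCP rule defined, a client that negotiates the UDP transport would fall outside the filter, which is why both rules are listed.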


@PhilipDAth wrote:

You should be able to specify a port, but you will probably have to wait for any existing cached flows to age out first. Try configuring it, and then rebooting the MX to make it take effect immediately.

 

Note that modern RDP clients use UDP/3389 for their main transport, not TCP/3389. So create two rules, one matching UDP/3389 and one matching TCP/3389. This assumes that you don't have an RDS gateway; if you do, it will use TCP/443.

 

In my case, the RDP server is a dedicated RDP server, so matching the whole IP address is more straightforward.


I have tried using both Meraki's "Remote Desktop" traffic filter and protocol/port based definition with the same result, even after rebooting the MX

 RDP port def.jpg

I have also tried splitting the rule into separate UDP and TCP definitions, and using more specific expressions such as 192.168.134.0/24 UDP/3389 to 192.168.0.0/16 any, with the same result.

PhilipDAth
Kind of a big deal

What software version are you using on the MX?


@PhilipDAth wrote:
What software version are you using on the MX?

MX 14.20

Have you tried using 13.28, the current stable release candidate?  I haven't used the 14.x series of beta code yet.


@PhilipDAth wrote:

Have you tried using 13.28, the current stable release candidate?  I haven't used the 14.x series of beta code yet.


I just installed 13.28, with the same result. The policy applied is "Fail over if uplink is down" rather than the defined one.

DCooper
Meraki Alumni (Retired)

These are questions I hope to get answered. The specifics of flows and connections, and of how the MX load balances and prefers more specific rules, need to be explained better.

 

I do know there is no packet replay or resets going on to manipulate the stream, so I'm more focused on the decision-making process in these circumstances.

 

There are also other considerations when configuring multiple datacenters or when your headend has multiple ISPs. The traffic rules would need to either match or be configured appropriately for the links. Therefore, you could have a different ISP route for egress vs ingress. We also keep all AutoVPN tunnels up and active even if "idle".

 

We are working internally on more detailed customer-facing documentation as we see more use cases like this. We appreciate the feedback and your patience while we get you answers.

 

 

DCooper
Meraki Alumni (Retired)

AutoVPN, not Internet. This is a very good question. I will be chatting with the PM team tomorrow to get a definite answer. My assumption is that, by default, it behaves just like Internet load balancing, based on flows rather than connections, until a more specific rule is applied. I'll get back to you soon.
Billy
Getting noticed

That's the network design and what I've been trying to achieve.

It would be really useful to have an option to monitor, for troubleshooting purposes, the traffic shaping rule statistics: queue depth, total drops, no-buffer drops, exceed drops, drop rate, etc.


Network Design.jpg

DCooper
Meraki Alumni (Retired)

I am still waiting to hear back from the PM team on the behavior of the SD-WAN setup you have.

 

I saw you had rules in there for EF. Keep in mind that if you're already marking DSCP on egress from the phone, there is no need to re-mark, which is what adding a rule there would do. When the MX receives the marking, it will place the traffic in the appropriate queue.
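For reference, the DSCP code points behind the class names used in this thread can be worked out as below. The values follow the standard DiffServ definitions (EF = 46, AFxy = 8x + 2y); the helper names `af_dscp` and `tos_byte` are illustrative only, and the sketch says nothing about how the MX maps markings to queues.

```python
def af_dscp(af_class, drop_precedence):
    """DSCP value of Assured Forwarding class AFxy (x = class, y = drop)."""
    return 8 * af_class + 2 * drop_precedence


# Classes named in this thread and their DSCP code points.
DSCP = {
    "EF": 46,                # Expedited Forwarding, typically voice
    "AF41": af_dscp(4, 1),   # 34
    "AF31": af_dscp(3, 1),   # 26
    "AF21": af_dscp(2, 1),   # 18
}


def tos_byte(dscp):
    """DSCP occupies the upper six bits of the IP ToS / Traffic Class byte."""
    return dscp << 2


if __name__ == "__main__":
    for name, value in DSCP.items():
        print(f"{name}: DSCP {value}, ToS byte 0x{tos_byte(value):02X}")
```

A phone that already marks its RTP traffic as EF would therefore send packets with a ToS byte of 0xB8, which is what a packet capture on the LAN side should show before any re-marking rule is involved.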

 

As for your other ask, "queue depth, total drops, no-buffer drops, exceed drops, drop rate", this is not something we are looking into providing visibility on in the near term. Our focus is on simplicity; providing this visibility with no easy way to consume the data makes the solution more complex. This information is only needed when there are issues, and when there are issues Meraki support is involved to provide resolution.

 

What the majority of our customers are asking for is "how are my vital applications performing, and tell me when they are not performing well", through the overlay or direct to the Internet. Application issues could be tied back to a multitude of problems in and out of the network. In the network, we want Meraki to own the problem and the resolution. We want to make it easier for you to prioritize your applications and be notified when they are not working as expected, which is when help desk calls start.

 

I would stay tuned and keep an eye on our announcements in the next month. We may not have exactly what you're looking for around QoS/CoS visibility; however, we may have something that enhances your ability to support the applications that are important to your business. This would be end-to-end visibility, not the per-hop metrics you're asking for.

Billy
Getting noticed

Another bug that I want to point out: in this scenario, where 2x WAN links and a cellular interface are used, if "Prefer WAN 1. Failover if poor performance for <performance class>" is defined instead of "Load balance on uplinks that are suitable for <performance class>", traffic will never fail over to the cellular interface.

DCooper
Meraki Alumni (Retired)

In the version you have access to we do not perform AutoVPN over USB-Cellular. It is working as expected.

DCooper
Meraki Alumni (Retired)

Also, there are different types of failover, so please refer to the document below, which explains how connection monitoring works. It does not describe how failover within the AutoVPN/SD-WAN overlay works, only Internet failover and failover to cellular. I'm not sure how you're testing failover, but I assume you had either a soft or a hard failure for both WAN1/2.

 

https://documentation.meraki.com/MX-Z/Firewall_and_Traffic_Shaping/Connection_Monitoring_for_WAN_Fai...

Billy
Getting noticed


@DCooper wrote:

In the version you have access to we do not perform AutoVPN over USB-Cellular. It is working as expected.


I performed a hard failure test, pulling out both the WAN1 and WAN2 cables. The MX failed over to the USB cellular interface and, with the uplink selection policies configured to use "Load balance on uplinks that are suitable for <performance class>", I was able to access the remote resources through the VPN.

DCooper
Meraki Alumni (Retired)

I knew AutoVPN over USB was coming; I just didn't realize it was in this public beta. So did it only fail over when the rules were there? I assumed it would fail over to USB on an all-or-nothing basis and not pay any attention to the load balancing or rules you have set up. I haven't tested this, but I have it in the lab, so I can reproduce what you have set up.

Billy
Getting noticed


@DCooper wrote:

I knew AutoVPN over USB was coming; I just didn't realize it was in this public beta. So did it only fail over when the rules were there? I assumed it would fail over to USB on an all-or-nothing basis and not pay any attention to the load balancing or rules you have set up. I haven't tested this, but I have it in the lab, so I can reproduce what you have set up.


It currently runs v13.28. If I define a preferred link (WAN1) in the uplink selection policies, it looks like it only tries to use WAN2 if WAN1 fails. The "Load balance on uplinks" option, though, does utilize cellular when both WAN1 and WAN2 fail.

 

Edit: The same thing applied on the earlier beta version, v14.20.
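The behaviour reported here can be summarised as a toy model. It only encodes what was observed in this thread on v13.28 and v14.20, not Meraki's actual selection algorithm; the function name, the policy labels `prefer_wan1` and `load_balance`, and the link names are made up for the illustration, and performance-class suitability is ignored.

```python
WIRED_LINKS = ("WAN1", "WAN2")


def selected_uplinks(policy, link_up):
    """Return the uplinks VPN traffic would use, per the observed behaviour.

    policy  -- "prefer_wan1" or "load_balance" (hypothetical labels)
    link_up -- dict such as {"WAN1": True, "WAN2": False, "Cellular": True}
    """
    wired_up = [link for link in WIRED_LINKS if link_up.get(link)]

    if policy == "prefer_wan1":
        # Observed: WAN2 is tried only if WAN1 fails; cellular is never used.
        return wired_up[:1]

    if policy == "load_balance":
        # Observed: both wired links are used while available, and traffic
        # moves to cellular once WAN1 and WAN2 have both failed.
        if wired_up:
            return wired_up
        return ["Cellular"] if link_up.get("Cellular") else []

    raise ValueError(f"unknown policy: {policy}")
```

For example, with both cables pulled and cellular up, the model returns an empty list for `prefer_wan1` and `["Cellular"]` for `load_balance`, which matches the hard failure test described above.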

Billy
Getting noticed

Any update from the PM team?

DCooper
Meraki Alumni (Retired)

@Billy Going to need some additional time on this one. The question is rolling up to our Eng team who builds the feature.

Billy
Getting noticed

Any updates on this one? I'm looking into deploying that design on a site in the near future and knowing beforehand how this solution is expected to perform would be quite useful.

 

Thanks
