I have a Meraki connected over an LTE network with a 5M down X 2.5M up priority subscription. I believe that the LTE network has strict policing on those limits. The customer is running an IPSec VPN behind the Meraki, with only that traffic. The customer is complaining about packet loss and high latency.
We're looking at the uplink statistics. We've configured several of the hops on the Internet path, and it looks like we're seeing consistent latency, but up to 1-5% packet loss on the uplink polling stats. That loss is coming from the first hop across the LTE network. Loading on the uplink is 1.5 to 4.5M down, and we're seeing the packet loss at both low and high utilization.
It could be that we're just having high packet loss on the LTE network, and there's nothing we can do about it. I'm wondering though if we can improve performance with traffic shaping. The NOC people have been setting traffic shaping policies, but I don't think they've been doing it correctly. I don't really understand what the Meraki is actually doing either.
I want traffic shaping, not policing. First we have the uplink configuration. Is that policing or shaping? If it's shaping, then I don't need anything else. Then we have the global bandwidth limit. If I have the WAN 1 limit set, do I need the global bandwidth limit?
If the WAN 1 and global bandwidth limits are not shaping, I presume I would set a traffic shaping rule for the IPSec traffic. The ops guys have been doing all three: 5M down, 2.5M up on both WAN1 and the global bandwidth limit. But all the traffic is IPSec, and last I looked, they had also set a rule on the IPSec traffic of 1M down, .5M up. I don't know why they would have done that. So they had a 5M limit down on WAN1, a 5M global limit down, and a 1M traffic shaping rule for IPSec, yet total traffic was getting capped at 2.5M down. I don't get that. Where did that 2.5M limit come from? It seems like I'm not understanding something.
Here's the config as it sits right now with global bandwidth set, but no traffic shaping rule:
Different systems sometimes calculate traffic shaping slightly differently (for example, some count only the payload, while others count header plus payload).
Perhaps try lowering your numbers slightly on WAN1 and see if that makes any difference.
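To get a rough feel for how much lower, here is a minimal sketch of the payload versus header+payload arithmetic. The function names and the 100-byte overhead figure are my own assumptions for illustration; real IPSec overhead depends on the cipher, tunnel mode, and whether NAT traversal is in use.

```python
def wire_rate_kbps(payload_rate_kbps, payload_bytes, overhead_bytes):
    """Rate seen by a device that counts headers as well as payload."""
    return payload_rate_kbps * (payload_bytes + overhead_bytes) / payload_bytes


def payload_limit_kbps(contracted_kbps, payload_bytes, overhead_bytes):
    """Payload-only limit that keeps header+payload under the contracted rate."""
    return contracted_kbps * payload_bytes / (payload_bytes + overhead_bytes)


if __name__ == "__main__":
    contracted = 5000   # kbps, the 5M downstream subscription
    overhead = 100      # bytes per packet, ASSUMED for illustration only
    for payload in (1400, 200):
        limit = payload_limit_kbps(contracted, payload, overhead)
        print(f"{payload}-byte payloads: shape to about {limit:.0f} kbps")
```

The smaller the packets, the larger the share taken by headers, so a limit that is safe for bulk transfers can still exceed the carrier's policer when the tunnel carries mostly small packets.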
Is there any chance you are simply generating a lot more load for short periods of time, and the excess traffic is being correctly dropped? Traffic shaping can only buffer traffic for so long before dropping it.
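As a toy illustration of that point (not a model of what the Meraki actually does internally), the sketch below drains a queue at a fixed rate per tick. The function name `simulate`, the tick-based units, and the queue sizes are all assumptions. A queue limit of 0 behaves like a policer with no burst allowance, while a larger limit behaves like a shaper with a finite buffer.

```python
def simulate(arrivals, rate_per_tick, queue_limit):
    """Return (sent, dropped) for a queue drained at rate_per_tick each tick.

    arrivals: units of traffic arriving in each tick.
    queue_limit: how much may wait in the buffer; 0 means excess is
    dropped immediately, like a policer with no burst allowance.
    """
    queued = sent = dropped = 0
    for arriving in arrivals:
        queued += arriving
        # Send up to the configured rate this tick.
        out = min(queued, rate_per_tick)
        sent += out
        queued -= out
        # Anything that does not fit in the buffer is dropped.
        if queued > queue_limit:
            dropped += queued - queue_limit
            queued = queue_limit
    return sent, dropped


if __name__ == "__main__":
    short_bursts = [10, 0, 0, 10, 0, 0]
    print("policer, short bursts:", simulate(short_bursts, 5, 0))
    print("shaper, short bursts: ", simulate(short_bursts, 5, 5))
    print("shaper, sustained:    ", simulate([10] * 4, 5, 5))
```

With short bursts the buffer absorbs the excess and nothing is lost, but under sustained overload the buffer fills and the shaper drops traffic just as a policer would.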
Also note that IPSec does not always respond well to traffic shaping. It has something called a "replay window". If packets get reordered by something (such as smaller packets being sent ahead of larger ones) and the reordering is greater than the replay window, the late packets are discarded, and the whole VPN will often tear down and then rebuild.
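For anyone unfamiliar with the replay window, here is a simplified sketch of the receiver-side check. The class name `ReplayWindow` and the tiny window size are my own choices for illustration, and real implementations use a bitmap rather than a set, but the accept/reject logic follows the same idea: a packet is rejected if it is a duplicate or if it arrives too far behind the highest sequence number seen so far.

```python
class ReplayWindow:
    """Simplified anti-replay check on ESP-style sequence numbers."""

    def __init__(self, size=64):
        self.size = size      # how far behind the newest packet is tolerated
        self.highest = 0      # highest sequence number accepted so far
        self.seen = set()     # accepted sequence numbers still inside the window

    def accept(self, seq):
        if seq > self.highest:
            # A newer packet advances the window.
            self.highest = seq
            self.seen.add(seq)
            # Forget entries that have slid out of the window.
            self.seen = {s for s in self.seen if s > self.highest - self.size}
            return True
        if seq <= self.highest - self.size:
            return False      # too old: reordered beyond the window
        if seq in self.seen:
            return False      # duplicate
        self.seen.add(seq)
        return True           # late, but still inside the window


if __name__ == "__main__":
    window = ReplayWindow(size=4)
    for seq in (1, 2, 10, 8, 8, 6, 7):
        print(seq, "accepted" if window.accept(seq) else "rejected")
```

If a shaper holds large packets in a queue while small ones go ahead, the held packets can arrive with sequence numbers that are already behind the window, which is the failure case described above.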
Also, if your cellular provider is using carrier-grade NAT and the UDP sessions are torn down after a fixed period of time, you'll see constant VPN rebuilds, each one causing traffic loss. If your cellular provider gives you public IP addresses with no firewalling, you tend to have fewer issues (this is often done by selecting a specific APN).