vMX, Azure and route tables

Solved
JonnyM
Getting noticed

I have a vMX deployed in its own subnet in Azure. The vMX has IP 10.6.250.12/29, the Azure resources it needs to reach are in 10.6.250.128/25, and VPN-connected users are allocated an address in 10.6.249.0/24.

 

The vMX is configured as a concentrator because I can't have it NATing everything out of its single interface: if every client has the same source IP, it becomes difficult to isolate captures from individual clients. NAT also papers over missing routes, and this is going to be a very stable setup once it's built.

 

I have created a route table with a single route for my VPN clients: 10.6.249.0/24 with a next-hop IP of 10.6.250.12. I associated this route table with the subnet that contains my resources. I also made sure this subnet was added in the vMX as a local network in the site-to-site VPN settings, and that it was a route my VPN clients knew about.
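For anyone following along, here is a minimal Python sketch of how the resource subnet would be expected to pick a next hop for return traffic, using longest-prefix match over its effective routes. The VNet address space (10.6.250.0/24) and the two sample addresses are assumptions for illustration only; the thread does not state them.

```python
from ipaddress import ip_address, ip_network

# Effective routes as the resource subnet might see them.
# ASSUMPTION: the VNet address space is 10.6.250.0/24 (not stated in the thread).
ROUTES = [
    (ip_network("10.6.250.0/24"), "VnetLocal", None),                  # system route (assumed prefix)
    (ip_network("0.0.0.0/0"), "Internet", None),                       # system default route
    (ip_network("10.6.249.0/24"), "VirtualAppliance", "10.6.250.12"),  # the user-defined route
]

def next_hop(destination: str):
    """Return (hop_type, hop_ip) for the most specific route matching destination."""
    dest = ip_address(destination)
    matches = [route for route in ROUTES if dest in route[0]]
    _prefix, hop_type, hop_ip = max(matches, key=lambda route: route[0].prefixlen)
    return hop_type, hop_ip

# Hypothetical addresses: one VPN client, one Azure resource.
print(next_hop("10.6.249.25"))
print(next_hop("10.6.250.130"))
```

In this model, traffic to the VPN client range is handed to the vMX at 10.6.250.12, which is what the route table is meant to achieve. It says nothing about whether an NSG later allows the packet.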

 

The VPN connects successfully, but I can't ping anything in 10.6.250.128/25. A packet capture on the vMX shows the packets leaving, but nothing coming back. Pinging an address in the VPN subnet from the resources in Azure results in nothing showing up in the packet capture. The only thing that brings everything to life is also associating the route table with the subnet the vMX is in, which is something the docs explicitly say not to do.

 

Unfortunately I have no hard proof, but I am convinced something has changed in Azure here. I deployed a virtual SonicWall a month ago and had to do the same thing (add the route to the same subnet as the firewall interface). Previously, I am sure you only needed to tell the subnet where the traffic originates what the next-hop address was, rather than needing to 'walk' the packets through the subnet your appliance lives in. It's possible I am skipping a step and forgetting something, but my deployment is quite simple and looks the same as the instructions. The Azure docs also say that adding the route to the one subnet hosting the resources is enough, and yet I can run a ping and make it either time out or work depending on whether this route table has also been associated with the vMX subnet.

 

Has anybody seen this?

4 Replies
KH
Meraki Employee

Hey @JonnyM 

Out of curiosity, what is the next-hop type for your original route? Can you confirm if this is configured as VirtualAppliance? This can help to find the route:
https://learn.microsoft.com/en-us/azure/network-watcher/diagnose-vm-network-routing-problem#view-det...

JonnyM
Getting noticed

The route is a virtual appliance route. You can see on the packet capture from the MX exactly when I drop the route table off the subnet the vMX is in.

 

Effective routes and next-hop diagnostics in the Azure portal all look as they should, but traffic doesn't route unless the vMX subnet also has the route table.

 

[Screenshots attached: Screenshot 2024-08-16 153645.png, Screenshot 2024-08-16 153724.png, Screenshot 2024-08-16 153923.png]

 

JonnyM
Getting noticed

I've just done a simple test by creating a new resource group and a new vnet. The vnet is 10.0.0.0/24, with a router subnet at 10.0.0.0/29 and a client subnet at 10.0.0.8/29. Two Ubuntu LTS VMs have been deployed: router at 10.0.0.4 and client at 10.0.0.12. I have enabled IP forwarding on the router's NIC, created a route table with a single route (192.168.0.0/24 with a next-hop address of 10.0.0.4), and attached it to the client subnet.
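As a quick sanity check of that address plan, here is a small Python sketch. It relies on the fact that Azure reserves the first four addresses of every subnet, so the first address a VM can receive is the network address plus 4. The helper name `first_assignable` is made up for this illustration.

```python
from ipaddress import ip_network

VNET = ip_network("10.0.0.0/24")
REMOTE = ip_network("192.168.0.0/24")  # prefix used in the custom route

def first_assignable(subnet: str) -> str:
    """First address Azure can assign: network address + 4 (the first four are reserved)."""
    return str(ip_network(subnet).network_address + 4)

for name, prefix in [("router", "10.0.0.0/29"), ("client", "10.0.0.8/29")]:
    # Each subnet must sit inside the vnet's address space.
    assert ip_network(prefix).subnet_of(VNET)
    print(name, first_assignable(prefix))

# The routed prefix lies outside the vnet, so it only exists via the custom route.
print("remote overlaps vnet:", REMOTE.overlaps(VNET))
```

The router and client addresses in the test match the first assignable address in each /29, and 192.168.0.0/24 is not part of the vnet's own address space.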

 

Running tcpdump on router and pinging an address in the 192.168.0.0/24 subnet from client shows no packets arriving. Pinging 10.0.0.4 shows packets as expected. Adding the route table to the router subnet as well results in the ICMP packets arriving at router. Removing it from the router subnet again keeps the ICMP packets arriving for a few minutes, but only if they are destined for an IP in 192.168.0.0/24 that has been pinged recently. After about 15 minutes the routing fails again.

 

I recreated the setup in US East in case my issues were a UK South anomaly, but it made no difference. My understanding is that this is not how Azure is meant to function, but I'm still convinced I'm doing something wrong, because surely this would have been noticed by now.

JonnyM
Getting noticed

This is resolved. It was an NSG issue: the VirtualNetwork service tag only includes a custom route's prefix for a NIC if that NIC is in a subnet the custom route is applied to, even when the route is applied elsewhere in the same vnet. So adding the route table to the subnet the vMX was in made things work because the destination was then included in the service tag, not because of a routing problem.
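To make that concrete, here is a toy Python model of the behaviour described above, using the addresses from my test vnet. It is only a sketch of what I observed, not a description of how Azure implements the tag. The function names and the destination 192.168.0.10 are made up for illustration. The rule being modelled is the default AllowVnetInBound rule, which requires both source and destination to fall within the VirtualNetwork tag.

```python
from ipaddress import ip_address, ip_network

VNET_SPACE = [ip_network("10.0.0.0/24")]   # test vnet address space
UDR_PREFIX = ip_network("192.168.0.0/24")  # prefix of the custom route

def virtual_network_tag(subnet_udr_prefixes):
    """Toy model of the tag as seen by one NIC: the vnet address space plus
    the prefixes of any route table associated with that NIC's own subnet."""
    return VNET_SPACE + list(subnet_udr_prefixes)

def allow_vnet_inbound(src: str, dst: str, tag) -> bool:
    """Default AllowVnetInBound: source and destination must both be in the tag."""
    def in_tag(ip: str) -> bool:
        return any(ip_address(ip) in net for net in tag)
    return in_tag(src) and in_tag(dst)

# client -> hypothetical remote address, evaluated at the router's NIC
src, dst = "10.0.0.12", "192.168.0.10"

without_udr = allow_vnet_inbound(src, dst, virtual_network_tag([]))
with_udr = allow_vnet_inbound(src, dst, virtual_network_tag([UDR_PREFIX]))
print("route table not on router subnet:", without_udr)
print("route table on router subnet:", with_udr)
```

In this model the packet falls through to the default deny rule until the route table is also associated with the router's subnet, which matches the symptom. A cleaner fix than associating the route table there is likely an explicit NSG rule that allows the remote prefix.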
