Hi, I'm extending a Meraki SD-WAN network into Azure. Since the customer has only a single vMX VM in the region, we have configured a non-Meraki VPN tunnel between the on-prem MX in the region and the Azure VPN gateway to serve as a backup (BU) circuit in case the vMX fails. Routing between the vMX and the Azure subnets is static, so the Azure prefixes must be configured on the vMX as local networks in AutoVPN. To avoid issues at the MX spoke sites, the Azure prefixes configured as AutoVPN local networks are more specific (two /25s for each real Azure /24), while the remote Azure IPsec prefixes for the non-Meraki tunnel on the physical MXs are configured as the real /24s. This way, in the normal situation everything works as expected: the physical MXs prefer AutoVPN for the Azure prefixes, and Azure prefers the UDR static routes for on-prem that point to the vMX private IP, so connectivity works fine.
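To make the /25 vs /24 idea concrete, here is a minimal sketch of generic longest-prefix-match selection. It is only a model, not Meraki's actual route selection code; the prefix 10.10.1.0/24, the next-hop labels and the `lookup` helper are all made up for illustration.

```python
import ipaddress

# Hypothetical Azure prefix; the real prefixes are not given in this post.
azure_prefix = ipaddress.ip_network("10.10.1.0/24")

# Two /25 halves, as configured for the vMX local networks in AutoVPN.
halves = list(azure_prefix.subnets(prefixlen_diff=1))

# Simplified route table as seen by a physical MX (labels are illustrative).
routes = {half: "AutoVPN via vMX" for half in halves}
routes[azure_prefix] = "non-Meraki IPsec via Azure VPN GW"


def lookup(table, dest):
    """Return the next hop of the most specific route that contains dest."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]


# Normal operation: the /25 learned through AutoVPN wins over the /24.
normal = lookup(routes, "10.10.1.200")

# Only if the /25s are withdrawn does the /24 backup route take over.
backup_only = {net: hop for net, hop in routes.items() if net not in halves}
failover = lookup(backup_only, "10.10.1.200")

print([str(h) for h in halves])
print("normal:  ", normal)
print("failover:", failover)
```

The second lookup is the important one for the backup design: the /24 over the non-Meraki tunnel is only ever used once the /25s are gone from the table.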
The issue appears when we test the backup connectivity. When we stop the vMX VM, the locally advertised networks in the vMX route table go from green to not installed (the "-" symbol). Good! However, the rest of the network still sees them as valid remote routes, so the traffic is blackholed. Curiously, the spokes do notice that their preferred hub (the stopped vMX) is down for other spoke networks and change those routes to point to their second hub. However, the vMX local networks remain up on all spokes, still pointing to the stopped vMX.
If I turn AutoVPN off on the vMX, all other AutoVPN devices in the network tear down the vMX local networks and the backup works (once we have deleted the route table (RT) in the Azure subnet).
IMO, it is odd that the vMX itself can mark its local networks as inactive while the rest of the network remains unaware that the vMX has stopped. I know Meraki always starts advertising the LAN prefixes of a site as soon as you bind the network to a template (even when the MX is not yet installed). In this case, however, the VM is stopped from Azure and it declares its local networks inactive, yet at the end of the day they are still present in all other MX route tables unless you remove the vMX from AutoVPN. I'm afraid we will have to explore the Azure Route Server design. The issue there is that we have already deployed a Route Server (RS) in another region that has inter-regional VNet peering with this region, and we have to ensure RS-to-RS does not modify the regions' current routing (routing in the other region is quite complex).
I've already opened a ticket to check whether there is any way to tear down the locally advertised routes automatically. Have you ever had a similar issue?
Thanks!