We are experiencing a strange situation with a two-hub topology. We have one Regional Hub, and the other Hub is in another geographic location and is used only for backup (we are not injecting a default route). All the spokes are configured with the Regional Hub as primary (first on the hub priority list) and the Backup Hub in second place on the priority list.
Both Hubs are configured as a full mesh (we have BGP enabled). The problem is that traffic between sites is passing through the Backup Hub (this hub is in another location, so we are experiencing a lot of latency). When we remove the Backup Hub from the list (on both ends), the traffic starts to flow through the Regional Hub (the high-priority one).
The only difference between the Hubs is that the high-priority Hub is an MX250 and the Backup Hub is an MX450.
If the hubs both advertise the same routes, then the route via the highest priority hub is chosen.
If your backup hub advertises more specific routes then it will always be chosen.
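To make the rule above concrete, here is a minimal Python sketch of the selection logic as described in this reply: the longest matching prefix wins, and hub priority only breaks ties between routes of equal length. The prefixes, hub names and the `select_hub` helper are made up for illustration; this is not Meraki's actual implementation.

```python
import ipaddress

def select_hub(dest, routes):
    """Pick the hub for `dest` from (prefix, hub, priority) entries.

    Assumption: lower priority number = higher on the hub priority list.
    """
    dest_ip = ipaddress.ip_address(dest)
    candidates = []
    for prefix, hub, priority in routes:
        network = ipaddress.ip_network(prefix)
        if dest_ip in network:
            candidates.append((network.prefixlen, priority, hub))
    if not candidates:
        return None
    # Longest prefix first; hub priority only matters between equal-length prefixes.
    best = min(candidates, key=lambda c: (-c[0], c[1]))
    return best[2]

# Hypothetical routes: both hubs advertise the same /24.
same_routes = [
    ("10.0.28.0/24", "regional-hub", 1),
    ("10.0.28.0/24", "backup-hub", 2),
]

# Hypothetical routes: the backup hub also advertises a more specific /25.
more_specific = same_routes + [("10.0.28.0/25", "backup-hub", 2)]

print(select_hub("10.0.28.30", same_routes))    # regional-hub
print(select_hub("10.0.28.30", more_specific))  # backup-hub
```

In the second case the backup hub is chosen despite its lower priority, which is the behaviour to rule out first.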
Thanks for your response. The topology is very simple, basically, we have 2 Sites (Spokes) with 2 Hubs (in different locations, one is local to the region and the other is in another region). The Regional Hub must have a high priority and the other works as a Backup (BGP is enabled on both HUBs). On the Site-to-Site VPN menu (Spokes) we have the Regional Hub first on the Hub list and then the Backup Hub (on both sites).
I'm running the test from the MX itself, using the LAN interface (VLAN X) addresses as the source and destination IPs. When I run an ICMP test I see that the traffic goes through the Backup Hub in both directions (I know because of the latency, more than 100 ms). If I take the Backup Hub out of the hub list, the traffic automatically goes through the Regional Hub (high priority); this is easy to see because the latency drops below 5 ms. We don't have any other more specific route. I can't see why the lower-priority hub is being selected, since in the routing table the high-priority Hub appears to be the preferred next hop.
The only difference between the two Hubs is that the Backup MX is an MX450 and the main Hub is an MX250. If I use MX250s for both main and backup, I don't have any problem, but when I use the MX450 as the "Backup", the problem appears. I did another test using 3 Hubs (2 MX250s and 1 MX450): even when the MX450 has the lowest priority of the three, it is always chosen.
Hi, yes. Everything is OK until the moment I add the #2 Hub (#2 on the priority list); then the traffic starts to flow through that Hub. The strange thing is that if I use an MX250 as the #2 Hub, nothing happens, but if I configure the MX450 as the #2 Hub, the traffic switches instantly.
You say you’re using BGP to the DC core? Are the routes originated on the DC core, or being learnt by the cores? If they’re learnt, then check the AS PATH for the routes that are being sent to the Meraki concentrator. It may be that the path length on the routes is making the MX450 the preferred hub. I don’t believe there is a way to check this on the Meraki side (although support can probably see it), so you may have to look at the DC cores, or do a packet capture of the BGP advertisements.
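Once you have the AS PATHs (from the DC cores or from a capture), a comparison like the following can flag prefixes where the two hubs receive paths of different length. This is only a sketch: the AS numbers, prefixes and the `path_length_mismatches` helper are hypothetical, and whether AS PATH length actually influences hub selection on the MX is the hypothesis being tested here, not a confirmed behaviour.

```python
def path_length_mismatches(as_paths):
    """Return prefixes whose AS_PATH length differs between hubs.

    `as_paths` maps prefix -> {hub name -> list of AS numbers}.
    """
    mismatches = {}
    for prefix, per_hub in as_paths.items():
        lengths = {hub: len(path) for hub, path in per_hub.items()}
        if len(set(lengths.values())) > 1:
            mismatches[prefix] = lengths
    return mismatches

# Hypothetical AS_PATHs (private ASNs) noted for each hub's BGP session.
as_paths = {
    "10.0.28.0/24": {
        "regional-hub": [64512, 64520, 64530],
        "backup-hub": [64513, 64530],
    },
    "10.0.29.0/24": {
        "regional-hub": [64512, 64530],
        "backup-hub": [64513, 64530],
    },
}

for prefix, lengths in path_length_mismatches(as_paths).items():
    print(prefix, lengths)
```

Any prefix reported here is one where the backup hub could look more attractive for reasons unrelated to the hub priority list.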
Why is it configured with eBGP and not iBGP?
iBGP is normally used within your own network core; eBGP is generally used for ISP connectivity.
BGP on MX appliances is only meant to import the routes of a site or datacenter into SD-WAN and vice versa to have site subnets available to BGP peers in a HQ or Datacenter site.
@Neteng-ttco Do you have any local networks configured on the hub? If so, what are their subnets and masks (e.g. 10.1.0.0/16)? When you use BGP, the subnets advertised to the spokes are either learnt via the eBGP connection to the DC core or come from manual configuration on the Site-to-site VPN page. Is it possible the backup hub is learning the route with a longer mask (i.e. it’s a more specific route)?
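If you can copy the prefixes from each hub's route table, a quick check like this will list any subnet that the backup hub carries with a longer mask than the primary. It is a rough sketch using Python's standard `ipaddress` module; the prefix lists and the `more_specific_on_backup` helper are placeholders, not real data from this network.

```python
import ipaddress

def more_specific_on_backup(primary_prefixes, backup_prefixes):
    """List (backup prefix, covering primary prefix) pairs where the
    backup hub's route is strictly more specific than the primary's."""
    primary = [ipaddress.ip_network(p) for p in primary_prefixes]
    findings = []
    for b in (ipaddress.ip_network(p) for p in backup_prefixes):
        for p in primary:
            # subnet_of() is also True for identical networks, so exclude those.
            if b != p and b.subnet_of(p):
                findings.append((str(b), str(p)))
    return findings

# Placeholder prefix lists, as they might be copied from each hub's route table.
primary_hub = ["10.1.0.0/16"]
backup_hub = ["10.1.0.0/16", "10.1.28.0/24"]

print(more_specific_on_backup(primary_hub, backup_hub))
```

An empty list means the mask lengths match on both hubs for every overlapping subnet.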
If you’re sure all the subnets have the same mask length, and everything else is working correctly, then I’d open a case with support, as they may be able to see from the backend why the backup hub is being preferred.
For a subnet you are trying to reach from a spoke, can you show a screenshot of the route it is matching on each of the two hubs (Security & SD-WAN/Monitor/Route Table) as well as for the spoke concerned, please.
We have Spokes A and B and the GW; here you can see the routing tables of both Spokes and the main Hub (Chile). The networks are 10.x.28.x and 10.x.29.x. You can see on the Hub that the same networks are also learned from the USA Hub (that's OK), but with a different priority. The problem is that the traffic flows through the USA Hub (we can see that the main hub sends flows through the USA Hub).
Hub Chile (Flow ICMP - source 10.x.29.30 to 10.x.28.30)