Multiple hub sites. Question regarding ability for hubs to communicate amongst each other

ronnieshih75
Building a reputation

Hello, we recently brought up additional MX450s at a second and third hub site. I opened a case with Meraki support a while back because I had no idea that the LAN-side networks must be unique for the hub option to even be enabled. A tech suggested that we disable the ability for hubs to communicate with each other. However, that feature turned out to have nothing to do with being able to enable the new hubs. We then created two more unique backend /29 transit networks for the two new MX450s and enabled the two new hubs.

 

My first question is:  what does this ability for hubs to communicate amongst each other do?

 

Second question: I have a strange routing issue where spokes still need to talk to the original hub. In the past I apparently defined static routes for specific destinations on that original hub, and it is the hub that endpoints go through to reach those destinations. However, I have summary static routes defined on the two new hubs, and spokes are not taking the path through the new hubs for those specific destinations. The minute I disable those specific routes, endpoints don't fall back to the summary route to reach those destinations. Should I be speaking to our hosted data center backend team to see what's up with this?

 

thanks

alemabrahao
Kind of a big deal

According to the documentation, all hubs establish VPN tunnels among themselves by default; this is how Meraki works.

Have you confirmed that these summarized routes are configured to advertise on the VPN?

I am not a Cisco Meraki employee. My suggestions are based on documentation of Meraki best practices and day-to-day experience.

Please, if this post was useful, leave your kudos and mark it as solved.
ronnieshih75
Building a reputation

I have a static route, 10.0.0.0/9, as the summary route on all three hubs, and it's advertised through the VPN.

I have another static route, 10.130.0.0/16, also advertised through the VPN, but only on hub #1. 10.0.0.0/9 really covers 10.130.0.0/16, but for the life of me I cannot recall why I placed this on hub #1; perhaps something didn't work with the summary route by itself.

 

Also, what appears to be happening is that spokes somehow know they need to reach 10.130.0.0/16 through hub #1 only, since they were all homed to hub #1 in the past and that route somehow remained in the list. Over the past few days I removed hub #1 and configured just hub #2 or #3 as the SD-WAN hub for some spokes, and traffic from those spokes to 10.130.0.0/16 broke. I have to keep hub #1 in the hub list for a spoke to reach 10.130.0.0/16.

 

I'm not sure whether the above has something to do with Meraki support turning off inter-hub communication. To some degree it appears so. If inter-hub communication is turned on, a spoke homed to hub #2 knows to route over to hub #1 to reach 10.130.0.0/16. But this is not ideal and is not how I want the traffic to flow. A spoke homed to hub #2 simply needs to reach 10.130.0.0/16 via the summary 10.0.0.0/9 route on hub #2, with no need to route through hub #1.

Ryan_Miles
Meraki Employee

10.0.0.0/9 wouldn't cover 10.130.0.0/16. 10.0.0.0/9 only covers 10.0.0.0 through 10.127.255.255.
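
For anyone who wants to sanity-check prefix coverage like this, here is a minimal sketch using Python's standard `ipaddress` module. It only uses the prefixes mentioned in this thread and is not Meraki-specific.

```python
import ipaddress

# Prefixes taken from this thread
summary = ipaddress.ip_network("10.0.0.0/9")
specific = ipaddress.ip_network("10.130.0.0/16")

# First and last address covered by the /9 summary
summary_first = str(summary.network_address)
summary_last = str(summary.broadcast_address)

# Is 10.130.0.0/16 contained in the /9? In a /8?
covered_by_9 = specific.subnet_of(summary)
covered_by_8 = specific.subnet_of(ipaddress.ip_network("10.0.0.0/8"))

print(summary_first, summary_last, covered_by_9, covered_by_8)
# prints: 10.0.0.0 10.127.255.255 False True
```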

 

So, when you remove hub 1 from a spoke's hub list, the spoke no longer has a route that covers 10.130.0.0. If your hubs were using their default hub-to-hub tunnels, there would still be a path via hubs 2 and 3.
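
To illustrate, here is a simplified longest-prefix-match lookup in Python. It is only a sketch of generic route lookup: it does not model Meraki's route priority or hub ordering, the route lists are a reduced version of what is described in this thread, and the destination host `10.130.1.10` is a made-up example address.

```python
import ipaddress

def lookup(routes, destination):
    """Return the next hop of the most specific route covering destination, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None  # no route covers the destination
    # Longest prefix (largest prefix length) wins
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Spoke with hub 1 in its hub list: it learns the /16 from hub 1
with_hub1 = [("10.0.0.0/9", "hub2"), ("10.0.0.0/9", "hub1"), ("10.130.0.0/16", "hub1")]
# Spoke with only hubs 2 and 3: only the /9 summary is learned
without_hub1 = [("10.0.0.0/9", "hub2"), ("10.0.0.0/9", "hub3")]

print(lookup(with_hub1, "10.130.1.10"))     # hub1
print(lookup(without_hub1, "10.130.1.10"))  # None
```

Under these assumptions, the second lookup returns nothing because no remaining route covers 10.130.0.0/16, which matches the breakage described above.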

 

All three of your hubs have various 10.x.x.x  routes all pointing to the same next hop (per hub location). Could you just use a 10.0.0.0/8 instead to cover it all with one route?

 

The hub to hub tunnel disablement is typically done to deal with routing loops when the hubs use OSPF and there is other DC side OSPF peering happening. It can be dealt with by changing metrics on the peering gear or moving to BGP (albeit that requires concentrator mode, which would be a larger discussion on if that would work better for your design than NAT mode hubs). I always aim for concentrator mode for a variety of reasons, but I know sometimes folks want/need NAT mode for various reasons.

ronnieshih75
Building a reputation

I must have had a brain fart this morning. You are correct: 10.0.0.0/9 does not cover 10.130.0.0/16. I recall now that the route was already in there when I inherited this whole infrastructure.

 

Good point on using 10.0.0.0/8; however, that might throw off traffic for another portion of our network. Yep, the people who came before me built Azure infrastructure on the other side of our data center using 10.129.0.0 and up. This would theoretically work but might cause routing issues. I'll put it on the to-be-tested list.
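
As a rough way to see the concern, the overlap can be checked with Python's `ipaddress` module. The Azure prefix length below is an assumption for illustration only (only "10.129.0.0 and up" is known here), so substitute the real prefixes before drawing conclusions.

```python
import ipaddress

current = ipaddress.ip_network("10.0.0.0/9")   # summary route in place today
proposed = ipaddress.ip_network("10.0.0.0/8")  # suggested replacement

# Hypothetical: /16 is assumed, the actual Azure prefix lengths are not stated
azure_prefixes = [ipaddress.ip_network("10.129.0.0/16")]

overlaps_current = [str(n) for n in azure_prefixes if current.overlaps(n)]
overlaps_proposed = [str(n) for n in azure_prefixes if proposed.overlaps(n)]

print(overlaps_current)   # []
print(overlaps_proposed)  # ['10.129.0.0/16']
```

Whether that overlap actually causes a problem depends on whether a more specific route for the Azure range exists elsewhere, which is why it needs testing.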

 

With the above said, what are my proper courses of action?

- Do I have Meraki support re-enable inter-hub communication? Does this serve any purpose? OR

- Do I simply need to add the static route for 10.130.0.0/16 on the two new MX450 hubs?

Ryan_Miles
Meraki Employee

The easiest option would probably just be to add that route to hubs 2 and 3.

 

As for the hub-to-hub tunnel feature, I'd review that with your Meraki SE, as it's a larger design consideration. Without a very thorough understanding of your entire network, I wouldn't want to make a recommendation of that magnitude.

 

Also, I noticed hubs 2 and 3 have OSPF disabled. Is that temporary, or are you using some other process to direct spoke routes back to those hubs?

ronnieshih75
Building a reputation

Hub #2 and hub #3 both have OSPF enabled to advertise routes into the backend Nexus 9k switches.

 

I actually believe the inter-hub communication, which was enabled by default before Meraki support disabled it, had iBGP peering established among all the hubs, because I see remnants of this route as 'BGP' on some spokes. So if the static route for 10.130.0.0/16 only existed on hub #1, then spokes established with hub #2 and hub #3 would flow over hub #1 to hit 10.130.0.0/16. Even though we know this is inefficient, it is almost like a 'backup' if the static routes on hub #2 and hub #3 somehow go haywire.

Ryan_Miles
Meraki Employee

I didn't catch the new hubs were on MX18.x so the routing stuff is on the new separate page 😉

ronnieshih75
Building a reputation

Experimenting again late at night. I added a static route for 10.130.0.0/16 to both hub #2 and hub #3.

 

I noticed the following:

- for a spoke with hub#3 as primary and hub#1 as secondary, I see routes as:

active, 10.130.0.0/16, meraki vpn: static route, next hop of hub#3

active, 10.130.0.0/16, meraki vpn: static route,  next hop of hub#1

inactive, 10.130.0.0/16 , BGP  - next hop of hub#1

 

- for a spoke with hub#2 as primary and hub#3 as secondary, where I removed the original single hub#1.  I see routes as:

active,  10.130.0.0/16, meraki vpn: static route, next hop of hub#2

active,  10.130.0.0/16, meraki vpn: static route, next hop of hub#3

inactive, 10.130.0.0/16, internal, BGP - next hop of hub#1

active, 10.130.0.0/16, internal, BGP - next hop of hub#3

active, 10.130.0.0/16, internal, BGP - next hop of hub#2

 

I understand the static routes since I added them, but I don't understand the plethora of BGP routes, especially for the spoke where I removed hub #1 and set hub #2 as primary and hub #3 as secondary. I am getting sporadic routing behavior in the second scenario, where the BGP routes are present. Do I actually have inter-hub communication turned on for hub #2 and hub #3, hence the BGP routes? I don't see where the BGP routes are coming from, because we run all hubs in routed mode with only OSPF enabled, talking to the backend Nexus switches.

 

 

Ryan_Miles
Meraki Employee

I think you're seeing stale info in the UI route table page. For example, I checked one of the spokes you were testing with, and I see multiple routes for 10.130.0.0/16. However, when checking the actual route table on the backend, I only see the proper route via the only configured hub, hub 1. I've noticed BGP routes can take a long time to flush from the UI even after they are deactivated or removed.

 

MX17+ uses iBGP for AutoVPN routes (meaning the MX-to-MX routing). Your two new hubs are running MX18.x, whereas hub 1 is on MX16.x. A number of your spokes are also on MX17.x+. So I assume this is where all the iBGP routes are coming from.

 

Your hub to hub tunnel disablement is applied org wide and I don't see hub routes on other hubs so I believe that is working as intended. 

ronnieshih75
Building a reputation

So what do you think is the problem with traffic to 10.130.0.0/16 when hub#2 and hub#3 are configured as a combo for spokes?

 

I have not upgraded hub #1 for fear of issues without a backup hub for spokes to fail over to. This is on the to-do list. My plan right now is to configure all spokes with hub #1 in the hub list, then upgrade firmware on hub #1. Another Meraki support tech advised me to run the new hubs on v18.106 due to performance improvements for the MX450 with that firmware. If upgrading firmware on hub #1 fixes this issue, then we'll get that done ASAP.

ronnieshih75
Building a reputation

After a lot of testing, I noticed the following odd behavior:

 

- When I configure a spoke with hub#3 as primary and hub#1 as secondary, getting to anything in 10.130.0.0/16 is snappy.  No issues, no additional BGP routes in the route table

 

- When I configure a spoke with hub#2 as primary and hub#1 as secondary, getting to anything in 10.130.0.0/16 is snappy.  No issues, no additional BGP routes in the route table

 

- When I configure a spoke with hub #2 as primary and hub #3 as secondary, getting to anything in 10.130.0.0/16 has a delay and behaves as if a failover happened before our application finally loads in the Edge browser, sometimes with a 'network has changed' message. I also see additional BGP routes on top of the static routes.

 

I believe those self-created BGP routes are getting in the way, with inter-hub communication turned on for hub #2 and hub #3, even though static routes by default have a lower administrative distance than BGP routes.

Ryan_Miles
Meraki Employee

I sent you a PM
