New MX firewalls seem to be affecting other routers on the LAN

SOLVED
Pugmiester
Building a reputation

Hi all, bear with me, it's a bit of a long one.

We have a pair of MX250s in HA as primary/backup. They sit in two buildings, each with its own subnet and VLAN. Both subnets and VLANs are piped into both MXs across our site-wide fibre link so they can work their magic with HA. They have been on the LAN for over two weeks, and on Monday evening they were swapped into place of our old firewalls, essentially by shutting the old firewalls' ports on the internet switch and opening the ones for the MXs.

Since then, we've had a really weird, random issue with two other routers on site. They are a pair of third-party-managed Cisco routers, each with its own dedicated external circuit and an interface in each building's VLAN, and they have been running as they are for over a year without issue. They are configured with HSRP for both VLANs, so our default gateways have a single route in each VLAN that drops traffic on them, and they route it to our other offices over MPLS.

Since Monday night, we randomly lose the ability to reach the Cisco routers, so we lose access to resources at the other offices. The only fix I've found is to shut and re-open the switch port connecting the Cisco router to the LAN. Nothing is logged on either the switch or the router before the failures begin; all they show is the port state changing as I bounce it. The third party has been all over their routers and can't find a fault, and I'm in the same boat with the switches they connect to.

The only recent change was making the MXs our default route to the internet (not to the sites supported by the Cisco routers) on Monday night, so I'm certain it's connected. But I've been banging my head on this one since I made the MXs live and I'm just not getting anywhere. Hopefully someone has seen it before and can nudge me in the right direction.

1 ACCEPTED SOLUTION
Pugmiester
Building a reputation

Thanks jdsilva, that was where I was heading. It's been a long week. 😞

Last night we rolled back to our old firewalls and disconnected the MXs completely from the LAN, to be certain we could regain a stable connection for work being completed remotely over the weekend.

Not 10 minutes after I finished, one of the links dropped again, proving that the MXs have nothing to do with the problem. I never believed they did, but they were the only change I'd made.

I have a workaround in place for now, using the routers' interface IP addresses in place of the HSRP ones, so we're stable but lacking failover. The business is happy with stable for now.

I'll close off this question though, as we're now 100% certain it's not Meraki related.

Thanks everyone for your help pointing me in the right direction.

5 REPLIES
Adam
Kind of a big deal

Do you have a static route to get to the Cisco router interfaces?  I wonder if once you create that 0.0.0.0 route it is eventually aging out some kind of cached route to the Cisco router.  Although I'm not sure why bouncing the port would help this.  

Adam R MS | CISSP, CISM, VCP, MCITP, CCNP, ITILv3, CMNO
jdsilva
Kind of a big deal

So you have MXes in Warm Spare (running VRRP between them) and you have two Cisco routers running HSRP between them, with both pairs doing so on both LAN VLANs?

Which pair is supposed to be the default gateway for your clients? Are you doing some creative redirect routing here? Perhaps a simple diagram would help.

That aside, I think I would start by checking the MAC address tables while the problem is occurring, to ensure the switches all have the correct entry for the Cisco MACs (and the HSRP virtual MAC). If that's correct, I might start taking some packet captures on the switch interfaces facing the Cisco routers to see what's on the wire.
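
As a rough illustration of that MAC table check, here is a minimal Python sketch. It relies on the well-known HSRP virtual MAC patterns (`0000.0c07.acXX` for HSRPv1, `0000.0c9f.fXXX` for HSRPv2 over IPv4). The group numbers, port names, and the four-column table layout in `SAMPLE` are assumptions made up for the example, not values from this thread, and real `show mac address-table` output varies by platform.

```python
from typing import List, Tuple


def hsrp_virtual_mac(group: int, version: int = 1) -> str:
    """Return the well-known HSRP virtual MAC in Cisco dotted notation."""
    if version == 1:
        if not 0 <= group <= 255:
            raise ValueError("HSRPv1 groups are 0-255")
        return f"0000.0c07.ac{group:02x}"
    if version == 2:
        if not 0 <= group <= 4095:
            raise ValueError("HSRPv2 groups are 0-4095")
        return f"0000.0c9f.f{group:03x}"
    raise ValueError("version must be 1 or 2")


def ports_for_mac(table_text: str, mac: str) -> List[Tuple[int, str]]:
    """Return (vlan, port) pairs where `mac` is learned in the given text."""
    rows = []
    for line in table_text.splitlines():
        fields = line.split()
        # Assumed column layout: VLAN, MAC, TYPE, PORT. Header and
        # separator lines are skipped because their first field is not a number.
        if len(fields) >= 4 and fields[0].isdigit() and fields[1].lower() == mac.lower():
            rows.append((int(fields[0]), fields[3]))
    return rows


# Hypothetical switch output, for illustration only.
SAMPLE = """\
Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
  10    0000.0c07.ac0a    DYNAMIC     Gi1/0/24
  20    0000.0c07.ac14    DYNAMIC     Gi1/0/3
"""

if __name__ == "__main__":
    vmac = hsrp_virtual_mac(20)
    print(vmac, ports_for_mac(SAMPLE, vmac))
```

Comparing the port this reports during a failure with the port seen when things are healthy would show whether the virtual MAC has moved somewhere unexpected.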

Pugmiester
Building a reputation

Hi all, thanks for the quick replies.

There's no sign of any issues with ARP entries when we have the connection failure, but it's difficult to test. Most of the failures happen in the middle of the work day, and the business guys get a bit impatient.

With that in mind, we've decided to roll back the changes and revert to our old firewalls this evening until we have time to investigate further.

As soon as I have that complete, I'll get back with some more information to hopefully figure this out once and for all.

jdsilva
Kind of a big deal

Hi @Pugmiester,

I did not mean ARP entries; I meant MAC address table entries. They are not the same.
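
To make the distinction concrete, here is a small Python sketch of the two lookups. An ARP table maps an IP address to a MAC address on the sending device, while a switch's MAC address table maps a MAC address to the port it was learned on. The IP address, MAC, and port names are hypothetical, and real switches key the table per VLAN, which is left out here for brevity.

```python
from typing import Dict, Optional


def egress_port(dst_ip: str,
                arp_table: Dict[str, str],
                mac_table: Dict[str, str]) -> Optional[str]:
    """Return the switch port a frame for dst_ip would leave on."""
    mac = arp_table.get(dst_ip)
    if mac is None:
        # No ARP entry: the sender would have to ARP before sending.
        return None
    # Unknown unicast MACs are flooded rather than sent to a single port.
    return mac_table.get(mac, "flood")


# Hypothetical values: the ARP entry is identical in both scenarios.
arp = {"10.0.10.1": "0000.0c07.ac0a"}
healthy_mac_table = {"0000.0c07.ac0a": "Gi1/0/24"}
stale_mac_table = {"0000.0c07.ac0a": "Gi1/0/3"}

if __name__ == "__main__":
    print(egress_port("10.0.10.1", arp, healthy_mac_table))
    print(egress_port("10.0.10.1", arp, stale_mac_table))
```

The ARP entry is the same in both cases, yet the frame leaves on a different port, which is why a clean-looking ARP table does not rule out a MAC table problem.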

