Having Issues With 2 MX450s

MoJoPBS
Conversationalist

Hello!

Quick rundown:

We have two Meraki MX450s, one for each of our ISPs. They were in VRRP, but we had issues with VRRP and separated them about a month ago. The topology looks like this:

ISPs -> Merakis -> Palos -> Internal Cores

In addition, there is a separate VLAN on the Merakis: Meraki -> Internal Cores

Two to three weeks ago, we experienced a complete internet outage. We confirmed our ISPs were good and narrowed it down to the MXs. One is on firmware 19.2.2, the other is on 18.211.6 (why? I don't have control of them lol). Ever since, every Wednesday at 2 PM CST we experience the same outage for about 1 hour.

I noticed the Meraki on 19.2.2 appears to be "rebooting": everything connected to it states the interface went down, and its own logs report the same, including the WAN ports, but they don't say it rebooted lol. I'm very confused. As for WHY both Merakis keep going down, I don't know. Any ideas? Advice?

When our outages occur, the logs state the Merakis lost connection to our internal gateway. In addition, on the Meraki running 19.2.2, I noticed in the logs that soon after everything came back online it did an "intrusion detection rules update".

I've read we should disable IDS/IPS.

alemabrahao
Kind of a big deal

If both are configured for HA, it doesn't make sense for them to be on different versions.

If you don't have access to their management, it would be a good idea to talk to someone who does to evaluate it.

I also suggest opening a support case with Meraki.

I am not a Cisco Meraki employee. My suggestions are based on documentation of Meraki best practices and day-to-day experience.

Please, if this post was useful, leave your kudos and mark it as solved.
MoJoPBS
Conversationalist

Both are now independent, but I agree the firmware needs to be the same. We did create a support case, and they want us to capture packets during the outage. Appreciate it!
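
Since the outage recurs at a predictable time, a capture could be started shortly before it begins. Below is a minimal sketch in Python of working out the next capture window. It assumes the outage starts Wednesday at 2 PM and lasts about an hour, as described in the original post, and that "CST" means a fixed UTC-6 offset (adjust if the site observes daylight time). The function name `next_capture_window` and the 10-minute margin are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "CST" in the post is a fixed UTC-6 offset.
CST = timezone(timedelta(hours=-6), "CST")

OUTAGE_WEEKDAY = 2                   # Wednesday (Monday is 0)
OUTAGE_HOUR = 14                     # 2 PM
OUTAGE_LENGTH = timedelta(hours=1)   # "about 1 hour" per the post
MARGIN = timedelta(minutes=10)       # hypothetical padding before and after


def next_capture_window(now):
    """Return (start, end) of the next capture window that has not yet ended."""
    local = now.astimezone(CST)
    days_ahead = (OUTAGE_WEEKDAY - local.weekday()) % 7
    outage_start = (local + timedelta(days=days_ahead)).replace(
        hour=OUTAGE_HOUR, minute=0, second=0, microsecond=0
    )
    # If this week's window is already over, move to next Wednesday.
    if outage_start + OUTAGE_LENGTH + MARGIN <= local:
        outage_start += timedelta(days=7)
    return outage_start - MARGIN, outage_start + OUTAGE_LENGTH + MARGIN


if __name__ == "__main__":
    start, end = next_capture_window(datetime.now(tz=CST))
    print(f"Start capture at {start:%a %Y-%m-%d %H:%M %Z}, stop at {end:%H:%M %Z}")
```

The window is padded on both sides so the capture includes traffic from just before the interfaces drop and just after they recover.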

RWelch
Kind of a big deal

If it were me, I'd double-check that the ports (paths) are run correctly, to verify that a possible misconfiguration isn't causing the reboots.

I would suggest opening a support case after verifying the ports (paths) to ask if support can see what's going on from the backend; perhaps that will give you some insight.

And as @alemabrahao mentioned, it would be best to have both MX devices on the same firmware (if one gets updated, the other does as well).

It might be best to establish the practice, with whoever has access to or control of firmware changes, that both MX appliances get upgraded at the same time. Perhaps they are unaware of the implications when HA devices aren't running the same firmware.

If you found this post helpful, please give it Kudos. If my answer solves your problem please click Accept as Solution so others can benefit from it.
MoJoPBS
Conversationalist

Yes, we're going to be downgrading the firmware so they're both the same. Thanks!

PhilipDAth
Kind of a big deal

You must be doing some kind of route tracking on the Palos. Are they reporting they are up/down, and is that test correct?
