Bringing down MX68 Utilization and AMP

RumorConsumer
Head in the Cloud


Hey all

 

I have an MX68 in an environment with a bunch of devices, averaging 130 clients a day. Half of these devices are LAN-only, i.e. NASes, switches, IoT sensors, point-to-point bridge units. For those who know me, it's a retreat center. Recently, we've had a few larger groups come and the device has seemingly hard crashed. It has happened once every 2-3 weeks and requires a hard reset. Meraki support says the device is spec'd for 50 clients (insane for the money, and I don't mind saying it out loud) and that it has been spiking high in utilization/CPU usage, like pegged high.

I have nothing fancy going on: a service VLAN and a regular client VLAN, with no data passing between them. Everybody on my network is texting, YouTubing, browsing the web, using web apps, etc. Consumer usage. OK, so it is what it is.

Support's suggestion was that if I didn't need one or both of the Advanced Security Package options, I should disable them, as the deeper packet inspection raises utilization. We are an Apple/iOS/macOS house almost exclusively, and when guests come, honestly, if they have PCs and don't have their own virus protection, that's on them, so I decided to disable AMP. Looking at the event log in Security Center I have no confirmed positives, all clean. Looking at the utilization graph, things do seem to have calmed down significantly, but I haven't yet been able to test it with a high influx of guests. If this continues, it will force me off of Meraki MX solutions and into something like Fortigate or Firewalla.

 

Any other thoughts or suggestions from this incredible support community welcome. 

 

Here's the graph for the last few days. I turned AMP off on the 2nd of August. This is with the standard client load of 130, with half of those being the LAN-only devices I described above.

Screenshot 2023-08-05 at 9.51.52 AM.png

 

 

Networking geek since high school where I got half of a CCNA. Played Marathon II and Infinity over localtalk.
Made many a network over the years, now de facto admin of a retreat center with some of this fine Meraki hardware.
Fortune 100 Tech veteran/refugee.
46 Replies
alemabrahao
Kind of a big deal

What is the problem? The maximum capacity is specified in the product datasheet.

You should have paid attention to this information.

Anyway, you can use the firewall from the manufacturer that suits you best.

I am not a Cisco Meraki employee. My suggestions are based on documentation of Meraki best practices and day-to-day experience.

Please, if this post was useful, leave your kudos and mark it as solved.

So helpful thank you so much 😂


It's just sincerity, if you are not satisfied with Meraki change vendor. 😉


It's easy to pop on a chat board with somebody trying to work out how to fit into their hardware and tell them to put up or shut up. I think the thing you aren't acknowledging is that I'm already quite invested in Meraki, and I have an excellent ranking on this forum, so clearly I love the gear. It's not like I'm just here to complain. If you are such a big deal yourself, you know that there is a lot that goes into how much work the MX does moment to moment, client to client. So instead of snarky flexing, maybe consider using all your knowledge to actually assist and offer helpful information about how to squeeze as much capacity as I can out of what I have, as others on this thread are doing. Just depends on who you want to be.


Eloquently put @RumorConsumer . 

Darren OConnor | doconnor@resalire.co.uk
https://www.linkedin.com/in/darrenoconnor/

I'm not an employee of Cisco/Meraki. My posts are based on Meraki best practice and what has worked for me in the field.

I don't think I'm better than anyone in this community, but you yourself said that you're thinking of replacing it with another vendor.

 

Meraki is not perfect (in fact there is no such thing as a perfect vendor). In your case, it is clear that this model no longer meets your needs, so the ideal is to upgrade to a larger model.


I said I was thinking about it, but as you can see, there are plenty of valid suggestions for how to bring down utilization that are absolutely applicable to my use case as described. It works great most of the time, so I would not say it's anywhere close to clear that it doesn't work. In fact the most knowledgeable member of the forum is on here saying the issues I'm facing don't make sense to him. Don't worry about it. Thanks for coming back.


Do you think disabling AMP is valid? I mean, do you think it's worth the risk, even though nothing is showing up in the event log?

I think you can never have too much security, so I don't know if it's a valid option.

But it does make sense: if you have more simultaneous clients than the device supports, you should expect high processing load. I have seen this in practice.

 

Another thing I can say is the 18.x version has some problems and Meraki doesn't seem to be taking any action to solve them.


I do think disabling AMP is absolutely valid. My users are 99% Apple and they are savvy enough to know not to install something strange from the internet. I'm not worried about them, and if PC users are on my network without virus protection, that's a serious issue that I am not here to counterbalance. So yeah, I don't care at all about AMP. It is literally malware scanning, right? I still have the advanced intrusion protection on. High processing is fine, but the thing has been running for 2 years without failure.
Good to know about the firmware. Maybe I'll see about rolling it back.

DarrenOC
Kind of a big deal

Hi @RumorConsumer , so it looks like turning off AMP helped a little?  Has the device frozen/bricked since then?

 

Do you have anything internal on the network that you would class as the “Crown Jewels” that needs protecting?  If not, then having AMP disabled is a small price to pay, as this may alleviate the matter short term.  Mid to long term, is the retreat likely to see the same qty of visitors?  If so, it could be time to upgrade the devices.  Vendor choice is of course down to the retreat's budget.


Yeah, I would. A couple of Synology NAS units that aren't accessible from outside the network at all and run DSM, which is basically Linux. AMP seems to be mainly aimed at protecting Windows clients, which I couldn't care less about doing. The retreat will probably fluctuate up into the 200s. It's so weird because it had been fine for so long.

ww
Kind of a big deal

Do you use any per client speed limit?

 

For guests I wouldn't even use AMP or IPS, just some kind of L2 isolation like the MR supports.

RumorConsumer
Head in the Cloud

Yes, I do. 250/250 fiber line with a 120 Mbps limit on all devices.

ww
Kind of a big deal

120 for each device? I wouldn't give a guest more than 8 Mbps each.  Most high device utilization I see is related to throughput.
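To put rough numbers on the per-client limit, here is a minimal sketch of the arithmetic. It assumes the 250 Mbps line from this thread and ignores protocol overhead and the MX's own inspection limits; the function name is made up for illustration.

```python
def clients_to_saturate(line_mbps: float, per_client_limit_mbps: float) -> float:
    """How many clients, each running flat out at its cap, it takes to fill the line."""
    return line_mbps / per_client_limit_mbps


LINE_MBPS = 250  # fiber line speed quoted in the thread

# Caps discussed in the thread: 120 (current), 80 (reduced), 8 (suggested for guests)
for cap_mbps in (120, 80, 8):
    n = clients_to_saturate(LINE_MBPS, cap_mbps)
    print(f"{cap_mbps:>3} Mbps cap: about {n:.1f} clients at full speed fill the line")
```

The point is only that a high cap lets a handful of busy clients push the appliance toward its throughput ceiling, while a low cap spreads the same line across many more users.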

 

RumorConsumer
Head in the Cloud

Great to know. I will review that and ramp down guest network allowances.


I've had the same issue at some of my sites, although not as bad. 

Definitely bring the per-client limit down and disable AMP if you truly don't need it. Also, for IDP/IPS, I've set mine at Prevention/Balanced and this seems to be a sweet spot for the smaller MX devices. I would also look at limiting any L7 FW rules and content filtering if you can. Any load you can remove from the smaller MX will bring the utilization down. 

Excellent thank you very much. I have the same settings. 

RumorConsumer
Head in the Cloud

And as far as the isolation idea - yes I have them on a Meraki DHCP guest WLAN. Works great. I have about 25 people who are there year round who need to be on a standard LAN so those are the ones who are mild users with the relatively high speed limit. I will take them all down 1/3 and see what happens.

PhilipDAth
Kind of a big deal

If it runs for 2 or 3 days and then crashes - that sounds like a resource leak.

 

Are you running a stable or better firmware release?

Hey man, thanks for writing as usual. You're a mensch. Yes, that was what I was thinking as well. Generally, when the issue is "happening", it lasts for about 2-3 weeks, then crashes, then is fine for another period. Running stable only.

stgonzo
Getting noticed

I have an MX85 that seems to crash every 2-3 weeks. Utilisation is okay and it rarely hits 50%. Support advised that there is a known issue with Snort on some devices, and my options are to disable AMP completely or have them put the device on legacy Snort while waiting for a permanent fix.  The customer decided to go with the legacy Snort option, so we will see how that goes.  This has happened on at least 2 versions of firmware; currently it is on 18.107.2.

 

Apparently, the reason for the crash is that Snort is constantly generating logs, which fill up the MX memory and cause the device to crash. A hard reset wipes the logs, and the device will be okay until the logs fill up again.

 

This is a bit different to your case as they are different MXs, and I'm not sure how much effect the number of users has on your device, but it would be interesting to know if disabling AMP has fully resolved your issue.

 

 

 

Wow. What a Gonzo problem! This actually makes a lot of sense. My device would NEVER hit high levels of utilization in the past even with 200 clients. Literally never had this issue so it's been very strange to try and troubleshoot it. This feels like it might be it. So far so good with AMP off, and my utilization is back at a very low hum. I'll keep an eye on it but this is so so good to know. @PhilipDAth does this resonate?


It's possible - it is a resource being utilised.

I spoke to support about this and whether I was affected and they said looking at my logs I was not affected, but was running the affected firmware. So.. we'll see.

RumorConsumer
Head in the Cloud

My graph with AMP turned off on Aug 2, and the Wi-Fi speed limit turned down from 120 Mbit to 80 Mbit since yesterday.

 

Screenshot 2023-08-07 at 11.25.48 AM.png


Idling along quite nicely then, @RumorConsumer


Indeed. I will have as large a group as I'll ever have coming in the next few weeks. I'll see how we fare. Thanks to everybody for your help here.


Just as long as you're not pushing it constantly above the 80+ % utilization mark you're good 😉

 

As @PhilipDAth states though if it happens again in a cyclical manner you could have a resource leak.


Yeah it's been well below. I'll keep you all informed. 

RumorConsumer
Head in the Cloud

Here we are having turned off AMP on the 2nd. I think that was likely the culprit...

Screenshot 2023-08-08 at 11.29.30 AM.png

RumorConsumer
Head in the Cloud

A spike! No idea why.

Screenshot 2023-08-10 at 6.34.15 AM.png


Have you checked the event log for this period? It may have been an IDS rules update.

I had not. Very interesting. Meraki says not available for MX68, at least not yet.

cmr
Kind of a big deal

17.10.9 now also has the IDS/IPS CPU overload fix.  Hopefully you can see that?

RumorConsumer
Head in the Cloud

I'm on firmware v18.107.2, which is stable, and I see 18.107.4, which has this fix. Let's find out.

@PhilipDAth 

RumorConsumer
Head in the Cloud

OK, so my bandwidth is 250x250 AT&T Fiber and the max advanced security throughput for the MX68 is 300 Mbit. I noticed that one of my servers was doing some heavy-duty video syncing to another site of mine (normal), but that it correlated perfectly with the spikes I was seeing this week. As an experiment, I used the bandwidth limiter on the server to throttle it down to 20,000 KB/s. The results speak for themselves: 20 MB/sec versus 30 MB/sec, but the utilization is WAY lower. It seems utilization climbs steeply, not linearly, as you get closer to capacity, which I suppose makes sense.

Screenshot 2023-08-17 at 11.18.46 PM.png
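For anyone checking the math on those figures, a quick sketch of the conversion follows. It assumes "MB/sec" means megabytes per second and takes the 300 Mbit advanced security figure quoted in the post at face value; the names are only illustrative.

```python
ADV_SECURITY_LIMIT_MBIT = 300  # MX68 figure as quoted in the post, not independently verified


def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mbytes_per_s * 8


# Sync rate before (30 MB/s) and after (20 MB/s) throttling the server
for rate_mbytes in (30, 20):
    mbit = mbytes_to_mbits(rate_mbytes)
    share = mbit / ADV_SECURITY_LIMIT_MBIT
    print(f"{rate_mbytes} MB/s = {mbit:.0f} Mbit/s, {share:.0%} of the quoted limit")
```

On those assumptions, the unthrottled sync on its own sits much closer to the quoted ceiling than the throttled one, before any guest traffic is added on top.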


Well done for figuring it out!

I can definitely correlate the high utilization with these file transfers. It's possible that's what it's been all along, but I'm not sure this accounts for all of it. Maybe. I was definitely able to affect the utilization score this time around. I wish there was an alarm I could trigger at certain utilization thresholds so I could investigate.
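In the absence of a built-in alarm, a threshold check is easy to run on your own samples. The sketch below assumes utilization readings (0-100) are already being collected at a regular interval by some means not shown here; the function name, the sample data, and the three-sample window are hypothetical. The 80% mark is the one mentioned earlier in this thread.

```python
from typing import List, Sequence, Tuple


def find_sustained_spikes(
    samples: Sequence[float],
    threshold: float = 80.0,
    min_consecutive: int = 3,
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs for runs where utilization stays at or
    above `threshold` for at least `min_consecutive` samples in a row."""
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current above-threshold run began
    for i, value in enumerate(samples):
        if value >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_consecutive:
                runs.append((start, i - 1))
            start = None
    # Close out a run that is still open at the end of the data
    if start is not None and len(samples) - start >= min_consecutive:
        runs.append((start, len(samples) - 1))
    return runs


# Made-up readings: one sustained spike, one blip, one spike still in progress
history = [22, 25, 85, 90, 88, 30, 95, 20, 81, 83, 84, 86]
for first, last in find_sustained_spikes(history):
    print(f"Sustained high utilization from sample {first} to {last}")
```

Requiring several consecutive samples keeps a single short blip from raising an alert, which matters if the goal is to catch the long pegged-high periods rather than every brief peak.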

PhilipDAth
Kind of a big deal

RumorConsumer
Head in the Cloud

Stopped working again today. Lights are flashing on the Ethernet ports but there's no WAN traffic. I can pull up the page of the MX in a browser and it just thinks it isn't connected to the internet. It says "there is a problem". I hard reset it, and it's fine again. Had 10 or so people on a Zoom call when it went down.

I currently have 156 clients total. Many do nothing but sit there and blink here and there, with very little data passing. Maybe an average of 20-30 Mbit on a 250/250 AT&T business fiber line. Firmware is updated to MX 18.107.5, which has the AMP fix, and AMP is off. Utilization during this time was under 30%.

Support tells me on the phone I'm likely pushing it too far with the client count. I cannot believe this. I know there's a sizing guide, but I've been doing this for years with no problem. I have 1 VLAN for switches (about 15 of the clients) and 1 for clients, and I speed limit everything to about 80 Mbit burst at this point. I don't use VPN or really any advanced features. The only things that have changed are the firmware and the age of the device. Any other thoughts? Support is going to RMA it.

RumorConsumer
Head in the Cloud

Got a downgrade last week to 16.16.9 after the replacement unit suddenly decided to stop feeding data to the two POE+ ports. The rep said this was the last firmware he considered stable. Wow. So far so good.


The two APs on the POE+ ports winked out again. Everything else seems normal. I decided to say screw it and put those APs on ports 5 and 6 on the MX68 with injectors. Just bypass the built in POE+ ports altogether. I had seen this before on the other hardware I had and I figure it must be something related to the firmware if it's happening on both units. Hopefully this will be the end of it.

RumorConsumer
Head in the Cloud

So far the downgrade to 16.16.9 has been a dream. I had a bunch of people here on property and the MX didn't even sneeze, which is what's supposed to happen. I got up to 245 clients.

Screenshot 2023-09-25 at 3.23.49 PM.png

RumorConsumer
Head in the Cloud

Update - This continues to be rock-solid stable. Meraki was out of their minds telling me the later firmware was OK and it was my fault for overloading the device. The rolled-back firmware is chef's-kiss perfect.

RumorConsumer
Head in the Cloud

Update again 7 months later - flawless operation with the same firmware I downgraded to, 16.16.9. Works perfectly, not a single hitch. Totally a firmware thing.
