MX400 utilization spikes of 100%

EClap5
Getting noticed

I've got a weird issue going on here. I have opened a case with Meraki and so far they are stumped.  I am running the latest beta on 3 MX400s but only have this issue on one of them.  Since the upgrade to 13.25, this MX's utilization will start to creep up towards 100%.  Usually on Saturday night, the utilization hits 100%, the MX reports that the WAN uplink has failed (which it hasn't) and throttles my bandwidth from 200M down to around 30M.  I have verified this myself, as I came into the office last Sunday to call Support so they could troubleshoot the MX in a failed state.  I usually find this on Monday morning and reboot so everyone is happy again.  The very odd thing is that while I was on the phone with Support on Sunday, the MX rebooted itself, almost like it knew it was struggling.  Prior to the firmware upgrade, this MX never got above 25% utilization, and another MX with comparable traffic usage/device count doesn't come close to 25% while running this latest beta.  My Meraki networks are air-gapped from production traffic, so mostly iPads and iPhones are on the network, and almost none are in the building on the weekends when things go haywire.

 

I just spent some time going through the summary reports back to June, when the 400 first started showing device utilization, and I am pretty certain it is something in the 13.25 beta build that does not agree with this particular MX.
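For anyone wanting to repeat that check on their own numbers, here is a minimal sketch in Python. It assumes the utilization readings have been copied out of the Summary Report by hand into a list of (timestamp, percent) pairs; it does not pull anything from the dashboard. The function name `find_creep` and the sample values are made up for illustration, and the 25% baseline is simply the figure mentioned above.

```python
BASELINE_PCT = 25.0  # normal ceiling for this MX, per the post above


def find_creep(samples, baseline=BASELINE_PCT, min_run=3):
    """Find stretches where utilization stays above the baseline.

    samples: list of (timestamp, percent) pairs in time order.
    Returns a list of (start, end, peak) tuples, one for each run of
    at least `min_run` consecutive readings above `baseline`.
    """
    runs = []
    current = []
    for ts, pct in samples:
        if pct > baseline:
            current.append((ts, pct))
            continue
        # Reading is back at or below baseline: close any open run.
        if len(current) >= min_run:
            peak = max(p for _, p in current)
            runs.append((current[0][0], current[-1][0], peak))
        current = []
    # Handle a run that is still open at the end of the data.
    if len(current) >= min_run:
        peak = max(p for _, p in current)
        runs.append((current[0][0], current[-1][0], peak))
    return runs


if __name__ == "__main__":
    # Hypothetical readings, NOT taken from the actual report.
    readings = [
        ("Fri 12:00", 18.0),
        ("Fri 18:00", 31.0),
        ("Sat 00:00", 55.0),
        ("Sat 12:00", 82.0),
        ("Sat 22:00", 100.0),
        ("Sun 14:00", 12.0),  # after a reboot
    ]
    for start, end, peak in find_creep(readings):
        print(f"above {BASELINE_PCT}% from {start} to {end}, peak {peak}%")
```

Requiring several consecutive readings above the baseline is just a way to ignore one-off blips and only flag a sustained climb like the one described here.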

 

Has anyone else had an issue like this? 

13 Replies
PhilipDAth
Kind of a big deal

Are we talking about WAN utilisation?

 

Get support to double check that the version of code running on the MX is actually the beta version selected.  There have been cases where an MX fails to upgrade, and then goes into an infinite upgrade loop, which it never completes.  This makes the MX semi-regularly reboot and loads it up.  Alas the dashboard only reports the version it is meant to be running, not what it is actually running.

 

And is Sunday (or Saturday night) by chance the scheduled time in your network for doing software upgrades?

@PhilipDAth I am talking about device utilization on the MX itself.

 

My upgrade window for this network is Mondays at 2am.  I did not know that about the firmware upgrade loop; I will get back with Support to check that.  I just sent a reply back to them about this never happening until the latest beta, which I am only running because a beta last year fixed an issue with content filtering denying access to credible/safe domains.

 

Maybe the upgrade loop is what is happening.  The MX keeps trying to update and thus gets pushed to 100% utilization.  Also, maybe the MX showing a speedtest of 30M on a 200M circuit is because the poor thing is just sputtering along until it finally reboots itself. 

 

Thanks for the info!

Adam
Kind of a big deal

What setting/screen are you looking at to see "device utilization"?

Adam R MS | CISSP, CISM, VCP, MCITP, CCNP, ITILv3, CMNO
If this was helpful click the Kudo button below
If my reply solved your issue, please mark it as a solution.
EClap5
Getting noticed

Device utilization is on the Summary Report page, although I only see that measurement for the MX400s. I believe it was introduced in firmware back in June; I went back through this year's reports to see if I ever had these spikes before.

Adam
Kind of a big deal

Would you mind posting a screenshot of that device utilization?

 

And that is strange on the utilization.  In your environment, are all of the MX400s set up in a load-balanced type of design, or could that device legitimately be busier than the others?

EClap5
Getting noticed

For this network there have been no major changes in usage or number of clients.  The only major change is the firmware which was updated 3 days before all of these spikes started happening.  I don't have any sort of load balancing and device util this year has been under 25%.  This is my busiest 400 of the 3.  Here is a screenshot of the utilization covering last Friday thru Sunday.  The drop off is when the MX decided to reboot itself.

Screen Shot 2017-12-01 at 11.05.36 AM.png

 

I can't see the Device Utilization on my MX400.  Do I need to tweak something to make it show on the dashboard?

@Tung_Nguyen What firmware version are you running? I am running 13.25 and did not have to enable anything to get the utilization graph to show.  I had to send Support a screenshot of the graph because they didn't know what I was referring to when I referenced the MX utilization.  Maybe it hasn't made its way to the stable release yet.

Thanks for the quick response. I'm running version 12.24.  Is version 13.25 a beta version?  Time for me to update firmware 🙂

Adam
Kind of a big deal

Latest stable is 12.26.  Latest beta is 13.27.  As of 12/6/2017. 


Thanks Adam!

@Tung_Nguyen Correct, 13.25 is beta.  Originally I went to beta on my MX400s to resolve an issue with some erroneous content filtering and have been on betas ever since.  The 13.27 beta resolves an issue with a rare case of MX400/600s rebooting, which is what my 400 has done, but Support says my case is not the one identified in 13.27.  So far my MX issue has not reappeared since 11/27, and it seems to have 'broken the cycle'.

 

Screen Shot 2017-12-06 at 3.18.06 PM.png

I have so many content filtering issues with the MX100 and MX400.  Sometimes it blocks Google and Gmail, and we are using G-Suite for our email system.  Good to know that a newer version fixes content filtering.  Thanks for the information.
