MX - Device utilization degradation after upgrading to MX 18.2

RaphaelL
Kind of a big deal


Hi,

 

We are currently upgrading from MX 18.107 to MX 18.211.3. The upgrade date is Sept 26th and the devices are MX68CW.

 

After upgrading 500 networks, I noticed a general trend: device utilization is going up in 80% of our networks.

Some are more obvious:

 

RaphaelL_0-1727441090158.png

RaphaelL_1-1727441099381.png

 

 

RaphaelL_2-1727441132640.png

RaphaelL_3-1727441144555.png

 

Has anyone else encountered this? I was expecting better device utilization, not worse 😐
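
One way to put a number on a trend like this is to compare each network's average utilization before and after the upgrade. The sketch below is illustrative only: the network names and percentages are made-up sample data, and `share_increased` is a hypothetical helper, not part of any Meraki tooling.

```python
from typing import Dict


def share_increased(before: Dict[str, float], after: Dict[str, float],
                    min_delta: float = 0.0) -> float:
    """Return the fraction of networks whose utilization rose by more than min_delta.

    Only networks present in both snapshots are compared.
    """
    common = before.keys() & after.keys()
    if not common:
        return 0.0
    increased = [n for n in common if after[n] - before[n] > min_delta]
    return len(increased) / len(common)


# Made-up sample data: average device utilization (%) per network.
before_upgrade = {"net-a": 20.0, "net-b": 35.0, "net-c": 15.0, "net-d": 40.0, "net-e": 25.0}
after_upgrade = {"net-a": 32.0, "net-b": 50.0, "net-c": 14.0, "net-d": 55.0, "net-e": 31.0}

ratio = share_increased(before_upgrade, after_upgrade)
print(f"{ratio:.0%} of networks show higher utilization after the upgrade")
```

Raising `min_delta` (for example to 10 percentage points) filters out small fluctuations, so only the more obvious regressions are counted.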

 

 

SeanW
Here to help

Not related to this issue, but I noticed my test units kept dropping the WAN connection every 24 to 36 hours. After a downgrade everything has cleared up.

cmr
Kind of a big deal

Yes, I saw this and 19.1 seems better, so I have been running that where I can 'risk' a beta.

RaphaelL
Kind of a big deal

Oof... I hope I won't encounter major issues. I don't wanna halt my deployment because of that.

henrry81
Here to help

I have seen similar problems and it is surprising that cmr found 19.1 to be better. If you’re okay with getting into beta testing, you might want to give that version a shot.

RaphaelL
Kind of a big deal

Little update: this increase in device utilization seems to be caused by the IDS/IPS module. We are running in IDS Security mode.

 

More to come.
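
For anyone wanting to check which networks run IDS/IPS and with which ruleset, a minimal sketch follows. It assumes the Meraki Python SDK's `getNetworkApplianceSecurityIntrusion` call and the `mode` / `idsRulesets` response fields; `intrusion_settings` is a hypothetical helper name, and the network ID in the usage comment is a placeholder.

```python
from typing import Any, Dict, Iterable, List


def intrusion_settings(dashboard: Any, network_ids: Iterable[str]) -> List[Dict[str, str]]:
    """Collect the IDS/IPS mode and ruleset for each network ID."""
    rows = []
    for network_id in network_ids:
        settings = dashboard.appliance.getNetworkApplianceSecurityIntrusion(network_id)
        rows.append({
            "networkId": network_id,
            # Field names are assumed from the Dashboard API response.
            "mode": settings.get("mode", "unknown"),
            "ruleset": settings.get("idsRulesets", "unknown"),
        })
    return rows


# Usage (requires the meraki package, an API key, and network access):
#   import meraki
#   dashboard = meraki.DashboardAPI(API_KEY)
#   for row in intrusion_settings(dashboard, ["N_1234"]):  # placeholder network ID
#       print(row)
```

Cross-referencing this output with per-network utilization could help confirm whether the increase is limited to networks with IDS/IPS enabled.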

cmr
Kind of a big deal

I upgraded an MX75 running the IDS in security mode from 18.1xx to 19.1 in June and as you can see below the usage didn't really change:

 

1000012774.jpg

 

Though looking at it again it might have increased very slightly... I do however remember the increase with 18.2xx being quite marked.

cmr
Kind of a big deal

Why not try one site with 19.1? It's been very stable for me since June, which is more than can be said for the MR beta releases... 😬

RaphaelL
Kind of a big deal

Can't do, we do not run beta firmware in production 😔

@RaphaelL I have the same issue on one MX95 HA pair.

RaphaelL
Kind of a big deal

I suggest you open a case to get more visibility on that issue!

NFL0NR
Building a reputation

Just out of curiosity, you might want to call in and check the number of flows going to the device. We were having a similar issue, and it turned out we were flooding the 450 with DNS traffic. We offloaded the traffic and our numbers are looking MUCH better.

RaphaelL
Kind of a big deal

Even an MX85 has been struggling since we upgraded it this week:

 

RaphaelL_0-1728575280073.png

 

 

Support has suggested going to MX 19.1.4, since it contains some fixes that have not yet been ported to MX 18.2. Not a fan of running beta firmware while 'stable' firmware is not even stable, but I will give it a try.

Yes. I can confirm an MX105 HA pair is struggling to stay below 50%, and it is hardly doing anything.

 

Screenshot 2024-10-10 121401.png

RaphaelL
Kind of a big deal

When did you upgrade yours?

About one month ago.

RaphaelL
Kind of a big deal

So you already had problems prior to your upgrade... looking at the utilization around Aug 13th.

Before that, they recommended a reboot and I did it. The problem was solved. Then, after that, I opened a case and they said this:

 

I apologize for the issues you've been experiencing with the high utilization of your MX.

 

---
After investigating, I found that there is a known backend issue causing high CPU utilization following the upgrade to firmware version 18.211.3. As a workaround, we would need to disable the backend feature internally on your MX and downgrade it to version 18.211.2, which is the current stable release.

I recommend reaching out to our support team at your convenience by calling +1 415-937-6671. We’re available 24/7, and any engineer will be able to assist you with this issue.

----

I never called them. The whole backend issue sounds fishy to me.

 

RaphaelL
Kind of a big deal

Would be curious to know what backend feature they are referring to.

cmr
Kind of a big deal

Wow, that is some utilisation!  I haven't yet upgraded to 19.1.4, but 19.1.3 has been stable for months for me.

Just a question, but is rolling back to 18.1 not an option?

RaphaelL
Kind of a big deal

It is an option. But I want to get to the bottom of that story before downgrading since Meraki doesn't seem to know at all what is causing that.

I called them again but they still don't know the trigger. The workaround is the downgrade. They once again referred to the backend feature. Really odd! Meraki support is good when you need their assistance with basic things: configs, licenses, packet captures and so on. When the issues are bugs, you can never get a clear explanation.

That's most companies :).

JGill
Building a reputation

We saw the issue with MX85 / MX95 units on 18.107.2. MXs were running at 95% utilization or higher! 18.107.3 seemed to correct the issue.

 

After that, I learned to run this Python script / API call after upgrades!

 

import os

import meraki

# The API key and organization ID are read from environment variables.
API_KEY = os.environ.get('MERAKI_API')
organization_id = os.environ.get('MERAKI_ORG')

dashboard = meraki.DashboardAPI(API_KEY, caller='TopApplianceUtil.py/v1.0')

# Top appliances by utilization over the last 2000 seconds (about 33 minutes).
response = dashboard.organizations.getOrganizationSummaryTopAppliancesByUtilization(
    organization_id, quantity=500, timespan=2000)

print('length =', len(response))
for mx in response:
    print(mx['utilization'], mx['model'], mx['name'])
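
As a possible follow-up to the script above, the results could be filtered so that only the busiest appliances are listed. This is a minimal sketch: `high_utilization` is a hypothetical helper, the sample records are made up, and the nested `utilization` -> `average` -> `percentage` shape is an assumption about the API response that should be checked against real output.

```python
from typing import Any, Dict, List


def high_utilization(devices: List[Dict[str, Any]],
                     threshold: float = 50.0) -> List[Dict[str, Any]]:
    """Return devices at or above threshold (%), highest utilization first."""
    flagged = []
    for device in devices:
        # Assumed shape: {'utilization': {'average': {'percentage': 12.3}}, ...}
        pct = device.get("utilization", {}).get("average", {}).get("percentage")
        if pct is not None and pct >= threshold:
            flagged.append({
                "name": device.get("name"),
                "model": device.get("model"),
                "percentage": pct,
            })
    return sorted(flagged, key=lambda d: d["percentage"], reverse=True)


# Made-up sample records shaped like the assumed API response.
sample = [
    {"name": "branch-01", "model": "MX68CW", "utilization": {"average": {"percentage": 72.5}}},
    {"name": "branch-02", "model": "MX68CW", "utilization": {"average": {"percentage": 18.0}}},
    {"name": "hq-ha", "model": "MX105", "utilization": {"average": {"percentage": 95.1}}},
    {"name": "no-data", "model": "MX85"},  # record without utilization is skipped
]

for row in high_utilization(sample):
    print(f"{row['percentage']:5.1f}%  {row['model']:8s} {row['name']}")
```

With real data, `sample` would be replaced by the `response` list returned from the API call.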