High Latency on APs - bouncing the AP seems to resolve the issue

Kleverett
New here

Hi all,

 

We currently have a case open with Meraki going on 6 days now. At multiple sites we've been seeing sporadic issues with our clients passing traffic through access points. When we look at the AP in question we are seeing high latency from it (100ms+). We bounce the AP and everything is seemingly good from there. Wanted to post our situation here to see if anyone else is having this experience as we wait for support to take a look at our latest packet capture.

 

Thanks!

26 Replies
Ryan_Miles
Meraki Employee

What model(s) AP?

What firmware?

Happening on all SSIDs or just some?

Happening to all clients or just some?

Ryan

Kleverett
New here

Hi there.

 

MR45/46 

Firmware MR 28.5
 
I have yet to confirm that it is happening on all SSIDs as our staff currently only has access to 1 SSID, but you raise a good point. We are starting to look into our radio settings now.
 
One hypothesis right now involves the minimum bitrate configuration: the 2 sites in our organization that have not reported issues set their minimum bitrate per band, while the sites having issues are configured per SSID. We are looking at changing the affected sites to per band.
 
Thank you
PhilipDAth
Kind of a big deal

I have seen this caused by Windows power saving.  On a Windows 10 machine, try this (which disables power saving for the WiFi NIC):

powercfg /SETDCVALUEINDEX SCHEME_CURRENT 19cbb8fa-5279-450e-9fac-8a3d5fedd0c1 12bbebe6-58d6-4636-95bb-3217ef867c1a 0
powercfg /SETACVALUEINDEX SCHEME_CURRENT 19cbb8fa-5279-450e-9fac-8a3d5fedd0c1 12bbebe6-58d6-4636-95bb-3217ef867c1a 0

 

Another common cause is WiFi driver bugs.  Try checking the manufacturer of the WiFi chipset for driver updates.  Note that the OEMs (Lenovo, Dell, etc.) often don't publish the current version of the driver, hence the need to check the WiFi manufacturer's web site.

 

I hope this is on 5 GHz.  2.4 GHz is very problematic and issues are common.  Can you disable 2.4 GHz on the WiFi to get rid of this problem area?

YC2
Here to help

Running into something similar. Clients on the AP can associate and authenticate just fine, but user traffic has such high latency it's essentially unusable. If you're only seeing 100ms you're lucky... ours reaches 4 digits. Reboot and it's peachy until the next AP over, or the one on the other side of the building, does the same thing. Reboot, rinse, repeat.

 

Seems to have started a few months back and is getting progressively worse. Happens across multiple platforms. Worked with support earlier after getting another example we could replicate. Captures indicated simple pings weren't making it to the wired side of the AP. The same client can roam to another AP and work just fine, with no changes other than walking away from the culprit AP.

 

Auto upgrade to 28.6 (from 28.5) was set to happen in a week. Forced it for overnight. Time will tell.

WB
Building a reputation

There's been some chat around a bug fix noted in the 28.6 release notes:

 

'MR could enter a state where ADDBA requests would not be initiated, resulting in reduced throughput (Wi-Fi 5 Wave 2 MRs)'

 

@YC2 would be good to know if 28.6 provides a permanent fix for you!

Agus
Getting noticed

This problem happened to me too, with devices using 2.4 GHz. When I looked at the dashboard, most clients were connected on 2.4 GHz and only a few devices were on 5 GHz. I think 2.4 GHz was already saturated, maybe with one of the users still running a download on it, and I think that caused the high latency.

YC2
Here to help

While that's a possibility, that's not what was happening here. My two test clients were on the same AP, with no other clients on it, on 5 GHz.

 

After fighting the automatic firmware update scheduler, all APs at the site are on 28.6 now. Time will tell.

Jai007
Conversationalist

I am curious if the upgrade to 28.6 solved this issue for you?

YC2
Here to help

Unfortunately no, not completely. Maybe made it less often. Haven't heard anything since the upgrade in March. Then in the last week or two I got another report. This time we haven't been able to replicate to allow for troubleshooting. The users tend to report the issue, then move/roam, powercycle, etc until it starts working, leaving us stranded on the troubleshooting front.

 

I'll take every few months vs every few weeks though.

Jai007
Conversationalist

For us this seems to be related to the recent upgrade from 27.5.1 to 28.5.  So we are going to try rolling back to the previous version and see what the results are.

YC2
Here to help

Got reports of this again today. Tech on site was able to replicate. Got Meraki on the line and went at it for a long while. Our entire Meraki infrastructure is all L2. SVIs, routing, etc. happen on an upstream device.

 

The client had heavy packet loss to its SVI, and on packets that did get a response, the response time was through the roof: 500+ms to the 4 digit range. Capturing on the AP's WiFi interface didn't show anything out of the ordinary; in fact it showed all normal times and no drops. Once we started capturing on the wired port, things suddenly started working normally again.

 

Makes me think either the packet didn't make it to the AP, or the AP dropped it without notice.
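
For anyone trying to catch this failure mode before users move on, here is a minimal Python sketch that summarizes saved ping output against the client's gateway/SVI. It is only an illustration: the function name `summarize_ping`, the 100 ms threshold (taken from the latency figure in the original post), and the 10.0.0.1 address are assumptions, and the regex only targets the common `time=12ms` / `time<1ms` reply lines printed by Windows and Linux ping.

```python
import re

# Matches the per-reply RTT in typical Windows/Linux ping output,
# e.g. "time=4.2 ms", "time=12ms" or "time<1ms".
RTT_PATTERN = re.compile(r"time[=<]\s*([\d.]+)\s*ms")

def summarize_ping(output, sent, threshold_ms=100.0):
    """Summarize ping output; `sent` is the number of echo requests sent.

    threshold_ms is an assumed cut-off for "high latency", not a Meraki value.
    """
    rtts = [float(m.group(1)) for m in RTT_PATTERN.finditer(output)]
    received = len(rtts)
    loss_pct = 100.0 * (sent - received) / sent if sent else 0.0
    worst = max(rtts) if rtts else None
    return {
        "received": received,
        "loss_pct": loss_pct,
        "max_ms": worst,
        # Flag the run if anything was lost or any reply was slow.
        "degraded": loss_pct > 0 or (worst is not None and worst >= threshold_ms),
    }

# Hypothetical capture of 4 pings where one was lost and two were slow.
sample = (
    "64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=4.2 ms\n"
    "64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=1350 ms\n"
    "64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=512 ms\n"
)
print(summarize_ping(sample, sent=4))
```

Running something like this on a schedule from a test client and keeping the flagged runs with timestamps might give support a concrete window to line up against the AP captures.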

 

Coincidentally, the AP the client was connected to was on the opposite side of the building, down a floor. Auto power has it so high, though, that the client hears it louder than the units much closer (and on the same floor!). I don't believe this has to do with the issue at hand, though, because the client can pass traffic through the farther AP just fine when not in this failure mode.

thomasthomsen
Kind of a big deal

There is a known bug with the MR46 where it, over time, does something(tm) that causes low throughput.

Reload, or a "shut / no shut" of the SSID fixes the problem.

This issue should be fixed in 29.2 (and only 29 software as far as I have heard, since it requires a Radio Firmware upgrade).

Perhaps that is what you are running into.

YC2
Here to help

For all clients on the same SSID? Nah... this is unique to individual clients... other clients on the same AP and same SSID work just peachy.

 

Had a second incident reported a couple of days after the one mentioned above. This one got 0 response from its SVI, but again other clients on the same AP were fine. This client was OK after roaming to a neighboring AP, then failed again when it went back to the problem AP. After packet capturing... the ARP request the client makes to the SVI never received a response. Packet capturing on the switchport the AP is connected to, though, shows the request AND the response. It appears the AP never transmits the response back to the client. The client was receiving other random broadcast/multicast traffic from the AP, though, so it doesn't seem to be an RF issue. The roam/dot1x/DHCP/etc. worked fine.....
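
The comparison described above (reply visible on the switchport, never seen by the client) can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions: `ArpRecord` and `missing_replies` are hypothetical names, the records stand in for rows exported from whatever capture tool is in use, and the addresses are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ArpRecord:
    """One ARP packet reduced to the fields needed for the comparison."""
    op: str          # "request" or "reply"
    sender_ip: str
    target_ip: str

def missing_replies(wired, client):
    """Return ARP replies seen on the wired side but absent from the client capture."""
    client_replies = {r for r in client if r.op == "reply"}
    return [r for r in wired if r.op == "reply" and r not in client_replies]

# Hypothetical data: 10.0.0.1 is the SVI, 10.0.0.50 is the affected client.
wired_capture = [
    ArpRecord("request", "10.0.0.50", "10.0.0.1"),
    ArpRecord("reply", "10.0.0.1", "10.0.0.50"),
]
client_capture = [
    ArpRecord("request", "10.0.0.50", "10.0.0.1"),
]

for lost in missing_replies(wired_capture, client_capture):
    print(f"reply from {lost.sender_ip} to {lost.target_ip} never reached the client")
```

Any reply this lists made it to the switchport but not over the air, which matches the symptom of the AP not transmitting the response.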

Navidg
Comes here often

Have you found any fix yet?

 

thomasthomsen
Kind of a big deal

Interesting. I have seen something similar, I think. But it seems to be kind of a "random" problem, and to me it seemed to only affect certain "older" clients, and it always seems to happen when the client roams (again, like you, the roam, dot1x and so on seem fine, but from the client's perspective the network does not forward traffic). I'm really looking forward to testing the 29.2 software, just to see if anything changes.

Speedbird1
Getting noticed

 

We have a very similar or identical issue on MR56s running 28.6.1. The APs all slowed down: no client could get more than 10 Mbps, and pings were >100 ms.

On a normal day we get 200 Mbps throughput and pings <8 ms.

 

Full reboot of the site AP's resolved it. 

 

The known issues on 29 firmware don't look promising either 

 

  • AP performance may degrade over time (Wi-Fi 6 APs)

 

WB
Building a reputation

Release notes for 29.2 have just come out listing 'AP performance may degrade over time (Wi-Fi 6 APs)' in the Bug Fixes section. It is not included in 28.7, however!

Speedbird1
Getting noticed

Ah  yes thanks ... Just seen that 

YC2
Here to help

DO NOT TOUCH 29.2 with a 10 ft pole. Maybe 20ft. Make it 20km.

 

Tried it over the weekend. Was a disaster.... APs were rebooting every 10-30 min. Apparently 29.2 has an issue with RADIUS that does this. Reverted back to 28.6. Support has indicated the degradation happens after approximately 2 months of uptime, so we'll be on a 1 month reboot schedule.
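
A rough Python sketch of how a reboot schedule like that could be tracked: given each AP's last boot time, list the ones at or past a chosen uptime. The function name `due_for_reboot`, the AP names, and the dates are hypothetical, and the 30 day default is just one way to read "1 month"; where the boot times come from (dashboard export, monitoring system) is left open.

```python
from datetime import datetime, timedelta

def due_for_reboot(boot_times, now, max_uptime=timedelta(days=30)):
    """Return the names of APs whose uptime has reached max_uptime, sorted."""
    return sorted(
        name for name, booted in boot_times.items()
        if now - booted >= max_uptime
    )

# Hypothetical inventory: AP name -> last boot time.
boot_times = {
    "ap-floor1-east": datetime(2022, 5, 1, 2, 0),
    "ap-floor1-west": datetime(2022, 6, 10, 2, 0),
    "ap-floor2-east": datetime(2022, 4, 20, 2, 0),
}
now = datetime(2022, 6, 15, 2, 0)

print(due_for_reboot(boot_times, now))
```

Staying well under the roughly 2 month mark leaves some margin if a scheduled reboot gets missed.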

Jai007
Conversationalist

What MR model do you have? We have been running 29.2 for 3-4 days on MR55 and haven't had any rebooting issues so far.  Using Cisco ISE Radius.

YC2
Here to help

I wish you the best of luck then. These were all 45's, ISE 3.0.

YC2
Here to help

Circling back for completeness' sake. We were passing the group policy name from ISE to Meraki using the Airespace-ACL-Name RADIUS attribute (and have been for years). This will make the connection fail and also reboot the AP every 10-ish minutes if the AP is on 29.2 Beta. If you use Filter-Id instead, it lets you connect and the AP will not spontaneously reboot.

 

Afraid of what else 29.2 will reveal. 

WB
Building a reputation

Wow, good catch! It sounds almost like that software is still in beta or something!?

YC2
Here to help

I want to take the credit but unfortunately I can't. Support alerted me to the bug. The frustrating part isn't that there's a bug in beta firmware. The aggravation comes from a bug that's known to support but not in the release notes, on top of the fact that this fw was a suggested alternative to fix a different bug.

Jai007
Conversationalist

Thanks for posting all this, really helpful and interesting bug.  Don't get me started on the transparency of these things from support!  I also live in fear of the next release, so far 29.2 in an office of about a dozen MR56 APs is working ok.  Only 2 weeks in though so often these things take time to rumble to the surface.  I'll post here with anything I find though.
