CW9176I - Roaming Client frequent disconnect

BaskaranGanesan
Getting noticed

We are observing that roaming clients disconnect frequently and roam from one AP to another very often, which causes internet connectivity issues.

alemabrahao
Kind of a big deal

Do you have 802.11r or 802.11w enabled?

Did you do any site surveys before and after installing the APs? Sometimes it could be a bad wireless design.

I am not a Cisco Meraki employee. My suggestions are based on documentation of Meraki best practices and day-to-day experience.

Please, if this post was useful, leave your kudos and mark it as solved.
BaskaranGanesan
Getting noticed

802.11r is enabled and 802.11w is disabled. We recently did a site survey and deployed the APs about 60 ft apart from one another. Here is the layout:

AP1 ------------> AP2 -------------> AP3 -------------> AP4

 

The client is near AP2 and keeps bouncing between AP1 and AP3. Good roaming is fine, but we are getting bad roaming. We have tried multiple fine-tuning changes for the roaming issue, which didn't help. It is mostly the Intel AX211 (Wi-Fi 6E) clients that have the issue.

alemabrahao
Kind of a big deal

Try disabling 802.11r. If I'm not mistaken, this is a known issue with these Intel cards, and disabling 802.11r should solve it.

RaphaelL
Kind of a big deal

As mentioned by alemabrahao, try disabling 802.11r. What is the security on this SSID? WPA2-PSK or WPA2-Enterprise?

BaskaranGanesan
Getting noticed

I tried disabling 802.11r and client balancing, and manually set RXSOP to -79 dBm. It works for a while, and then the bouncing issue comes back.

The SSID currently has WPA3 enabled.

Paccers
Building a reputation

I recommend not playing with RxSOP settings unless you really know what you're tweaking for your environment. It's easy to cause unintended client disconnects with that (hence Meraki's big warning in that section).

BaskaranGanesan
Getting noticed

I am working with Meraki TAC on RXSOP, but we are still experiencing the same issue.

stgonzo
Getting noticed

Are the APs on firmware 31.1.6? Version 31.1.7 has some fixes for this type of issue, and it has now gone to stable release.

If you look at the connection logs, do you see failed connections with the reason "reserved"?

 

[Attached screenshot: stgonzo_0-1746035094535.png]

 

PhilipDAth
Kind of a big deal

What does wireless health say?

What does the roaming log for the client say in the Meraki Dashboard?

BaskaranGanesan
Getting noticed

Roaming Reasons (Intel)

"AP recommended the client roam to another AP, as part of AP Steering. 802.11v is the Wireless Network Management standard, allowing clients to be directed for load balancing and improving the performance of poorly connected clients." 

 

The timeline logs show the above as the reason for roaming.

The Meraki health dashboard shows everything as good.

Brash
Kind of a big deal

Sounds like the client is being actively rebalanced by client balancing.

You can trial turning it off in the RF profile. I've had some success disabling it in some networks, particularly dense environments with a lot of AP overlap.

 

Otherwise, if it's only some specific clients that are impacted, ensure the Wi-Fi drivers/firmware are up to date on those clients.

BaskaranGanesan
Getting noticed

Client balancing is currently disabled. We are still experiencing roaming clients disconnecting. We have tried as many workarounds as we could.

BaskaranGanesan
Getting noticed

Does anyone have the CW916I model installed in their environment?  

stgonzo
Getting noticed

Do you mean the 9176i? I have some and am having similar issues.

BaskaranGanesan
Getting noticed

Yes, I have both the 9176i and the CW9166I.

BaskaranGanesan
Getting noticed

Any suggestion from Meraki to fix the issue?

stgonzo
Getting noticed

They checked configuration and suggested upgrading to firmware 31.1.7, but this hasn't resolved the issue unfortunately

BaskaranGanesan
Getting noticed

It looks like we are having an issue with the CW product line. It's a tough time.

stgonzo
Getting noticed

It is tough. I think this is a firmware issue causing problems with 6 GHz. I have some Wi-Fi 6 Meraki APs on this network that work fine with no issues; it is the Wi-Fi 6E and Wi-Fi 7 APs that seem to disconnect client devices. The Wi-Fi 6E devices were on firmware 30.7.1 previously and had the issue then, but the low number of those APs meant it didn't impact too many users. I think I will try disabling the SSID on 6 GHz to see how it runs. Hopefully a future firmware release will resolve the issue.

BaskaranGanesan
Getting noticed

Yes, I completely agree with the suggestion. When I disable the 6 GHz, it works fine. 

I have reduced the number of APs by half and can see better performance now, but 5% of clients still have bad roaming. I also tweaked a few settings on the AX211 network adapter: roaming aggressiveness to medium-low, transmit power to medium-high, and preferred band to 5 and 6 GHz. After these changes, the problem is down to 5% of roaming clients.

 

I got verbal confirmation that it's a bug with Meraki OS.  

n01048
Here to help

Yep, I do, and I have been having roaming issues for the past 5 months!

Main10ence
Meraki Employee

Hello @BaskaranGanesan,

Have you tried calling Cisco Meraki support and troubleshooting live with an engineer? They may be able to catch something that we are overlooking.

If the timestamps of the roaming events match up with the user experience during that time, I would record the following information:

MAC address of the device

APs involved (and the RF of the APs)

The SSID 

The radio (2.4 or 5GHz)

The time

 

These details help a lot during troubleshooting. 
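
A minimal sketch of how those details could be logged per incident, assuming Python is available on the admin machine. The RoamIncident class, its field names, and the sample values are hypothetical and not a Meraki format; it only turns the list above into a CSV that can be attached to a support case.

```python
import csv
import io
from dataclasses import dataclass, asdict
from datetime import datetime


@dataclass
class RoamIncident:
    """One user-reported disconnect, with the details listed above (hypothetical schema)."""
    timestamp: datetime   # when the user noticed the problem
    client_mac: str       # MAC address of the device
    from_ap: str          # AP the client was on
    to_ap: str            # AP the client roamed to
    ssid: str
    band_ghz: float       # radio band, e.g. 2.4, 5 or 6
    note: str = ""        # what the user experienced


def to_csv(incidents):
    """Serialize incidents to CSV text, one row per incident."""
    fields = ["timestamp", "client_mac", "from_ap", "to_ap", "ssid", "band_ghz", "note"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for inc in incidents:
        row = asdict(inc)
        row["timestamp"] = inc.timestamp.isoformat()
        writer.writerow(row)
    return buf.getvalue()
```

Keeping one row per incident makes it easier to line up user reports with the roaming timestamps in the dashboard.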

.ılı.ılı. Cisco Meraki
Network Support Engineer

"The future favors the bold."
BaskaranGanesan
Getting noticed

Hi,

I have been working with Meraki TAC for the last 30 days, and the issue has been escalated to the engineering team. It's not working out so far; we are still experiencing disconnections here and there.

TAC case number: 12854457

 

 

Main10ence
Meraki Employee

Thanks!

I see now. Your case is a bit more complicated than "bad" configurations. Since the case has been escalated, the best path is to continue working closely with support.

n01048
Here to help

@BaskaranGanesan we are having the same issue on the CW9166I model: disconnections on 5 GHz and 6 GHz when roaming. We have a Meraki case open as well, with no fix as of yet. This is without 802.11r or client balancing. We reverted to 30.6 firmware, but the issue still exists there as well as in 31.1.6. I am going to try upgrading to 31.1.7 to see if this helps at all. Both Mac and Windows are affected.

BaskaranGanesan
Getting noticed

The latest update from our escalation engineer is that the bug fix has been pushed to their lab environment to see the differences. Hopefully, we can see the results by the first week of June.  

Try setting RXSOP as follows:

5 GHz to -79 dBm

6 GHz to -82 dBm

I can see some difference, but it is not a full fix.

OVERKILL
Building a reputation

Did they give you a version number by chance as to the anticipated release? I see 31.1.7.1 now as stable, but that's old enough that it can't be the version they were testing. 

n01048
Here to help

Same issue here. Did Meraki confirm it's an actual bug? And did upgrading to the latest firmware fix anything?

BaskaranGanesan
Getting noticed

Yes, I got verbal confirmation. I have planned the upgrade for next week and will let you know once it is done.

n01048
Here to help

Yes, I just spoke to support. I will try the RXSOP settings today and see if this helps. I am also doing the upgrade tonight.

BaskaranGanesan
Getting noticed

I have removed half of the APs on one floor to see the difference, and it looks good. I kept the APs 55 feet apart in a low-density area. It looks better now, but I am still seeing bad roams between bands.

 

https://documentation.meraki.com/Architectures_and_Best_Practices/Meraki_Wireless_for_Enterprise_Bes...

 

n01048
Here to help

@BaskaranGanesan how are you getting on? Even after the RXSOP adjustments I do not see any difference. I am on the latest firmware now, but that did not make any difference either.

BaskaranGanesan
Getting noticed

Did you adjust the AP spacing to 55 feet between APs?

These are the settings I am using:

5 GHz: target power = 14 to 20 dBm, RXSOP = -78 dBm

6 GHz: target power = 8 to 30 dBm, RXSOP = -78 dBm

Client balancing disabled.

Network driver updated to the Windows® 10 and Windows 11* Wi-Fi package drivers 23.130.1 for Intel® Wi-Fi 7/Wi-Fi 6E/Wi-Fi 6 adapters.

I can see some improvement, but not a complete fix.

OVERKILL
Building a reputation

Yeah, I'm noticing this too (have not upgraded to 31.1.7 yet), as I just rolled out 3x CW9172I's to replace some MR20's. It doesn't seem to be happening on my WPA3 PSK/6 GHz enabled SSID (could just be the reduced range of 6 GHz?), but it is happening on my 2.4/5 GHz WPA2 and 802.1X SSIDs.

 

One of the SSID's has 802.11r disabled, another has it enabled, both are experiencing the issue with iPhones and laptops. 

OVERKILL
Building a reputation

Disabling the 6Ghz radio seemed to have considerably improved things, but this is of course not a desirable scenario. 

n01048
Here to help

Oh really? That's interesting, because we have 6 GHz enabled, but only on a specific SSID that the majority of users are NOT connected to. However, the roaming issue occurs on both the 5 GHz and 6 GHz SSIDs. What firmware version are you on?

OVERKILL
Building a reputation

Yeah, we only had it on a "new" (WPA3) SSID as well, but it changes the overall radio config with it enabled. Disabling it stopped my users complaining and I'm not seeing the same roaming errors. 

 

I was on 31.1.6 and this change fixed it on that version. I'm running 31.1.7 now, and this "fix" has persisted.

BaskaranGanesan
Getting noticed

In our last conversation, the escalation engineer confirmed that the debug version is running in their lab. We can probably expect the new release with the fix in a couple of weeks.

 

n01048
Here to help

Got it. What issue did your users complain of specifically? I'm wondering whether I should turn off 6 GHz too, since the manual RXSOP settings didn't work.

BaskaranGanesan
Getting noticed

It happens because of bad roams from one AP to another that take more than 250 ms. During those roams we lose our internet connection. The worst observation is clients bouncing between radios.

If the user is 10 to 15 feet from the AP, everything works fine; if the user is right next to the AP or farther away from it, bad roaming happens.

The issue happens on both the 5 GHz and 6 GHz bands.
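
A rough sketch, assuming roam events have been copied out of the client timeline by hand, of how the two symptoms above could be counted: roams slower than 250 ms, and bounces where a client leaves an AP and comes straight back. The duration_ms key and the helper names are hypothetical, not a dashboard export format.

```python
SLOW_ROAM_MS = 250  # threshold taken from the description above


def slow_roams(roams, threshold_ms=SLOW_ROAM_MS):
    """Return the roam events whose duration exceeds the threshold."""
    return [r for r in roams if r["duration_ms"] > threshold_ms]


def bounce_count(ap_sequence):
    """Count A -> B -> A patterns in the ordered list of APs a client used."""
    triples = zip(ap_sequence, ap_sequence[1:], ap_sequence[2:])
    return sum(1 for a, b, c in triples if a == c and a != b)
```

For example, a client that goes AP2, AP1, AP2, AP3, AP2 has bounced twice, which matches the AP1/AP3 pattern described earlier in the thread.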

 

 

n01048
Here to help

Thank you @BaskaranGanesan @OVERKILL. I am seeing the same: our users roam between floors and then get disconnected, i.e. Slack or Outlook shows disconnected. Once they disconnect and reconnect Wi-Fi it seems better, but constantly having to do this is not an ideal fix.

OVERKILL
Building a reputation

Getting disconnected. This, as @BaskaranGanesan notes, is due to bad (high latency) roams that hang the client. Turning off 6Ghz seems to have stopped that issue for my clients, as the complaints stopped and I'm not seeing them in the logs like I was. 

n01048
Here to help

I'll try turning off 6 GHz. Did you adjust any RXSOP settings @OVERKILL, or did just turning off 6 GHz help?

OVERKILL
Building a reputation

Nope, no other changes, just turned off the 6Ghz radio. 

MrCado
Conversationalist

Been battling disconnections from roaming for like 6-7 months. We went from MR53 to CW9176i, trying all different settings and we were still having issues.

 

We managed to achieve seamless roaming with RADIUS + WPA3, with 802.11r and 802.11w active. 5 GHz and 6 GHz are both active as well.

It turns out these work well on Meraki's side. The most important thing to make sure is active is on the endpoint's wireless profile (make sure you're connected to the network):

 

Network and Sharing center -> Click on your Wifi Connection -> Wireless Properties -> Security -> Advanced Settings -> 802.11 Tab -> Enable both options under fast roaming (keep default values)

 

Apart from having to find this info on the Intel forums, this completely changed the roaming behavior for us. You can test by leaving a continuous ping running (ping -t 8.8.8.8) and moving around to force some roams. (At the moment we have literally 0 ping timeouts when roaming.)
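
If you save the ping output to a file, a small script can summarize the timeouts per roam. This is only a sketch, assuming English-language Windows ping output ("Reply from ... TTL=" for success, "Request timed out." for a loss); the function name is made up for illustration.

```python
def timeout_bursts(ping_lines):
    """Return the length of each run of consecutive lost pings."""
    bursts = []
    run = 0
    for line in ping_lines:
        text = line.strip()
        if "timed out" in text or "unreachable" in text:
            run += 1                      # another lost ping in the current run
        elif text.startswith("Reply from") and "TTL=" in text:
            if run:
                bursts.append(run)        # a successful reply ends the run
            run = 0
        # any other line (blank lines, headers) is ignored
    if run:
        bursts.append(run)
    return bursts
```

A result like [1, 3] would mean one single lost ping and, later, a run of three in a row; an empty list means no timeouts at all.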


On top of that, before the 31.1.7 version our CW9176i APs behaved erratically. From radios "freezing" to DHCP issues solved by restarting the APs, we saw all sorts of weird issues.

We updated to 31.1.7 two days ago, and today I noticed a massive improvement in general functioning.

 

Hope this helps someone out there, especially the roaming part

n01048
Here to help

Interesting. I turned off Auto SMPS in the network adapter settings and that seems to have helped so far; I am still continuing to test. Interesting that it's better with RADIUS, because that was going to be my next test. What options did you change under 802.11? I have also been having this issue since January with our CW9166I.

MrCado
Conversationalist

You mean for roaming?

 

I explained it assuming you manage the endpoints and you're using Windows.

 

1. Make sure 802.11r is enabled in your SSID in Meraki

2. On a managed laptop connect to WIFI

3. On the laptop navigate to:

Network and Sharing center -> Click on your Wifi Connection -> Wireless Properties -> Security -> Advanced Settings -> 802.11 Tab

 

4. Enable both options under Fast Roaming.

 

We do use certificate based authentication. We also send a Wifi Profile via Intune for our Corporate WIFI to the managed devices. Perhaps these settings only show in these cases?

 

For the last few months I've tried everything I could find online, plus other tweaks of my own, and this was the only thing that actually made a massive difference.

 

As for other issues, updating to the current latest firmware seems to have done the trick.

BaskaranGanesan
Getting noticed

I have gone through the multiple suggestions, and nothing has worked out. I am still having an issue with clients roaming between APs and between radios.

 

Do any of you guys see the same pattern?

alemabrahao
Kind of a big deal

I suggest you open a support case.

MrCado
Conversationalist

Not lately. Firmware is fully up to date and my setup is as described above, and it is working well on both 5 and 6 GHz.

In pings, my pattern before the fix would be:

 

1 timeout to roam -> 3 timeouts after a few seconds

 

After fix:

 

1 timeout to roam, or none whatsoever.

 

[Screenshot: laptop settings] These are the settings I was referring to.

 

Since we use cloud RADIUS, we also adjusted the authentication timings in Meraki, but I'm not sure it would matter much here.

 

And of course if nothing is working for you just open a support case

BaskaranGanesan
Getting noticed

I have tried all the recommended options on a Windows laptop. They don't help in our environment; the issue occurs on both Windows and Mac laptops. I have been working with TAC for the last three months, and troubleshooting is still going on.

JonM18
Here to help

We've had issues with roaming on the CW9176 with WPA3 and just got confirmation from Meraki that there is a known issue. It does not seem to affect the CW9172s. I'm not aware of any workarounds other than not using WPA3 on the CW9176. I haven't heard of any ETA on a fix.

n01048
Here to help

What did Meraki advise you? We are using CW9166I's, and we have the issue on both WPA2 and WPA3.

JonM18
Here to help

They said the issue is only seen on firmware 31.1.7, but we've seen it on all previous firmware versions.

 

They're asking if we've seen the issue on 31.1.7.1. We're testing that currently.

JGill
Building a reputation

FYI, we are running MR 31.1.8 and seeing the same issues. No change from that bump in code.

GIdenJoe
Kind of a big deal

I'm going onsite to a very angry customer in two weeks for a call with Meraki, where I will capture over the air (OTA) while they look at the backend logs. Wish me luck.

JonM18
Here to help

We're still seeing the issue on firmware 31.1.7.1. Meraki says it's an issue on a "subset" of devices with WPA3 authentication. We're having to turn off WPA3 (and thus 6 GHz) for any networks with CW9176s.
Also, the issue isn't just with roaming. Clients can get disconnected just sitting still.

GIdenJoe
Kind of a big deal

Hey, guys.

I'd like to add that I too have a deployment at a customer with CW9172I APs, and we see the APs disconnecting clients when their RSSI is around -60 dBm or lower, which is way too early.

I could stand with my back turned to the AP and run great speed tests, but in between my client got disconnected several times. The issue does not appear to have anything to do with roaming as such. However, because you are usually at a weaker signal when you roam, it also causes roams to fail, and clients then have to rescan channels to reconnect.

The logs clearly give reason=reserved, and this is clearly a crippling bug.

The issue also happens on the firmware version that has this issue 'fixed'.

I have also logged a case for this, but I first have to get through the initial resistance of first-line support, who want to see my 'survey data' before proceeding..... <frown>

JonM18
Here to help

The issue we're seeing appears as an association failure with reason "reserved", auth_mode="wpa3-psk", radio="2". Easy to find the issue under Assurance->Overview->Clients->Wireless->Association.
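
To see which clients or radios are hit most, the failures could be tallied from an exported event list. This is a hedged sketch: the dictionary layout (type, reason, auth_mode, radio, client_mac keys) is an assumption based on the fields quoted above, not a documented Meraki export format.

```python
from collections import Counter


def reserved_failures(events):
    """Keep only association failures whose reason is "reserved"."""
    return [
        e for e in events
        if e.get("type") == "association_failure" and e.get("reason") == "reserved"
    ]


def failures_by(events, key):
    """Count "reserved" association failures grouped by one field, e.g. "radio"."""
    return Counter(e.get(key, "unknown") for e in reserved_failures(events))
```

Grouping by auth_mode or radio would show whether the failures really cluster on WPA3 or on one radio in a given network.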

GIdenJoe
Kind of a big deal

Yes you can EASILY see that, but reason "reserved" says nothing about what is actually going on.

Even in bad Wi-Fi you are supposed to be able to stay connected to an AP at -82 dBm, although slowly. However, these APs seem to boot you off when you are at -62 to -65 dBm already, which is weird.

BaskaranGanesan
Getting noticed

I have followed all the best recommendations given in this forum and by TAC. Nothing helped me here. I have even raised a case with Intel directly for the clients' frequent disconnections. It's not working out.

Client laptops are sending probe requests, which are 802.11 management frames sent by clients to request information about the capabilities of SSIDs. Access points pass the acknowledged probe requests to the controller for processing. Based on that, the client decides to connect to another access point, even if it received a weak signal. After some time, the client roams to the AP with the strongest signal. This process is driven by the client device. With the help of the TAC team, I have disabled the management frames at the backend. Even after that, we are still getting the same result.

The Cisco Meraki team has to conduct a deep dive into RRM.

alemabrahao
Kind of a big deal

I strongly believe that the problem is not with Meraki but with the devices. After all, roaming is still a device decision.

 

Or a problem with your network design.

BaskaranGanesan
Getting noticed

No, we are good with our design. We have followed the recommendations from Meraki and a third-party Wi-Fi provider. Can you advise me? If a client is sitting between two APs, it's possible for it to bounce between them. Here, however, clients are bouncing to another AP that is farther away from the client's location. Why is the AP acknowledging the weak-signal probe from the client machine? And how does Auto RF optimize the signal/channel then?

OVERKILL
Building a reputation

Could be. I haven't re-enabled 6Ghz on any of mine as things have been stable with it disabled, but am considering doing some testing with it enabled on the weekend to see if the issue is still present with 31.1.7.1. Note that these are CW9172I's. 

JonM18
Here to help

No. We've observed these issues in the simplest of deployments where there should be no roaming occurring for any reason. The "reserved" association failure should not be happening. Meraki has ADMITTED to an issue with the CW9176. The issue is linked with having WPA3 enabled, not with enabling the 6 GHz radio.

OVERKILL
Building a reputation

FWIW, my issue, which seemed to mirror that of the OP but on CW9172I's, was not restricted to WPA3 enabled networks, in fact it was mostly happening on my WPA2 networks. Disabling the 6Ghz radio stopped the issue from occurring, even though 6Ghz was only being used on the WPA3 network. 

 

My only other thought was power consumption, since the switches are MS210-48FP's which are 802.3at, but unless I'm running max power (which I'm not) the power budget should be fine (consumption is 9.8W with 6Ghz disabled) and the AP's are rated to work on 802.3at. 
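
As a quick sanity check on that power budget, here is a small sketch. The per-standard figures are the power guaranteed at the powered device by IEEE 802.3af (12.95 W) and 802.3at (25.5 W); the 9.8 W draw is the figure quoted above with 6 GHz disabled, and the draw with 6 GHz enabled is not known from this thread. The function name is illustrative only.

```python
# Power available at the powered device (after cable loss), in watts.
POE_PD_WATTS = {"802.3af": 12.95, "802.3at": 25.5}


def poe_headroom(draw_watts, standard="802.3at"):
    """Return the remaining watts between the PoE standard's limit and the AP's draw."""
    return round(POE_PD_WATTS[standard] - draw_watts, 2)


headroom = poe_headroom(9.8)  # draw measured with the 6 GHz radio disabled
```

With the measured draw there is plenty of room on 802.3at, so power would only be a suspect if the draw with 6 GHz enabled came close to the 25.5 W limit.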

Eric3
Here to help

Hi,
I've seen some Wi-Fi adapters set to very aggressive roaming by the user. Because of the fluctuating nature of radio signals, the client then jumps frequently between APs.
- What band do you use, and do you see this behavior on every band?
- Is it the same client or the same brand of device? That would point to a driver issue.
- If the device is an Android or Linux device, you need to disable mandatory DHCP. Those devices don't like to be questioned about DHCP with every roaming step.


n01048
Here to help

@BaskaranGanesan I have now finally, after 7 months of troubleshooting, fixed the issue. On Windows, the fix was to change the group policy setting "Allow network connectivity during connected-standby (on battery)". We tried this fix on our affected Windows users and the issue is now resolved, with no further issues for around 2 months.

BaskaranGanesan
Getting noticed

@n01048 Good to hear from you. I tried this option five months ago, but it didn't work out for me. We have the issue on both Mac and Windows laptops. We are receiving strong interference from a neighbor on the 5 GHz band, so I am experimenting with low power on 5 GHz to see the effect on the RF interference. So far, I can see a 10% improvement. However, I am still having an issue with clients bouncing between APs: the client accepts a weak signal and then bounces back to a strong signal.
