MR46 / MR55 Wi-Fi 6 APs disconnect from dashboard

JGill
Building a reputation


Anyone else seeing MR46 APs dropping / losing connection from Dashboard when running 28.6 firmware?

We see 20-30 devices drop across the country, on different networks and in different geographic locations, usually all at the same time.

 

Looks like a back-end dashboard issue, but trying to get past "it's in engineering" is getting old.

 

I'm looking for other Meraki Customers with similar experiences so we can compare AP version / Firmware versions and get some visibility.

 

The APs do continue to pass traffic and are online. So it's not an outage issue, but it's a PITA if you manage 3,500+ access points.

 

If you reboot the APs they come back online in dashboard, of course taking the AP out of service in doing so. Or if you wait for the DHCP half-life renewal point (24-hour leases, renew at 12 hours), they will check back into Dashboard and go green.
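To put numbers on that renewal window: a DHCP client normally tries to renew at T1, which defaults to half the lease time (RFC 2131). Here is a minimal sketch, assuming that default and assuming the AP really does check back in at renewal (that part is just what we observe, nothing confirmed by Meraki). `dhcp_t1` is a made-up helper name and the lease lengths other than 24 hours are only illustrative.

```python
from datetime import timedelta


def dhcp_t1(lease: timedelta, t1_fraction: float = 0.5) -> timedelta:
    """Return the interval after which a DHCP client attempts renewal (T1).

    0.5 is the RFC 2131 default; a DHCP server may override it,
    so treat the result as an estimate.
    """
    return lease * t1_fraction


# Illustrative lease lengths in hours; 24 is the one used on our networks.
for hours in (4, 24, 72):
    lease = timedelta(hours=hours)
    print(f"{hours} h lease: renewal attempt after {dhcp_t1(lease)}")
```

If the check-in really is tied to renewal, the longest an AP should sit "offline" in the dashboard is roughly one T1 interval, so shorter leases would shrink that window.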

 

Just one of many issues, but posting this in the AP group to see what others are experiencing.

 

If there are Meraki product owners for MR, MS, MX or Dashboard components, I'd love to have a conversation.

 

    

 

47 Replies
LilRhody
Here to help

Just replying back to your thread since you posted helpful info to my question. Our MR46 AP's began dropping as soon as we pushed out 28.6, prior to that they were rock solid and pretty much never went down unless power went down.

JGill
Building a reputation

Same here. Trying to find out if we should roll back until they get this corrected, but we don't want to risk dropping any APs on a rollback if there is a solution coming soon. So far no guidance either way. Roll back and expose security issues, or stay current and drive the network team crazy. ☹️

MiguelMVLA
Here to help

Also use MR55s, as well as MR56, MR84 and MR86 on 28.6. For us the AP drops started after updating our MS390 firmware to 15.14 on 4/29/22.

 

Can confirm it’s not a cosmetic issue, the APs stop responding to pings and clients cannot connect.

 

Power cycling the AP’s switchport restores functionality, and I don’t believe I’ve had to restart the same unit twice, but it’s still frustrating.

 

This morning I had 9 APs down:

DED84C33-9B14-4EEB-98D2-DE12B43C1319.jpeg

 

Also considered upgrading to 28.6.1 but it doesn’t seem to address the issue.

Miguel
EJN
A model citizen

Similar experience. I'm on 28.6.1 now. With no pattern at all, here and there I have APs that lose the connection. Before, they would regain it fairly quickly... 3-5 minutes. Now, occasionally, they will stay offline unless I power cycle the port in the switch. Some weeks it happens to 1-5 APs, some weeks like this past one, none.

Esteban J Nunez
School and Church
K-12 Education
EJN
A model citizen

3 APs disconnected earlier today. For some reason they don't come back online automatically. If I cycle the port they reconnect. If I wait long enough, eventually they may come online. There is no pattern to this, which frustrates me. Firmware 28.6.1.

 

Today it's these 3. Tomorrow most likely none. Next week it will be another 2-4 and so on. Different switches, different APs.

 

These 3 are on the same switch. But there are other APs on the same switch and they didn't disconnect.

 

$%#&

 

Screenshot 2022-05-18 170523.jpg

Esteban J Nunez
School and Church
K-12 Education
JGill
Building a reputation

They come back when they hit their DHCP half-life renewal. Why? I'm assuming they check back in because of a potential IP address change. So depending on when they drop and what your renewal window is, they could be down for some time. It's definitely tied to the NeXT Tunnel issues. Got pretty ugly on our side last night; sounds like they were making changes. In our case we have done enough packet captures off of switches to know they are still passing traffic and serving clients. We can ping them and they reply. We can see them talking to the Meraki Cloud servers; they just show as down in the dashboard.

 

PITA as we still need to verify every AP.   

 

Waiting on them to put the Dashboard and MR teams into a room and get this figured out.  At least let us watch the cage fight!  

Speedbird1
Getting noticed

FWIW 

Interesting investigation

 

I dropped our DHCP lease time on some sites to 4 hours. I have not seen these sites go offline in the dashboard for over a week now, while other sites that are still on leases of 3+ days do go offline randomly.

 

 

JHern
Here to help

Updates at bottom of posting.

 

------

 

There is definitely a known issue with some WAPs running 28.X code dropping / losing connection with the Meraki cloud dashboard. A change was made between MR 27.x and 28.x code on how the WAPs communicate to the Cisco Meraki Cloud to support FIPS compliance, and that appears to be the likely root cause. See FIPS links at bottom. If you are seeing this issue and have not opened a case, it's probably time to do so. My case number is 08050752 if you'd like to reference it.

 

I have a pair of MR56 WAPs, one running 28.6.1 and the other running 27.7.1. The historical device data for connectivity to the cloud (below) speaks for itself. The Support Engineer I am working with says I am seeing it a lot worse than most folks.

 

20220518 Screenshot.png

 

FIPS compliance Information:

 

Lastly, a similar change also appears to have been made to the MS 15.1+ code and to the MX 16.4+ code as well. I have no idea if these new code levels will have similar problems, but a little extra caution is warranted for now.

 

-------

 

UPDATE from Meraki Support: "I have narrowed down your issue to only the MR56s or Wifi 6 APs for now. The switches typically do not show this behavior and it is not expected you will experience this for the switches."

JHern
Here to help

Update with SOLUTION to my issue.

 

The issue I have been seeing is separate and distinct from the issue of WAPs disconnecting from the cloud occasionally. If you are seeing the connect and disconnect happen every 3-10 minutes over and over and over, read on.

 

Actual problem found was that our new Internet link (about 3 weeks old) was misconfigured for an MTU of 1496. (Should have been 1500). As soon as the MTU was corrected, connectivity from the MR56s running 28.6.1 to the Meraki cloud became stable. 

 

Lessons learned:

  • Meraki MR WAPs with 28.X code seem to REQUIRE an MTU of 1500 on the path between the WAP and the Meraki cloud. (I have asked if the WAPs are capable of doing Path MTU Discovery.)
  • Check the MTU of your network, especially your Internet connection.
    • The Ping command is handy for this since you can set the do not fragment (DF) flag to enabled and have a user-settable payload size.
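For anyone who wants to try the ping check, here is a rough sketch that just builds the command line. It assumes IPv4 (20-byte IP header plus 8-byte ICMP header, so a 1500-byte MTU leaves 1472 bytes of payload) and the standard DF options of Linux iputils `ping` (`-M do`) and Windows `ping` (`-f`). `df_ping_command` is a made-up helper name and `example.com` is only a placeholder target.

```python
import platform
from typing import List, Optional

IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def df_ping_command(host: str, mtu: int = 1500,
                    system: Optional[str] = None) -> List[str]:
    """Build a single ping with Don't Fragment set, sized to fill `mtu`."""
    payload = mtu - IP_HEADER - ICMP_HEADER
    system = system or platform.system()
    if system == "Windows":
        # -f sets DF, -l sets the payload size, -n sets the count
        return ["ping", "-f", "-l", str(payload), "-n", "1", host]
    if system == "Linux":
        # -M do prohibits fragmentation, -s sets the payload size
        return ["ping", "-M", "do", "-s", str(payload), "-c", "1", host]
    raise NotImplementedError(f"no DF ping recipe for {system!r} in this sketch")


# Print the commands for a full-size and a reduced-size probe (Linux form).
for mtu in (1500, 1496):
    print(" ".join(df_ping_command("example.com", mtu, system="Linux")))
```

If the 1500-byte probe fails while the 1496-byte one gets a reply, something on the path has an MTU below 1500, which is the pattern our misconfigured link would have shown.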

 

I hope this saves someone else a few hours of aggravation.

 

One more thing: An Engineer from another firewall company took packet captures at the firewall upstream from the WAPs and before the Internet link where the issue was. He noticed that we had a lot of TCP packet retransmissions on larger packets with the DF (don't fragment) bit set. That was the hint that led to finding the MTU problem on the provider's link.

JHern
Here to help

One last update: We were also seeing the sporadic WAP disconnects from the Meraki Cloud issue on 28.6.1. Upgrading to 28.7 resolved the issue. No unexpected disconnects since the upgrade.

 

I have confirmed with Support and my SE that the issue was specifically addressed in 28.7. 

 

Net: If you are on 28.X code, upgrade to 28.7.

krice01
New here

Same here. We have 38 MR56 access points currently offline within the dashboard. I can ping the local IP addresses but nothing else. They're coming up, then going down intermittently.

BCC-SAS
Here to help

Same issue here - my MR46 and MR36 running 28.6 are dropping off the dashboard without any real regularity - will show as offline for 45 minutes to 3+ hours, still passing traffic and getting new connections, but it is frustrating from an alerting standpoint.

DetroitJeremy
Here to help

I had a very similar experience (Wi-Fi 6 AP running 28.6 shows offline, continues to serve clients, didn't respond to ping, I could get to the local status page of the AP from a connected client, cycling the switch port on my MS switch resolved it or it self-resolved after time). I opened a ticket and support asked that I don't cycle the port and let them know next time it happens. It hasn't happened since. I also have some peers who are seeing this occur.

EJN
A model citizen

For all of us experiencing this: if you have an open case with support, please add the URL of this thread to your case notes. You can do so under case management in the help menu in the dashboard.

Esteban J Nunez
School and Church
K-12 Education
JGill
Building a reputation

Does sound like they attempted a fix last night (that failed), but at least they are working on it! 

smithm_worc
Comes here often

I got the problem too.  Once I went to 28.6 code it all went downhill, tried 28.6.1 as well with no help.  My case number:  07967754.   It's only my MR56 units.  My MR42 and MR34 units are fine.

EJN
A model citizen

Back again today. 2 APs earlier and 2 APs now, all MR46. I need to cycle the switch port for the MR device in order to regain cloud connection.

Esteban J Nunez
School and Church
K-12 Education
LilRhody
Here to help

Same, just had 2 go down and had to bounce the port to get it back (they did continue to pass traffic while offline)

krice01
New here

Same. We have at least 4-5 offline alerts every day. Typically false positives. Very frustrating that Meraki deems these "backend issues" without additional context.

smithm_worc
Comes here often

I have a rather large system....had 15 go offline this AM across 13 different networks.   Toggled PoE on their switch port on all of them to regain connectivity to the cloud.  No sooner was I done, 12 more dropped off.   This is nuts.

JGill
Building a reputation

We dropped 33 APs this morning across 27 network locations. We opened that case on April 25. This is beyond nuts!

DetroitJeremy
Here to help

I added a link to this thread in my case (07948308). I have been lucky as this is occurring infrequently for me and only affects a couple of APs at a time. However, what is the point of a dashboard if you can't trust it and its notifications?

JGill
Building a reputation

Adding the link to this thread to our case 07965498 as well.  

 

smithm_worc
Comes here often

Added to my case as well.  Opened on April 25th.   Case number:  07967754

Speedbird1
Getting noticed

Has anyone noticed whether the MRs stop allowing clients to connect when they stay offline from the dashboard for an extended period? Had one this morning and had to bounce the port on it.

LilRhody
Here to help

I have not experienced that. When mine go "offline" they still allow connections and pass traffic but the AP and any clients attached to it will appear offline in the dashboard

Speedbird1
Getting noticed

Hmm, strange. Ours were offline for 8 hours, and they also showed "config not up to date".

Different question then: do MRs stop accepting clients if they can't contact the dashboard for a set time and are deemed a dead AP, or does the last known config apply and the MR carries on serving clients?

 

LilRhody
Here to help

Mine also show "config not up to date" when they go offline. I imagine they hold onto the last good config if they can't contact the dashboard

Speedbird1
Getting noticed

Thanks for the info, this one must be a one off 

EJN
A model citizen

After a few days with no disconnects, today I had about 10 MRs disconnect at different times (morning Eastern time). I don't mind the disconnect as much as the inability to reconnect automatically unless I power cycle the port.

Esteban J Nunez
School and Church
K-12 Education
pjc
A model citizen

I can confirm I'm seeing this on some of my MR44s. For example, one AP showed as down overnight and all morning, and then reconnected to the cloud with no intervention on my part. Other APs on the same switch were all OK.

Looking in the event logs, it appears that clients are trying to connect, but it is unclear whether they succeed. The performance graph shows no clients connected while the AP is reported as down, so I don't know if clients have connected OK and the AP is just not able to report this in the performance report.

 

This all started after I upgraded all of my AP's to 28.6.1 about 4 weeks ago

 

Trouble is I'm unable to roll back firmware as it's over 14 days - Will need to log a call I think if this is actually stopping clients from connecting

JHern
Here to help

I'd go ahead and open a case. There is a known issue of WAPs disconnecting from the cloud occasionally. 

JGill
Building a reputation

Still looking to see if this is a dashboard reporting issue or a firmware issue. We know it's a Next Tunnel related issue, but we're looking for a team to stand up and say "we own this issue!" Otherwise the only option is to roll back code and accept putting the corrected security flaws back into your production network!

 

Going dark on the issue is not really an answer.  Anyone from Cisco / Meraki care to take ownership of this issue and share remediation plans? 

 

Anyone up for some round table sessions with Cisco / Meraki Executives at Cisco Live to go over issues, stability and product road maps?   I'm happy to start putting together a group!  

 

 

Holli69
Getting noticed

Same here in Europe with MR44 APs running 28.6 firmware. Screenshot connectivity for last week. Disconnects appear from last Friday on.

Holli69_0-1654629010089.png

 

ajcamacho
Comes here often

following this thread

pjc
A model citizen

I logged a call yesterday as I am seeing the issue with some of our MR44s.

 

I'm sure support won't mind me posting some of their reply here as follows:

 

" As you're aware (by referencing the community page), there is indeed an issue with 28.x firmware and the Meraki cloud communication. Our engineering team is aware of it for a while now and is working on a solution. Unfortunately, I do not have ETA when it's going to be released."

 

They also said they do not believe that clients are affected either connecting to the AP or connectivity through it while the AP is disconnected from the Cloud Controller. 

 

I have yet to personally verify this, but I see normal client activity in the event logs for the affected AP

 

Hope this helps

Speedbird1
Getting noticed

Hi, since dropping our DHCP lease time to 4 hours for the APs I have not seen any disconnects for over 2 weeks. (Someone mentioned DHCP renewal earlier so I thought I'd try this.)

Someone asked whether clients can still work through a disconnected AP. Mine do; I only had 1 AP that didn't, but that could have been something else.

JGill
Building a reputation

Guess they spent all the developers' time on this new dashboard view vs fixing the actual issue.

 

JGill_0-1655990266727.png

 

Shawn_Kingston
Here to help

Hello everyone, we recently purchased (6) MR46 APs and put one in testing. It automatically updated the firmware and then we began to experience the same issues listed. We have cycled the switchport but it has not come back up yet. We can ping it, but it does not respond to the dashboard. Is there an update for this issue, or is everyone still just rolling back the firmware as a temporary fix? I ask because I scheduled the upgrade to MR 29.2. It said it upgraded but it still will not reach the dashboard. I did read something about the firewall ports used for cloud access changing after the 28.X upgrades; could that be an issue?

Shawn_Kingston_0-1661182220074.png

Shawn_Kingston_1-1661182599241.png

 

 

 

 

ajcamacho
Comes here often

The ports to connect to Meraki Dashboard changed with version 28; check in your Internet connection FW that you are allowing the new ones. Also check that your APs are getting a DHCP IP, because that was another issue we faced and it was DHCP related.

Shawn_Kingston
Here to help

FW is open for the AP MAC addr to reach the Internet via the TCP/443 port (new with v28+). I tried changing from DHCP to a static address but it will not accept the change via the dashboard. I clicked save but it still shows DHCP is assigned. I've cycled the switchport to no avail. Before the first auto firmware upgrade and up to about 12 hours after, it was connected to the dashboard. It just suddenly disconnected this morning but would still ping through the network. 🤔
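A quick way to sanity-check outbound TCP/443 from the same VLAN is a plain socket connect. This is only a sketch: `tcp_reachable` is a made-up helper name, and it tests from whatever machine runs it, not from the AP itself, so a firewall rule keyed to the AP's MAC address could still behave differently.

```python
import socket


def tcp_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and DNS failures
        return False
```

Call it as `tcp_reachable("your.cloud.destination")` with the destinations Meraki lists for your organization; that hostname is a placeholder. A `False` here points at the firewall or upstream path, while a `True` only shows that this one machine can get out on 443.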

JGill
Building a reputation

28.7 has the bug fix and seems to be working so far. 29.2 was released at the same time, and its notes list the issue as fixed. We have not tested 29.2 but it looks promising.

JGill
Building a reputation

28.7 Appears to correct this issue.  We have over 50% of our AP's rolled, with no new events of units dropping off the network once updated.   More important no "New" issues identified so looks to be a "stable" release 😉

ajcamacho
Comes here often

we are running 28.6.1, over 8K MR46's and no disconnections at all. 

Shawn_Kingston
Here to help

Good to hear, I'll have to try to hard reset the device and see if it will get the new static IP and firmware upgrade pushed to it. It's just dead in the water right now. 🙄Thanks. 

MStorm
New here

Just wanted to chime in on this issue we're facing as well. We have 10 Meraki AP's, 5x MR45 and 5x MR46. All running firmware version 28.6.1. We never had these issues on earlier firmware versions.

 

So far the MR46s are not showing the issue, only the MR45s. One slight difference between them is that the MR46s are connected to a non-Meraki PoE switch (Netgear to be precise). Not sure if that could have any effect but I thought I'd mention it. The MR45s are all connected to the same MS225-48LP switch on firmware 14.33.

 

So far the issue has only happened at night, no idea if that is just dumb luck or also has something to do with the root cause. Timeframe varies a bit but usually they go offline around 1:00 AM and come back online around 9:00 AM (this is UTC+1/CET timezone).

Shawn_Kingston
Here to help

Yep, the upgrade to 28.7.1 fixed it for us.
