SSID 802.1x issues - Messages Discarded for no apparent reason.

coleslaw
Conversationalist

I'm scratching my head over a recent issue that appeared while upgrading one of my NPS servers.

 

I had removed one of the servers from the configuration to allow myself some time to finish the config. Once I was finished and reintroduced the new server, I started seeing loads of "message discarded" events.

Of course I suspected the new server, but looking through the logs in Elastic I could see that requests for that site started failing on both machines.

 

The server has the same name, same config, and same IP as before, and runs the same OS as the secondary node. There were no changes to the VPN path etc. (I started suspecting MTU issues, but that seems unlikely.) The config that is applied hasn't been changed in years and everything has just worked, so I'm starting to suspect changes to the dashboard or firmware?

 

I tried changing back to the "old view" for the SSID configuration and applying the same settings but errors kept piling in.

 

Is this something that will self-heal? Is the configuration just slow to propagate through my network?

 

Has anyone seen this before?


7 Replies
alemabrahao
Kind of a big deal

Have you checked your NPS logs? They usually indicate the problem.

Generally, the problem can be due to certificates, policies, etc.

 

 

https://directaccess.richardhicks.com/2022/08/08/always-on-vpn-nps-auditing-and-logging/

I am not a Cisco Meraki employee. My suggestions are based on documentation of Meraki best practices and day-to-day experience.

Please, if this post was useful, leave your kudos and mark it as solved.
coleslaw
Conversationalist

Yes, those are the logs I'm checking. NPS basically just says: "The RADIUS Request message that Network Policy Server received from the network access server was malformed."

 

Looking at the message everything looks normal. Called Station ID is okay, correct policy is hit.

 

I checked just now, and the strange thing is that sometimes I can see that access is granted, even though the request itself looks exactly the same. I've tried re-applying the config several times and even updated the client secret just to make sure. I just can't find the cause.

alemabrahao
Kind of a big deal

Take a packet capture on the upstream switch or the AP VLAN and look for RADIUS packets whose Length field differs from the length actually captured. Check how Wireshark decodes the packets that NPS flags as malformed, and check for UDP fragmentation, which is common with EAP methods whose messages exceed the MTU.
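To make the length check concrete, here is a minimal Python sketch that applies the RFC 2865 framing rules to one RADIUS UDP payload. It assumes you have already exported the raw UDP payload bytes from your capture; the function name `check_radius_payload` is made up for illustration and is not part of any Meraki or NPS tooling.

```python
import struct

RADIUS_HEADER_LEN = 20   # code(1) + identifier(1) + length(2) + authenticator(16)
RADIUS_MAX_LEN = 4096    # maximum packet length allowed by RFC 2865


def check_radius_payload(payload: bytes) -> list[str]:
    """Return a list of structural problems found in one RADIUS UDP payload."""
    if len(payload) < RADIUS_HEADER_LEN:
        return [f"captured {len(payload)} bytes, shorter than the 20-byte header"]

    problems = []
    _code, _identifier, declared = struct.unpack("!BBH", payload[:4])
    if declared < RADIUS_HEADER_LEN or declared > RADIUS_MAX_LEN:
        problems.append(f"declared length {declared} is outside 20..4096")
        return problems
    if declared > len(payload):
        # A receiver must silently discard a packet shorter than its Length field.
        problems.append(
            f"declared length {declared} exceeds captured {len(payload)} bytes"
        )

    # Walk the attribute list (type, length, value) inside the usable bytes.
    end = min(declared, len(payload))
    offset = RADIUS_HEADER_LEN
    while offset < end:
        if end - offset < 2:
            problems.append(f"dangling attribute byte at offset {offset}")
            break
        attr_type, attr_len = payload[offset], payload[offset + 1]
        if attr_len < 2:
            problems.append(f"attribute {attr_type} at offset {offset} has length {attr_len}")
            break
        if offset + attr_len > end:
            problems.append(f"attribute {attr_type} at offset {offset} overruns the packet")
            break
        offset += attr_len
    return problems
```

An empty list means the framing is consistent. A "declared length exceeds captured" result on requests that NPS discards would point towards truncation or a lost fragment rather than a policy problem.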

Even a VPN path that hasn't changed can still experience fragmentation if the new NPS server has a slightly different TLS handshake length.
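As a rough illustration of why a few extra bytes can matter, the sketch below estimates how many IPv4 fragments a RADIUS packet needs for a given path MTU. It assumes plain IPv4 with no IP options and an 8-byte UDP header; the MTU values you feed in are your own assumptions, since the real tunnel overhead on the path to the vMXs is not known from this thread.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def fragments_needed(radius_len: int, path_mtu: int) -> int:
    """Estimate the number of IPv4 fragments for a UDP datagram carrying radius_len bytes."""
    datagram = UDP_HEADER + radius_len
    if datagram <= path_mtu - IPV4_HEADER:
        return 1  # fits in a single IP packet, no fragmentation
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (path_mtu - IPV4_HEADER) // 8 * 8
    return -(-datagram // per_fragment)  # ceiling division
```

For example, `fragments_needed(1472, 1500)` gives 1 while `fragments_needed(1473, 1500)` gives 2, so a single extra byte is enough to tip a packet into fragmentation, and any device on the path that drops fragments would then cause incomplete requests.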

RWelch
Kind of a big deal

Not sure if this might be of help to you:

RADIUS Issue Resolution Guide 

If you found this post helpful, please give it Kudos. If my answer solves your problem please click Accept as Solution so others can benefit from it.
BlakeRichardson
Kind of a big deal

If both servers are seeing the same message, it's unlikely to be your server config. Are you using a RADIUS proxy, or have you configured each AP in your NPS config?

Is there any firewall between your APs and the NPS servers?

coleslaw
Conversationalist

I'm not using any proxy. I have the RADIUS clients configured in NPS by network, for the management LAN. I created a new NPS secret yesterday just to make sure there was no whitespace or special characters etc., but no change.

 

We have a VPN to 2 VMXs in an Azure WAN with secured hubs. But this setup has worked without issues for at least 3 years.

 

When I got in this morning the whole thing got a lot stranger. I had updated 2 sites yesterday, and this morning I updated 2 more. Now all of a sudden the issue has shifted to the NPS server that was working flawlessly yesterday.

 

So now my new server is granting access like nobody's business but the other one is discarding. I'll have to do some kind of packet capture deep dive, but now I'm 90% certain something fishy is going on in my Meraki dashboard.

PhilipDAth
Kind of a big deal

Make sure the RADIUS keys have been restored correctly.
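One way to test a key offline, sketched here in Python, is to recompute the Message-Authenticator (attribute 80) of a captured Access-Request with the secret you believe is configured. This follows the HMAC-MD5 definition in RFC 2869/3579; the function name `message_authenticator_ok` is hypothetical, and it assumes you can export the raw RADIUS payload of one request from a capture.

```python
import hashlib
import hmac

HEADER_LEN = 20          # code, identifier, length, 16-byte Request Authenticator
MESSAGE_AUTHENTICATOR = 80


def message_authenticator_ok(packet: bytes, secret: bytes) -> bool:
    """Check attribute 80 of an Access-Request against a candidate shared secret."""
    if len(packet) < HEADER_LEN:
        return False
    length = int.from_bytes(packet[2:4], "big")
    packet = packet[:length]  # bytes beyond the Length field are padding
    offset = HEADER_LEN
    while offset + 2 <= len(packet):
        attr_type, attr_len = packet[offset], packet[offset + 1]
        if attr_len < 2 or offset + attr_len > len(packet):
            return False  # broken attribute framing
        if attr_type == MESSAGE_AUTHENTICATOR and attr_len == 18:
            received = packet[offset + 2 : offset + 18]
            # The HMAC is computed over the whole packet with this value zeroed.
            zeroed = packet[: offset + 2] + bytes(16) + packet[offset + 18 :]
            expected = hmac.new(secret, zeroed, hashlib.md5).digest()
            return hmac.compare_digest(received, expected)
        offset += attr_len
    return False  # no Message-Authenticator attribute present
```

If this returns `True` for requests from one site and `False` for another using the same secret, the key pushed to those APs probably differs from the one on the NPS server, for instance through a stray trailing space.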
