Hi everyone, my company has a bunch of branch locations all over the country. Most of the branches aren't very big; many have just 1 to 3 people. So each one just has a Meraki SD-WAN device that VPNs it into our internal network at HQ. We have a Cisco ISE at HQ that handles our 802.1X authentication, and all these branch locations use that ISE as well. Most of the time it works fine, but sometimes the ISE logs error messages saying the supplicant stopped responding. I saw this happening at a larger branch that has 2 internet providers, so I went into traffic shaping and sent all traffic destined for the ISE over the other ISP, and clients immediately authenticated without issues. I then moved the traffic back to the first ISP and it started failing again. It seems that when the issues are happening, it's because the ISP is too slow on some days. We don't always have the best speeds at the locations with just a couple of people, and like I said, most of the time it works; but when it doesn't, we get a lot of complaints and of course the employees there can't work.
I'm starting to think it might be good to not have the authentication go over the VPN because of this. However, we currently authenticate with certificates (no username/password), so I can't just add the users to the network and use Meraki authentication. Have any of you had this issue? Is there any other way that we can make 802.1X authentication work a little better in these scenarios? Is it at all possible to make the supplicant of the 802.1X authentication be a device that is inside our internal network? Maybe there is a configuration that would help but that I don't know about. Just looking for any suggestions or experience that someone can give. Thanks in advance!
From my point of view, it'd be rather odd for the link to be too slow to handle authentications. Whenever I had to deal with certificate-based 802.1X, it came down to fragmentation / MTU issues.
Perhaps that's another topic to consider if you're willing to discuss.
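To put rough numbers on the fragmentation idea, here is a minimal Python sketch that estimates how big the IP packet gets when a RADIUS Access-Request carries one EAP-TLS fragment. The header sizes (IPv4 without options, UDP, RADIUS header, 253-byte attribute value limit, 18-byte Message-Authenticator) come from the protocol definitions; the 100 bytes for "other attributes" (User-Name, NAS identifiers and so on) and the example fragment sizes are assumptions, and any VPN encapsulation overhead would have to be subtracted from the path MTU you pass in.

```python
import math

IPV4_HEADER = 20            # IPv4 header without options
UDP_HEADER = 8
RADIUS_HEADER = 20          # code, identifier, length, 16-byte authenticator
ATTR_HEADER = 2             # type + length byte of every RADIUS attribute
ATTR_MAX_VALUE = 253        # max value bytes per attribute (255 - 2)
MESSAGE_AUTHENTICATOR = 18  # attribute required alongside EAP-Message


def radius_ip_packet_size(eap_len, other_attrs=100):
    """Estimate the IPv4 packet size of a RADIUS request carrying eap_len EAP bytes.

    other_attrs is an assumed allowance for User-Name, NAS-IP-Address, etc.
    """
    # The EAP packet is split across as many EAP-Message attributes as needed.
    chunks = math.ceil(eap_len / ATTR_MAX_VALUE)
    eap_attrs = eap_len + chunks * ATTR_HEADER
    return (IPV4_HEADER + UDP_HEADER + RADIUS_HEADER
            + eap_attrs + MESSAGE_AUTHENTICATOR + other_attrs)


def needs_fragmentation(eap_len, path_mtu, other_attrs=100):
    """True if the estimated packet would not fit in path_mtu unfragmented."""
    return radius_ip_packet_size(eap_len, other_attrs) > path_mtu
```

With these assumptions, a 1400-byte EAP fragment already exceeds a plain 1500-byte MTU once the RADIUS wrapping is added, while a 1000-byte fragment still fits within a 1400-byte path. That is why only the certificate exchange tends to break while small packets get through.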
I'm definitely willing to discuss anything that could be the cause. If what you are saying is true then it would be the ISP fragmenting it, right? Because if it works sometimes then it wouldn't be my devices fragmenting it. We will go months without issues and then randomly 3 branches won't be able to authenticate. What could I even do against that?
Because we are a MS/Office/Exchange/OneDrive/Azure based organisation (and network hardware is not always the same at partner sites), we have deprecated the use of all local servers and storage. Virtually all communication is "up-and-down", rather than laterally; this is remarkably liberating, and simplifies many previously involved use cases. I can strongly recommend it, at least on a trial basis.
One of the tools we are investigating is the YubiKey, which facilitates authentication, including 802.1X. I'd say it is worth a look, https://www.yubico.com/ , they are used by an interesting mixture of organisations including Microsoft, CERN, Google, UK Government, US State Government and Novartis, amongst others. 802.1X may require some stuffing around, depending upon how it is implemented.
Seems it might be a fragmentation/MTU issue. But I think the ISP would still be to blame for that, unless there is something that I can do on my end to reassemble a fragmented packet before it reaches the Radius server.
If you suspect fragmentation, it would be worth checking that you are maximising packet size (outgoing), from the client device, through the LAN/VLAN and out the gateway appliance/modem. If the problem appears to be ISP-related, it is possible that the packet size used to return handshaking packets is determined by the size of those sent out when the handshaking is initiated.
Sorting this out can get tedious, especially as different manufacturers/brands define packet sizes/contents differently.
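One way to take some of the tedium out of the size check is to search for the largest packet that crosses the path without being fragmented. The sketch below is only an illustration: `largest_passing_payload` is a plain binary search over any yes/no probe, and `linux_ping_probe` assumes a Linux host with iputils `ping` (`-M do` sets the don't-fragment bit, `-s` sets the ICMP payload size). Both function names are made up for this example, and the target host has to answer ICMP for the result to mean anything.

```python
import subprocess

ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def largest_passing_payload(probe, low=0, high=1472):
    """Binary-search the largest size in [low, high] for which probe(size) is True.

    Assumes that if a size passes, every smaller size passes too.
    Returns None when even the smallest size fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


def linux_ping_probe(host):
    """Build a probe that sends one ping with the don't-fragment bit set (Linux iputils)."""
    def probe(size):
        result = subprocess.run(
            ["ping", "-M", "do", "-c", "1", "-W", "1", "-s", str(size), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
    return probe
```

Running `largest_passing_payload(linux_ping_probe(host))` from a branch client towards the RADIUS server and adding `ICMP_OVERHEAD` to the result gives an estimate of the path MTU over the VPN. Repeating it on a good day and a bad day would show whether the usable packet size actually changes with the ISP.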
Why don't you define a performance class and let the MX choose the best ISP automatically? There is an example for Web traffic.
What is actually doing the dot1x? A switch behind the MX, an MX, an MR, something else?