We have Meraki MRs and MXs throughout California, and everything is working fine. We're now implementing certificate-based 802.1X authentication. Testing is successful at our remote sites, which have MXs and DIA circuits, but not at our HQ. Our HQ has a point-to-point gigabit Ethernet circuit directly to the datacenter, which is also where all of our devices tunnel to for internet egress. There is no MX at HQ, since it's a direct fiber connection to the AT&T switch, and on the other side of that 10G interface is our core switch.
That being said, we have MRs and MS switches at HQ, but no MX, and 802.1X auth is not working. We've been banging our heads against the wall trying to figure this out. Is there some kind of encapsulation that we're missing, and is that why it's not working? We can verify routes to and from the NPS server, the clients, and the MRs from all directions. We've confirmed via packet captures and logs that the attempts are there, but we're running out of things to try. Does anyone know if we need an MX at our HQ building to make this work?
The 802.1X authenticator can be the MX, MS, or MR, depending on what the client is connected to.
So no, you don't need an MX to make that work, but the authentication server has to be reachable.
We've verified routes to and from the NPS server and we have seen successful connections from our remote sites and failures from our HQ, which confirms connectivity to and from the clients, MRs, and NPS server.
What do you see in logging on the NPS?
Are you seeing an authentication request come in from clients at HQ? And if so, are they hitting the correct policy?
Also, do you have the IP addresses of the HQ Meraki APs/switches configured as RADIUS clients in the NPS configuration?
We are seeing authentication failures for HQ clients, which could have been either the clients not sending the cert or maybe the packets being malformed. We have added the IPs of the MRs as clients in NPS and that's verified as working. Also, we had a different RADIUS profile that works fine but is using a username/password combination as the auth and we want to migrate to a certificate based auth model.
From what you've described, it sounds like it's definitely not a connectivity or client <-> server trust issue.
Are you seeing the HQ clients actually hit the correct policy (i.e., matching the conditions)?
Assuming you do, what is the error in the event log? It should indicate whether it's a certificate expiration or certificate trust or other issue.
The same laptop that authenticates fine at the remote sites is having the issue at HQ. We've disabled all of the other policies and, if I'm honest, we're not even seeing anything in the event logs for failures. We've been confirming connections from within the NPS logs.
Because it works outside of HQ, I can only assume that the trust is in place otherwise.
Can you send a screenshot of the connection error you see from the NPS logs when the client attempts to authenticate from HQ?
The NPS logs are not showing any errors. Reason-Code always returns 0 for both remote sites and HQ. That being said, the failing connections are never getting an Access-Reject packet.
For the remote sites, we are seeing the Access-Request and Access-Challenge packets followed by an Access-Accept. For HQ, we are only seeing Access-Requests and Access-Challenges over and over, but never an Access-Accept or an Access-Reject. When we look in the event logs, we can see successful connections from remote sites, but never the failed attempts from HQ. We only see those in the NPS logs.
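To make the difference between the two patterns concrete, here is a minimal sketch that classifies one authentication attempt from its ordered RADIUS packet codes. The function name and the input lists are hypothetical (typed in by hand from a capture or NPS log, not parsed from one); the code numbers are the standard ones from RFC 2865 (1 = Access-Request, 2 = Access-Accept, 3 = Access-Reject, 11 = Access-Challenge).

```python
# RADIUS packet codes as defined in RFC 2865.
ACCESS_REQUEST = 1
ACCESS_ACCEPT = 2
ACCESS_REJECT = 3
ACCESS_CHALLENGE = 11


def classify_exchange(codes):
    """Classify one attempt from its ordered list of RADIUS packet codes.

    `codes` is a hypothetical, hand-collected input for illustration.
    """
    if ACCESS_ACCEPT in codes:
        return "accepted"
    if ACCESS_REJECT in codes:
        return "rejected"
    if ACCESS_CHALLENGE in codes:
        # The server kept challenging but never reached a verdict.
        return "stalled"
    if ACCESS_REQUEST in codes:
        # Requests were sent but nothing came back at all.
        return "no-response"
    return "empty"


remote_site = [1, 11, 1, 11, 1, 2]   # Request/Challenge rounds, then Accept
hq = [1, 11, 1, 11, 1, 11]           # Request/Challenge rounds that never finish

print(classify_exchange(remote_site))  # accepted
print(classify_exchange(hq))           # stalled
```

A "stalled" result means the EAP conversation started but never completed in either direction, which would fit NPS having no final verdict to write to the event log.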
At one point, we made a change to the MTU just to see if clients are seeing the change. We can see the MTU change reflected in the logs from remote sites, but not from HQ.
Here's a success from remote site
Here's a failure from HQ (over and over until a timeout I assume)
Can you see in pcaps towards your clients that the TLS session is fully formed? Is it an EAP-TLS or an EAP-PEAP with certificate inside?
If it's the latter, is the TLS tunnel actually being built, or does the handshake hang partway through?
Apologies for my lack of understanding on this topic, but I do not believe the TLS session is being formed. It's EAP-TLS.
Pcap taken from the AP within the Meraki interface, no filter expression, then filtering by EAP when viewing pcap in Wireshark. We are seeing Identity Responses, Client Hello, and TLS EAP Responses - but never a Server Hello. It reverts back to Identity Responses and repeats the cycle.
If it is pure EAP-TLS, you should indeed not see a tunnel form. But if your client sends a legacy NAK and goes for EAP-PEAP, then you have a different session type, and you have to troubleshoot whether they fail to negotiate a tunnel (TLS version, cipher suite, etc.).
However, if it is just an exchange of certificates, then you could be running into the MTU issue pjc is mentioning below, or the client may not be trusting the server cert.
Thanks for your insight, Joe.
So far, we have tried setting the MTU to multiple values between 1100 and 1400. All values have worked for remote clients, but the NPS logs show the changes to the policy only for the working remote clients, not for HQ clients. The HQ clients are not getting far enough in the process for the MTU to be applied.
I would create an NPS connection policy just for the HQ access points and add an MTU entry (try different MTU packet sizes) within that policy. I've seen issues before with RADIUS timeouts when using EAP-TLS/certs at some of our SD-WAN sites, where those sites were fragmenting the oversized packets. I ended up having to use the Meraki cloud RADIUS proxy as the middleman to get it to work at those sites.
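As a rough illustration of why an oversized RADIUS packet matters, the sketch below counts how many IPv4 packets are needed to carry one UDP datagram at a given path MTU. The function name, payload sizes, and MTU values are assumptions for illustration, not measurements from any site in this thread. It models IP-level fragmentation on the wire, which is separate from the EAP-level fragmentation that the MTU setting in NPS is meant to control.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def ipv4_fragments(udp_payload_len, path_mtu):
    """Return how many IPv4 packets carry one UDP datagram at this path MTU."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (path_mtu - IPV4_HEADER) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("path MTU too small for IPv4")
    return math.ceil((udp_payload_len + UDP_HEADER) / per_fragment)


# Hypothetical sizes: a RADIUS packet carrying a large certificate chunk.
print(ipv4_fragments(1472, 1500))  # 1 (fits exactly in a 1500-byte MTU)
print(ipv4_fragments(1800, 1500))  # 2 (must be fragmented)
print(ipv4_fragments(1800, 1400))  # 2
```

If anything on the path drops or fails to reassemble IP fragments, any packet that needs more than one fragment never arrives whole, which would look like a repeating retry cycle from the client's side.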
We have been adjusting the MTU between 1100-1400, but unfortunately the failing HQ clients aren't seeing any of the changes we make to the Network Policy - it seems they are not getting past the Connection Request Policy. Thank you for the suggestion, we'll take a look at the radius proxy.