Hi,
I've researched this issue and so far haven't found a similar case, so I've signed up to ask for some advice.
I have an MR36 running 27.7.1 connected to an MS250-48LP, running in test as a PoC for a network-wide rollout.
I have a Guest SSID distributing its own DHCP, and a Production SSID configured to use the local LAN DHCP server.
On the Production SSID, I'm getting the DHCP error in the subject. However, clients are getting a valid IP address from the server, with the correct gateway etc., and a lease is visible in the DHCP lease list on the Windows server. Despite this, the client claims there is no internet access.
The MS250 is currently configured to block rogue DHCP servers but specifically allow the DHCP server in question. It's worth noting that client devices connected to the switch via copper are getting addresses and connecting out without issue. There is no L3 configured on this network.
Any suggestions of where to look would be a great help, thanks in advance.
If you run a packet capture on the wired interface of the AP on which the client is connected, do you see the full DORA exchange as a client tries to associate with the network?
It's been a while since my Windows days, but I assume it should only populate the lease table after the Ack back from the endpoint?
On the client itself, do you see the entire IP assignment configured as expected, IP, mask, gateway, VLAN?
Does the communication only not work if going towards the internet?
Are you able to ping the gateway?
Could you have some L3 block, possibly a missing return route upstream of the MS?
Are the wired clients on the same VLAN as the wireless clients?
Ughhhhh.. meant after the ACK is sent by the server, not the client.
I would start by upgrading to the stable firmware version. You might be fighting issues that no longer exist.
You are a major version out of date.
Is your client device set to pick up DNS using DHCP as well, or is it manually assigned? If you try manually setting DNS servers, does the error go away?
Thanks for the suggestions guys, I'm out of office this week until Friday, so will pick up with your ideas and report back then
Have you checked the L3 rule under Wireless > Firewall and Traffic shaping? I have seen that rule block traffic going to the servers on the LAN. Change that deny to allow and see if that fixes it.
Hi guys, I've managed to put some time aside for this. To answer some of the questions above: there is no Layer 3 happening on this network, and currently everything is on the default VLAN 1. The Local LAN rule is set to allow. The client gets a valid address but can't ping the gateway, let alone the internet. There are numerous other APs with the same config on the same firmware functioning in other businesses under the same corporate ownership. The following packet capture shows the DHCP exchange from the server at x.x.x.2:
--- Start Of Stream ---
reading from file /tmp/click_pcap_dump, link-type EN10MB (Ethernet), snapshot length 9600
14:06:47.108066 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 307
14:06:48.607392 ARP, Request who-has x.x.x.87 tell x.x.x.2, length 46
14:06:50.094866 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 313
14:06:50.121799 ARP, Request who-has x.x.x.2 tell x.x.x.222, length 46
14:06:50.122399 ARP, Reply x.x.x.2 is-at xx:xx:xx:xx:xx:xx, length 46
14:06:50.923239 ARP, Request who-has x.x.x.196 tell x.x.x.2, length 46
14:06:58.177514 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 307
14:06:58.180722 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 307
14:06:58.181458 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 307
14:07:01.092245 ARP, Request who-has x.x.x.218 tell x.x.x.2, length 46
14:07:02.023750 ARP, Request who-has x.x.x.205 tell x.x.x.2, length 46
14:07:02.625115 ARP, Request who-has x.x.x.2 tell x.x.x.187, length 46
14:07:02.641458 ARP, Request who-has x.x.x.208 tell x.x.x.2, length 46
14:07:02.643704 ARP, Request who-has x.x.x.187 tell x.x.x.2, length 46
14:07:02.908839 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 307
14:07:06.860131 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 313
14:07:06.886681 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 307
14:07:08.944501 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 307
14:07:09.637136 ARP, Request who-has x.x.x.98 tell x.x.x.2, length 46
14:07:12.352409 ARP, Request who-has x.x.x.125 tell x.x.x.2, length 46
14:07:12.364550 ARP, Request who-has x.x.x.128 tell x.x.x.2, length 46
14:07:15.237527 ARP, Request who-has x.x.x.2 (xx:xx:xx:xx:xx:xx) tell x.x.x.222, length 46
14:07:15.238167 ARP, Reply x.x.x.2 is-at xx:xx:xx:xx:xx:xx, length 46
--- End Of Stream ---
Any thoughts!? Thanks in advance....
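For anyone wanting to sanity-check a capture like the one above, here is a rough Python sketch that counts BOOTP requests and replies in tcpdump-style text output. In BOOTP terms, Discover and Request are both sent by the client as "Request" and Offer and Ack are both sent by the server as "Reply", so a capture with only replies means the client half of the DORA exchange isn't visible at that point (or a capture filter hid it). The function name `summarize_dhcp` and the three-line sample are just illustrative, not anything from Meraki.

```python
import re
from collections import Counter

# Matches tcpdump one-line summaries such as:
# "14:06:47.108066 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 307"
# ARP lines ("ARP, Request who-has ...") are deliberately not matched.
DHCP_LINE = re.compile(r"BOOTP/DHCP, (Request|Reply)")


def summarize_dhcp(capture_text):
    """Count client-side (Request) and server-side (Reply) BOOTP/DHCP lines."""
    counts = Counter({"Request": 0, "Reply": 0})
    for line in capture_text.splitlines():
        match = DHCP_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample: three lines copied from the capture in this thread.
sample = """\
14:06:47.108066 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 307
14:06:48.607392 ARP, Request who-has x.x.x.87 tell x.x.x.2, length 46
14:06:50.094866 IP x.x.x.2.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 313
"""

counts = summarize_dhcp(sample)
print("requests:", counts["Request"], "replies:", counts["Reply"])
if counts["Reply"] and not counts["Request"]:
    print("Only server replies seen; client Discover/Request frames are missing from this view")
```

If the client-side frames never show up on the AP's wired interface, that at least narrows down where to capture next (wireless side versus switch side).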
Hi Martyn,
Did you manage to fix this? We have a similar issue...
Thanks
Mark
Sadly not, no! Still hoping someone discovers this and knows the solution! Comforting to know I'm not alone however!
I have a similar issue where an MS-410 acting as a DHCP server does not fulfil a DHCP request. Running Wireshark on both client and switch, I can see the DHCP Discover but never the Offer coming through from the switch.
I have spoken to Meraki support and they recommend rebooting the switch to see if this resolves the issue. Not ideal, as I would much prefer to get a root cause for this issue.
I'm not convinced that this is a DHCP server issue, as we are seeing this on Corporate and Guest SSIDs both of which use different DHCP servers - Corporate a Windows server and Guest DHCP is running on a Firewall.
In for a penny... I just rebooted my switch, as it isn't technically in a production environment yet, so there was almost no impact on anyone beyond myself. Rebooting the switch obviously caused a reboot of the AP, and I'm still seeing the same issue on the SSID distributing DHCP from the domain server. The laptop does have a valid address. I've released it, reserved it and then renewed to force the server to issue a new address, but nada: it can't ping the gateway or anything else. The other SSID works fine using the AP's own DHCP service, and devices can browse out through that without issue.
I have access to a "Health" page under Wireless > Monitor that offers a lot of information regarding individual client experience. Is this something different?
Yes splash portal is an add-on.
Thanks
Mark
Hi,
If the wireless config was the issue, you would see it on the dashboard when looking at the client page. Assuming that your setup is correct, and considering that the SSID works elsewhere in the business, I would look at the DHCP/RADIUS servers. Are the users from the affected site in the correct groups, where do they authenticate, and what happens after they authenticate? It would also help to see ipconfig /all from an affected user, and to know whether you can ping the DNS server from the affected user and/or the switch.
Also test by setting up the SSID not to use any authentication, then have a user connect and test (Access Control > Network Access > Open, no encryption).
But before everything, please make sure your firmware is up to date on both the MRs and MSs.
Hope this read will not be a waste of time lol
Regards
Phil
I'm getting this same error for multiple clients.
This appears to only be happening on SSIDs where client IP assignment is configured to Local LAN. I'm wondering if this is simply a software bug caused by having bridge mode enabled.
Outside of the error message, clients do connect without issues.
We have just changed from L3 roaming to L2 Bridge mode and we are seeing this in both modes. We did have 2 DHCP servers configured, we have recently removed the second one and the errors have reduced but not gone away completely.
Hi guys, I've been away from the office a lot recently, however I set the APs to update to the latest firmware on Monday and have just got back into the office for testing today.
Switches and APs are all now running the latest firmware. When I said the same setup was running in the other businesses we support, I was referring to the firmware version at the time and the SSID configuration. The SSID I'm having an issue with isn't functioning elsewhere in the organisation; it is, however, identically configured to others that are functioning as they should.
There is no Layer 3 config on this network, nor is there any RADIUS. The SSID is set up to use a PSK in Bridge Mode with a single Windows 2008R2 DC serving DHCP. I have tried your suggestion and mirrored this setup in a test SSID with open access (no PSK) and have the same issue:
Client made a request to the DHCP server, but it did not respond.request_ip='x.x.x.219' request_server='x.x.x.2' details='no_dhcp_response' radio='1' vap='2' channel='108' rssi='50'
However, the client has a valid IP address, DNS server entries and gateway ip as well as the correct DNS Suffix, yet can't ping any of the above
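When comparing a lot of these events, it can help to split the key='value' part of the message into fields. Below is a minimal Python sketch that does this for the event text quoted above. It assumes the dashboard keeps using the `key='value'` layout shown here, which I can't confirm beyond the examples in this thread; `parse_event_details` is just a name I made up.

```python
import re

# Pulls key='value' pairs out of an event-log detail string.
PAIR = re.compile(r"(\w+)='([^']*)'")


def parse_event_details(line):
    """Return the key='value' pairs found in the line as a dict."""
    return dict(PAIR.findall(line))


# Event text as pasted in this thread (addresses already redacted).
line = (
    "Client made a request to the DHCP server, but it did not respond."
    "request_ip='x.x.x.219' request_server='x.x.x.2' details='no_dhcp_response' "
    "radio='1' vap='2' channel='108' rssi='50'"
)

fields = parse_event_details(line)
print(fields["details"], fields["request_server"], fields["vap"])
```

Grouping the parsed events by `vap`, `radio` or `request_server` is one quick way to see whether the failures cluster on a single SSID or DHCP server.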
How do I go about raising a ticket with support?
Hi guys, I do not see any solution in this thread, and Meraki support doesn't seem to know about this issue either.
A packet trace should give some answers to all our questions, but the issue doesn't happen often or consistently.
We are running an enterprise solution with 2 forwarded DHCP servers in our datacenter, using RADIUS authentication and no firewall rules, and we only see the issue on one specific SSID. We never had the issue in the last 3 years. Both DHCP/DNS servers are up and running and are the most important servers in our environment, so they are 100% up.
Is there an article with the solution?
Hi guys,
We have been suffering with this issue for more than ten months. We have three separate environments in various locations, each running the Meraki solution, all connected to the same RADIUS server via LAN, and each using a different brand of DHCP server and routers.
This problem has only affected one site for a prolonged time; it happened only once at another location, and only for 1 day.
This issue affects users in a completely random manner. No specific pattern in device type or NIC make was detected.
We have 1500+ users in the environment with more than 15 different devices, NICs and drivers.
Following a thorough study and checks, we have discovered the following:
The workaround we have so far is to move the problematic machine to another AP (roam) and connect again, which succeeds. The machine can then return to the original location while still connected, and even reconnecting there will be successful.
The device connects and completes the DORA when roamed.
Please let us know if this was useful and if anyone had a solution for this problem.
Support from Meraki was of no assistance at all.
A big thank you!
Regards,
Waseem
Hey folks,
This might help in only a small % of cases, but it's the same error:
Client made a request to the DHCP server, but it did not respond.
type='NO DHCP response' associated='true' radio='0' vap='0'
The MX was removed, there was a modem swap, and the techs suspected the new modem (GW mode) was the issue. Rebooting did not resolve it and the firmware wasn't upgraded, but on checking, it turned out to be related to VLAN tagging. In this type of change scenario, everything works fine after simply disabling that.
_________
Check from:
SSIDs |
VLAN tag
This column refers to VLANs set in Bridge or L3 roaming. For info about VLANs used for VPN, see the VPN column below.
Then Go to:
Scroll down to: Client IP and VLAN
Disable VLAN tagging
(VLAN tagging is used to direct traffic to specific VLANs. To use VLAN tagging, all Meraki APs functioning as gateways in this network must be connected to switches that support IEEE 802.1Q.
The gateways must be connected to switch ports that are configured to accept 802.1Q tagged Ethernet frames (such ports are sometimes called "trunk ports").
If you are unsure, don't enable this feature.)
Hope this helps someone at least. 🙂
@MartynB, are you still facing this issue or is it resolved? I am facing the same issue, but it's reported by Mac devices only.
Client made a request to the DHCP server, but the DHCP server rejected the client's request.server_ip='1X.1X.1X.X9' request_server='unknown' details='dhcp_nack' radio='1' vap='0' channel='36' rssi='8'
Same issue. Any suggestions?
Have you tried removing VLAN tagging from the SSID in situations where the AP's wired interface is in the same VLAN as the one you want to tag? In my case I was using VLAN 700 as the LAN and 34 for guest. The AP pulled an IP for itself from 700 as I'd configured on the switch side (trunk native VLAN 700, tagged 34 and 700), but it didn't like having the VLAN tagging set for VLAN 700 in the SSID as well. Removing this fixed the issue right away.
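To make that check repeatable across several SSIDs, here is a small Python sketch that flags an SSID whose VLAN tag matches the native VLAN of the trunk port the AP sits on. It only models the scenario described in this post (native 700, tagged 34 and 700); `TrunkPort`, `Ssid` and `check_ssid` are made-up names, and whether tagging the native VLAN actually breaks things on a given switch/AP combination is an assumption based on this poster's experience, not documented behaviour.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TrunkPort:
    native_vlan: int
    tagged_vlans: FrozenSet[int]


@dataclass(frozen=True)
class Ssid:
    name: str
    vlan_tag: Optional[int]  # None means VLAN tagging is disabled on the SSID


def check_ssid(port: TrunkPort, ssid: Ssid) -> str:
    """Classify an SSID's VLAN tag against the AP's trunk port settings."""
    if ssid.vlan_tag is None:
        return f"{ssid.name}: ok, untagged traffic lands in native VLAN {port.native_vlan}"
    if ssid.vlan_tag == port.native_vlan:
        # The case reported in this thread: SSID tag equals the native VLAN.
        return f"{ssid.name}: suspect, SSID tags native VLAN {port.native_vlan}; try disabling the tag"
    if ssid.vlan_tag not in port.tagged_vlans:
        return f"{ssid.name}: problem, VLAN {ssid.vlan_tag} is not carried on the trunk"
    return f"{ssid.name}: ok, VLAN {ssid.vlan_tag} is tagged on the trunk"


# Values taken from the post above: native VLAN 700, tagged 34 and 700.
port = TrunkPort(native_vlan=700, tagged_vlans=frozenset({34, 700}))
for ssid in (Ssid("LAN", 700), Ssid("Guest", 34), Ssid("LAN-untagged", None)):
    print(check_ssid(port, ssid))
```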
Good afternoon. Firstly, I'm not trying to be patronising or insult your intelligence, I just want to make sure I'm not misunderstanding or assuming anything. The switchport that the AP is connected to: is it set to trunk or access, and which VLANs is it allowed to pass? I think you make reference to it using VLAN 1, so I'm going to assume that the AP and the client are on the same subnet. Let's just say it's 192.168.1.0/24, and the default gateway is 192.168.1.1. I'm assuming that the DHCP server itself is on the same subnet, as you state that it is a local DHCP server, so let's just say that it's got an IP of 192.168.1.2. If the AP is on the same subnet, it is getting an IP of 192.168.1.3. (This assumes that the client and the AP are using the same VLAN/subnet. If not, the port would need to be a trunk allowing all or specific VLANs. You said there was no L3 going on, but is that in reference to the wireless config?) You state that the client gets an IP, so let's say the client gets the IP 192.168.1.4. You said that it can't ping the DG (which I've said is 192.168.1.1 in my example). I assume this is a router or firewall or something else. I'm also going to assume that the DNS servers are NOT on the same subnet as everything else, so let's set that to 192.168.2.2.
Some pings would assist. Can you ping the client from the DHCP server, or from the AP? Can you ping the DHCP or AP from the client? The 'No Internet' indicates that you can't route out to the internet, yet I'm assuming that you can from the DHCP server. Does the Default Gateway which is assigned to the client have any ACLs which restrict access out to the internet or any destination? I assume if you perform a traceroute out to 8.8.8.8 you get no response from any hops along the path from the client endpoint?
Glad to find this post but sad to see no resolution. The same thing is happening on our Meraki APs. The APs are set up with VLAN tagging because we broadcast SSIDs for multiple subnets for public, guest and corporate networks, each with different DHCP servers. All do the same thing, and we only noticed it this year since school started and hundreds of kids and teachers are on the network. Many report no Wi-Fi connection, just to report back 15 minutes later that they are connected. On the wireless, DORA is sporadic. Some clients take right off; others time out and take a few tries to get a successful IP.
My packet captures from a client, when the problem exists, show many DISCOVERs that continue until it either times out or finally gets an OFFER. I worked with Meraki support and they seem to think it's something in our switching. However, I can hardwire the same client and DORA is instant every time. Besides, no wired devices have the problem. I'm on the latest firmware and also tried an old firmware, but the problem is still there.
Very upsetting to not have an answer when Meraki has been so rock solid over the 10+ years we've had them. I cannot all of a sudden have 4 non-responsive DHCP servers on 4 different subnets doing the same thing for wireless clients only. The common denominator is the APs.
If I find a solution I'll post here and hopefully others will do the same.
Hey, Maineiac15, did you find a fix for this issue? I am seeing the same issue at my school as well.
Unfortunately, no. We just have a band-aid in place. Support, as well as a hired engineering company, couldn't figure it out. We got tired of the finger pointing and getting nowhere, so we implemented a temporary fix. We placed a DHCP server right on the VLAN for our student Chromebooks, because that is the network with the largest number of devices, which of course causes the most complaints. That has solved it for that one VLAN.
Of course this doesn't solve it for all the other vlans but it solves 90% of the complaints. 😞
Let us think about this in a different way: perhaps the issue isn't from Meraki, and maybe it's a Windows/NIC-related issue.
I have seen multiple posts regarding Intel's new AX wireless NICs that only have DHCP issues on RADIUS networks.
In my environment, we have AC8xxx / AC9xxx / AX201 / AX210.
We had a machine in hand that was unable to obtain an IP. We tried connecting with an external USB TP-Link adapter and it worked instantly, on the same SSID and the same AP.
After unplugging the USB adapter and going back to the internal NIC, the machine was again unable to obtain an IP address.
Everyone, please share the NICs used in your environment. Are they all Intel, and what model?
Very good thought WaseemD. The only problem with my issue is it's happening with our Chromebooks which only use PSK as well as our HP and Dell laptops which do use Radius. These same devices didn't have this issue a year ago.
Figured it out. We use the Meraki Layer 3 firewall to "allow" specific IPs such as the RADIUS servers, DNS and gateway, and "deny" any by default.
The error in our configuration was that the broadcast IP 255.255.255.255/32 was not allowed, nor were ports 67/68.
Adding these to the allow list in the firewall policy will solve the issue.
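To illustrate why the missing broadcast rule bites, here is a rough Python sketch of a top-down, first-match rule list like the one described above. It is not Meraki's implementation: the `Rule` type, the `evaluate` function, the fall-through default and the 10.0.0.10 "RADIUS/DNS" address are all hypothetical and exist only for the example.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    policy: str          # "allow" or "deny"
    protocol: str        # "udp", "tcp" or "any"
    dest: str            # destination CIDR, or "any"
    port: Optional[int]  # destination port, None means any port


def evaluate(rules, protocol, dest_ip, port):
    """Return the policy of the first rule that matches the packet."""
    addr = ipaddress.ip_address(dest_ip)
    for rule in rules:
        if rule.protocol not in ("any", protocol):
            continue
        if rule.dest != "any" and addr not in ipaddress.ip_network(rule.dest):
            continue
        if rule.port is not None and rule.port != port:
            continue
        return rule.policy
    return "allow"  # assumption: traffic matching no rule is allowed


# Hypothetical "allow specific servers, deny everything else" policy.
before = [
    Rule("allow", "any", "10.0.0.10/32", None),  # made-up RADIUS/DNS server
    Rule("deny", "any", "any", None),
]

# Same policy with the DHCP broadcast entries added on top.
after = [
    Rule("allow", "udp", "255.255.255.255/32", 67),
    Rule("allow", "udp", "255.255.255.255/32", 68),
] + before

# A DHCP Discover is sent to the broadcast address on UDP port 67.
print(evaluate(before, "udp", "255.255.255.255", 67))
print(evaluate(after, "udp", "255.255.255.255", 67))
```

With the deny-by-default list, the Discover falls through to the final deny; with the two broadcast entries on top, it matches an allow first.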
If we go back and check, Meraki's recommendation is to "Deny" any unwanted subnets and keep the default "Allow".
But this does not work for me, as we also need to block public IPs before the 802.1X authorization response.
Thanks
We're experiencing this issue, but it seems to be on the 9164 APs we have in our office area. I am not sure if the issue is isolated solely to that model of AP or if it's because we do not have *that* many wireless clients on the older APs. One question I have: are any of you running NX-OS? I was looking at the interface from the IDF switch to the Nexus switch, and there is a noticeable number of input and output drops as well as giant frames on those interfaces.
Our clients can reach the SVI (nexus), but cannot ping to the firewall when the issue is happening, so I have to think there's something happening. The Nexus interface as well as 9300 interface both have the default MTU, and was thinking that maybe the MTU size needs to be increased.
Well we resolved the issue. We have an ASA in our environment with basic threat detection enabled. When doing a packet tracer on the ASA the client experiencing the issue was Shunned in Phase 3. When looking at the policy we have excluded subnets from the threat detection with auto recovery after 3600 seconds (1 hour). Once we added that in the policy we have not had the issue occur again.
Hey guys, I just saw a notification for an update on this thread, and thought I ought to come back and at least share my own findings since I started it!
The network I was working in whilst encountering this issue had a very unusual IP address schema implemented by its creator many years ago, this being 6.x.x.x.
Turns out, this is the IP address schema that Meraki are using on their back end for the communication protocol between APs and switches, and therefore they can't handle it as the LAN schema.
No resolution in this case, but we are almost done migrating this network to fresh infrastructure with a corporate-aligned address schema.
Not that I imagine this helps anyone else, sorry!
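In case anyone else wants to rule this out quickly, here is a minimal Python sketch that checks whether a LAN subnet overlaps 6.0.0.0/8. The reserved range is an assumption taken purely from the finding above (the whole 6.x.x.x block), not from any vendor documentation, and `conflicts_with_reserved` is just an illustrative name.

```python
import ipaddress

# Assumption from this thread: the 6.x.x.x range is used internally by the
# vendor, so it is treated as reserved here. Not confirmed by documentation.
RESERVED = ipaddress.ip_network("6.0.0.0/8")


def conflicts_with_reserved(lan_cidr):
    """Return True if the given LAN subnet overlaps the reserved range."""
    lan = ipaddress.ip_network(lan_cidr, strict=False)
    return lan.overlaps(RESERVED)


# Hypothetical subnets for illustration.
for cidr in ("6.1.2.0/24", "192.168.1.0/24", "10.0.0.0/8"):
    print(cidr, conflicts_with_reserved(cidr))
```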