Disabled switch (bad DNS) - no internet connectivity for all devices

Marcelino
Comes here often

Hey all,

 

We have a problem on a hotel network with a Meraki MX, Meraki switches behind the MX, HP, Cisco and other switches behind those, and many Meraki and Ruckus APs.

 

This is the second time this has happened. When a group (usually Americans) arrives and starts hosting meetings, all the Meraki switches start going into Disabled switch (bad DNS) mode. Anyone connected to the switches or to the APs behind them then has no working DNS, so basically "no internet access", even though the client VLANs use different DNS servers than the switch management VLAN.

 

Disabled switch (bad DNS) occurs every 10-15 minutes and lasts 2-5 minutes each time. While a switch is in this mode, no DNS queries work, so the customer says "no internet access".

 

[Image: Disabled switch (Bad Dns).JPG]

Close-up of the pumping DNS:

[Image: Disabled switch (Bad Dns)2.JPG]

 

 

- I found nothing unusual from clients or traffic, except maybe 20-40 clients connecting to their VPN

- I have swapped the switch management DNS settings from ISP -> Google -> Internal but the problem persists

- I have tried both static and DHCP addressing for switch management but the problem persists

- I have changed switch MTU from default to 1500 but the problem persists

- I found no other DHCP servers within the network

- I found no new devices connected to LAN via ethernet, so it must be via WLAN (Ruckus and/or Meraki MR)

- I disabled RSTP on Ruckus AP ports but the problem persists

- I have tried IGMP snooping and "flood unknown multicast traffic" both enabled and disabled but the problem persists

- Firmware is up to date, and Meraki support has gone through the settings and found nothing, saying only that it is an "ISP problem", which it is not. The problem starts when the group starts working and ends when they stop.

 

So what seems to happen is that something causes all of the DNS traffic to pump in 10-15 minute cycles. It seems to affect only the switches (MS220 series), not the MX, and everything behind the switches. It hits management VLAN 2 (where the switches are) as well as all traffic VLANs.

 

Has anyone come across this or anything like it?

ww
Kind of a big deal

https://community.meraki.com/t5/Switching/Disabled-Switch-BAD-DNS/td-p/32382

 

So maybe check the logs of the other vendors' switches for any clues.

 

 

Marcelino
Comes here often

All of the customers with problems were directly connected to APs (Ruckus and Meraki MR) behind Meraki PoE switches. The HP and other switches sit behind a fiber link elsewhere, so let's leave them aside for now.

We also checked locally on each switch that the IP settings were up to date and had no duplicates.
PhilipDAth
Kind of a big deal

A long shot, but perhaps this is a spanning tree issue and something is taking over as the root.

 

Have you given whatever is the core switch in your network a low spanning tree priority, like 0?

https://documentation.meraki.com/MS/Other_Topics/Switch_Settings

Marcelino
Comes here often

Yes, the main Meraki MS220-48LP right after the firewall is set to bridge priority 0, so it is likely the root. RSTP is enabled, but while testing I disabled RSTP on the Ruckus AP ports and the problem still persisted.

HitoshiH
Meraki Employee

If the issue is reproducible in this environment, one troubleshooting step is to take a packet capture to see where the DNS query or answer is dropped between the device reporting the issue and the DNS server. The (Bad DNS) warning is shown when the device is unable to receive an answer from its configured DNS server.

As you may know, a packet capture can be taken on the Meraki dashboard from Network-wide > Packet capture by selecting the target switch and its uplink port.

You can then check whether the DNS query is sent from the device and whether the answer comes back to it properly.

If the DNS answer is not seen in the capture, the idea is to move the capture point closer to the DNS server.

Hope this process helps to find the root cause of this issue.
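Alongside the dashboard capture, a small probe run from a laptop on the switch management VLAN can show whether answers from the configured DNS server go missing at the same moments. What follows is only a rough sketch using the Python standard library: the function names are made up for illustration, and it is not a Meraki tool or an emulation of how the switch performs its own check.

```python
import random
import socket
import struct
import time


def build_dns_query(hostname: str, query_id: int) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 1, 1)


def probe_dns(server: str, hostname: str, timeout: float = 2.0):
    """Send one UDP query; return round-trip seconds, or None on timeout/error."""
    query_id = random.randint(0, 0xFFFF)
    packet = build_dns_query(hostname, query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # socket.timeout is a subclass of OSError
            return None
        elapsed = time.monotonic() - start
    # Accept the reply only if its ID matches the query we sent.
    if len(reply) < 2 or struct.unpack("!H", reply[:2])[0] != query_id:
        return None
    return elapsed


def run_probe_loop(server: str, hostname: str, interval: float, count: int):
    """Probe `count` times, printing one timestamped line per attempt."""
    for _ in range(count):
        rtt = probe_dns(server, hostname)
        status = "timeout" if rtt is None else f"{rtt * 1000:.1f} ms"
        print(time.strftime("%H:%M:%S"), server, status, flush=True)
        time.sleep(interval)
```

For example, `run_probe_loop("8.8.8.8", "example.com", interval=10, count=360)` would log one line every 10 seconds for about an hour. The server address and hostname here are placeholders; use whichever DNS server the switches are actually configured with.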

 

PhilipDAth
Kind of a big deal

@Marcelino although the DNS is failing - I bet it is not a direct DNS issue, but something else.

 

For example, DNS rate limiting somewhere, a spanning tree issue causing packets not to be forwarded, a duplicate IP address knocking something out, etc.

 

What I'm saying is - don't focus too tightly on just the DNS.  Look wider for other issues.
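One way to look wider is to turn a log of periodic reachability checks into outage windows, so that the 10-15 minute cycle can be lined up against other events (spanning tree changes, firewall logs, client counts). A minimal sketch, assuming the samples are `(timestamp_seconds, ok)` pairs from any probe; the `Outage` class and the function names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Outage:
    start: float  # timestamp of the first failed probe, in seconds
    end: float    # timestamp of the last failed probe, in seconds

    @property
    def duration(self) -> float:
        return self.end - self.start


def find_outages(samples):
    """Group consecutive failed probes into outage windows.

    samples: iterable of (timestamp_seconds, ok) pairs, sorted by time.
    """
    outages = []
    current = None
    for ts, ok in samples:
        if not ok:
            if current is None:
                current = Outage(start=ts, end=ts)  # a new outage begins
            else:
                current.end = ts                    # the outage continues
        elif current is not None:
            outages.append(current)                 # first success closes it
            current = None
    if current is not None:
        outages.append(current)                     # log ended mid-outage
    return outages


def gaps_between(outages):
    """Seconds from the start of each outage to the start of the next."""
    return [b.start - a.start for a, b in zip(outages, outages[1:])]
```

If the gaps come out at a steady interval regardless of load, that points toward a timer somewhere (a periodic check, a session timeout); if they track the number of active clients, a limit being hit is more likely.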

Marcelino
Comes here often

The problem started and ended with the group. I think they were using a lot of VPN connections to their server.

We did a live packet capture with Meraki support; the only finding was that pings sometimes failed to reach the DNS servers.

I don't think DNS itself is the problem; it is a symptom. The ISP also checked their fiber connections and router and found no errors.

The thing is, what could cause it? Do multiple VPN connections from the WLAN to the US cause Meraki switches to go bonkers?

PhilipDAth
Kind of a big deal

Did you have spare Internet bandwidth at the time, or did your Internet circuit get flatlined?

What model MX do you have, and what was the total number of clients you had?

 

If you go Organisation/Overview and select just the network for your appliance - what was the device utilisation like?

Marcelino
Comes here often


@PhilipDAth wrote:

Did you have spare Internet bandwidth at the time, or did your Internet circuit get flatlined?

What model MX do you have, and what was the total number of clients you had?

 

If you go Organisation/Overview and select just the network for your appliance - what was the device utilisation like?


We have a 500/500M fiber connection, with about 297 client devices at the time, using an MX84 with the balanced threat protection rule set. The maximum peak was 60 Mb/s, and usage was usually below 20 Mb/s.
Marcelino
Comes here often


Also, the utilization on the MX peaked at 25% and was mostly around 15%.
JorgeQ12
Conversationalist

Hello, I do not know if you managed to find the cause of this problem, but I have the same situation. At a school, the moment the staff begin to connect through Microsoft Teams, the network becomes unstable and loses internet connectivity, specifically DNS, and the Meraki dashboard shows Disabled Switch (Bad Dns) errors.
The firewall in front of the MS250 is a Forti whose policies had security profiles for web filtering, IPS, etc. After removing these security profiles, traffic flows smoothly and the errors on the Meraki dashboard are gone, even though the DNS filter profile was never enabled.
We are now in the process of detecting which of these profiles may be causing the problem.

Greetings.

Dan_G
New here

I have this now: third-party FortiGate firewalls terminating the outbound lines in front of Meraki switches. There is intermittent 'bad DNS' on multiple switches, and orange lights come and go on the front of the boxes as a result. Filtering is evident from a management laptop placed on the management VLAN. I guess the FortiGates are doing some rate limiting and killing the outbound DNS sessions on some of the switches, depending on how many sessions hit the filter at the time.

Thanks for posting your experience; this should allow me to get this addressed before all the users jump on and start moaning.
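To picture why rate limiting would only bite when many sessions arrive at once, here is a toy token-bucket model. It is purely illustrative: whether the FortiGate limits DNS this way is a guess, and the rate and burst numbers are invented for the example.

```python
def simulate_token_bucket(arrivals, rate_per_s, burst):
    """Return how many queries are dropped in each one-second slot.

    arrivals:   list of query counts, one entry per second
    rate_per_s: tokens added to the bucket each second (assumed limit)
    burst:      maximum number of tokens the bucket can hold
    """
    tokens = float(burst)  # start with a full bucket
    dropped = []
    for count in arrivals:
        # Refill once per second, never above the burst size.
        tokens = min(burst, tokens + rate_per_s)
        # Each forwarded query consumes one token.
        allowed = min(count, int(tokens))
        tokens -= allowed
        dropped.append(count - allowed)
    return dropped
```

With `simulate_token_bucket([5, 5, 50, 50, 5], rate_per_s=10, burst=20)` the result is `[0, 0, 30, 40, 0]`: nothing is lost at a light load, most queries are dropped during the two busy seconds, and the drops stop as soon as the burst is over. A switch whose own DNS check happens to land in a busy second would then see a failure even though the DNS server itself is healthy.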
