Troubleshooting MX > Switch configuration

JoshuaK
New here


Wondering if someone could weigh in on this scenario, as I'm confused about what happened.

Has anyone else run into this situation?

 

Network Info:

 

What we have:

[MX]----(Trunk - Allow All/Native VLAN 2)----[SW]---(Trunk - Allow All/Native VLAN 8)----[AP]

 

 

I can provide more info of course, I just have to be careful since this is a customer network.

 

  • All Meraki gear - MX84, MS225-48LP, and MR33's.

  • MX is static of course, the Switch and AP's were static, but since we rolled back, we left them as DHCP for the next attempt.

  • DHCP is at the MX, DHCP scopes are defined for the old and new VLANs.

  • Using DNS servers located at a data center on existing and new VLANs.

  • New VLANs are enabled on the VPN tunnel at the MX.

  • Switch Mgmt. VLAN on Dashboard is configured as VLAN 2.

  • All IPs are in completely different subnets, and all are /24.

  • The switches are not routing, nor have any L3 interfaces. Strictly layer 2.

 

 

The goal:

 

Change VLANs from 2 & 8 to 21 and 81, and update the subnets as well. Basically just replicating what we already have, but with different VLAN's and subnets:

 

[MX]----(Trunk - Allow All/Native VLAN 21)----[SW]---(Trunk - Allow All/Native VLAN 81)----[AP]

 

 

 

Troubleshooting:

 

 

I started at the AP's, and since they were set to static IP, I converted them to DHCP and then cycled the corresponding switchport to reboot them.

 

They rebooted, came up with an IP in the DHCP range for VLAN 8 as expected.

 

I then changed the native VLAN for each AP, waited a moment, then cycled the port once again to reboot each AP. As expected, they rebooted and came back with an IP in the DHCP range for VLAN 81.

 

The problems began when I moved upstream to the switch. When I changed the native VLAN from 2 to 21 on the uplink port, we lost connection with the switch temporarily, and then the dashboard reported that DNS was misconfigured. On the MX side, we configured the port connecting to the switch's uplink port as a trunk with native VLAN 21, allowing all VLANs. Again, basically the same, just a VLAN change and IP change. I should also add that the switch was set to DHCP at this time.

 

The DNS servers the switch would have used are located on the remote end of the auto VPN tunnel (this is one of many sites that connect back over VPN to a central MX in a data center), but they are the same DNS servers being used on the AP's we changed downstream from the switch, and the AP's were working fine during this time.

 

There is no overlap between this site's subnets and any other site's subnets.
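For anyone who wants to double-check the "no overlap" claim before a cutover, here is a minimal sketch using Python's standard `ipaddress` module. The subnet values and names below are placeholders I made up for illustration (the real ones belong to the customer and are not in this post), and `find_overlaps` is just a hypothetical helper name.

```python
import ipaddress
from itertools import combinations

def find_overlaps(subnets):
    """Return (name, name) pairs whose networks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]

# Placeholder /24s, NOT the real customer subnets.
site_subnets = {
    "vlan2_old": "10.10.2.0/24",
    "vlan8_old": "10.10.8.0/24",
    "vlan21_new": "10.10.21.0/24",
    "vlan81_new": "10.10.81.0/24",
}

print(find_overlaps(site_subnets))  # [] when nothing overlaps
```

Adding the subnets from the other VPN sites to the same dictionary would extend the check across the whole Auto VPN domain.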

 

VLAN 21 & 81 are enabled on the VPN.

 

I had Meraki support on the line. They confirmed that the switches were sending the DNS queries for the Meraki dashboard, and confirmed through the pcap that the MX was receiving responses from the DNS servers as expected, but the capture on the MX LAN interface facing the switch did not show the DNS responses being passed back to the switch. Again, the same configuration here.

 

We changed the native VLAN back to 2, and the switch worked fine. At this point, I reverted all changes and backed out of the maintenance until I could figure out what was going on.

 

Currently the switch is still on DHCP until we try to cutover again.

 

Did this happen because the SW Mgmt. VLAN was still set to 2 and had not been updated to 21?

 

 

The Rub:

 

No cellular fail-over, and because of physical distance, we cannot be on site to physically troubleshoot the switch. We have a WattBox out there to help with power cycling, but that won't resolve a switch pulling down an incorrect configuration from the dashboard. And the WattBox is offline from the network (apparently for the last 5 months).

I'm thinking this is a simple misconfiguration, but my experience with Meraki is not all-encompassing, so I'm reaching out for help. Any insight is greatly appreciated.

1 Reply
GIdenJoe
Kind of a big deal

I have seen your post on Reddit and I'll answer the same here.

I think you are trying to change too many things at once, causing a potential order-of-operations issue.

You should start with changing the native VLANs towards the AP's (assuming you don't configure mgmt VLAN on AP's of course) and let them come online again on their new VLAN.  Maybe verify the DNS queries/responses towards the AP's are going both directions.

Then, before changing the native VLAN config between MX and MS, I would first try to change the mgmt VLAN on the switch and see if it comes up with or without the DNS issue. So make sure your VLAN 21 is passed between the MX LAN ports and the MS uplink port. Maybe verify on the switch that you can actually see the MX MAC address on VLAN 21 before you change the mgmt VLAN. Then do all your DNS testing with captures, just to see if everything else works (the VPN is passing the requests and responses).

Then finally, if that step works, change the native VLAN on the switch first, wait for it to come back, and then make the change on the MX too. If that fails, there is clearly a layer 2 issue. Normally, if you don't have secondary links or weird links between an HA pair of MXs, you shouldn't have these problems. However, it can sometimes happen that a change does not fully go through on the switch without a reboot, so the dashboard thinks the VLAN config is correct while the switch itself is not behaving that way.
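To make the order of operations explicit, here is a rough sketch of the sequence above as a staged checklist that stops at the first step that fails verification. Everything in it is illustrative: `run_staged_change` is a hypothetical helper, and each check is a placeholder lambda standing in for a manual verification (dashboard status, MAC table, packet capture), not a real Meraki call.

```python
def run_staged_change(steps):
    """Run (description, check) pairs in order; stop at the first failed check.

    Returns (completed_descriptions, failed_description_or_None).
    """
    completed = []
    for description, check in steps:
        if not check():
            return completed, description
        completed.append(description)
    return completed, None

# Placeholder checks: swap each lambda for a real verification you perform.
steps = [
    ("AP ports on native VLAN 81, APs back online", lambda: True),
    ("VLAN 21 passed MX <-> MS uplink, MX MAC seen on VLAN 21", lambda: True),
    ("Switch mgmt VLAN set to 21, DNS queries and responses both seen", lambda: True),
    ("Native VLAN 21 on the switch uplink, switch back online", lambda: True),
    ("Native VLAN 21 on the MX port", lambda: True),
]

completed, failed_at = run_staged_change(steps)
print("completed:", len(completed), "stopped at:", failed_at)
```

The point of the structure is simply that no later change is attempted until the previous one has been verified, so a failure points at exactly one step.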

There should be someone onsite who can place their PC/laptop in the 1.1.1.0/24 range so you can try to reach the switch locally on 1.1.1.100, log in locally to check where the switch is failing and, if needed, change the uplink native VLAN locally.
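If the onsite person is not very technical, a small script can help confirm the laptop is set up correctly. This is only a sketch: the 1.1.1.0/24 range and 1.1.1.100 come from the paragraph above, while the helper names and the use of TCP port 80 for the local page are my assumptions.

```python
import ipaddress
import socket

SWITCH_LOCAL_IP = "1.1.1.100"                      # from the reply above
LOCAL_NET = ipaddress.ip_network("1.1.1.0/24")     # from the reply above

def is_valid_laptop_ip(candidate):
    """True if candidate is a usable host address in 1.1.1.0/24 other than the switch."""
    ip = ipaddress.ip_address(candidate)
    reserved = {
        LOCAL_NET.network_address,
        LOCAL_NET.broadcast_address,
        ipaddress.ip_address(SWITCH_LOCAL_IP),
    }
    return ip in LOCAL_NET and ip not in reserved

def can_reach(host, port=80, timeout=3.0):
    """Try a TCP connection; port 80 is an assumption for the local page."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(is_valid_laptop_ip("1.1.1.99"))   # True
print(is_valid_laptop_ip("1.1.1.100"))  # False, that is the switch itself
```

Once the laptop has a valid static address and is plugged into the switch, calling `can_reach(SWITCH_LOCAL_IP)` gives a quick yes/no before opening a browser.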
