All switches get disconnected after MX upgrade to 15.42 and 15.43

Solved
Network_user55
Here to help

Hello,

We have a critical issue with Meraki MX upgrades: after the upgrade, all of our switches disconnect from the network and no user can connect anymore. It has happened three times now. The MX itself stays online and operational, and rolling back to the previous version solves the problem, but it means we can no longer upgrade the MX. I am posting here to see whether anyone else has had this issue, as we are not getting any significant help from Meraki support. The same issue affects all of our Meraki locations, so after the last automatic software update from Meraki, all networks were down the following morning.

I do not have any debugging messages to share, as the information in the dashboard is limited and Meraki support didn't see much either.

I know this is a long shot given how little information I can provide, but maybe someone has an idea. Hopefully someone has experienced the same issue.

We are currently on version 15.37.

1 Accepted Solution
Network_user55
Here to help

@cmr @PhilipDAth 
Thanks for the feedback. The solution turned out to be different, though, so I will summarize it here in case someone else has the same problem.

After upgrading the MX to version 15.41 or higher, the MX no longer sends DHCP traffic to the switches from its physical MAC address. It now sends everything from its virtual MAC address, and our switches only accepted DHCP traffic from the physical MAC addresses of the MXs. The setting to whitelist the virtual MAC address on the switches can be found under Switch -> DHCP servers & ARP.

To find the virtual MAC of the MX, do the following conversion:

The first three octets of the VRRP virtual MAC are always the same (cc:03:d9). The last three octets are the last three octets of the MAC address of the primary MX’s interface. On the LAN side, that is the MAC address listed in the dashboard. On the WAN side, it is that MAC address plus whatever the uplink offset is.

If your primary MX’s MAC address is 11:22:33:44:55:66, then the VRRP virtual MAC would be cc:03:d9:44:55:66.
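
For anyone who has to do this for several sites, the conversion can be sketched in a few lines of Python. This is only an illustration of the rule described in this post, not a Meraki tool: the function name `mx_virtual_mac` and the `uplink_offset` parameter are made up for the example, and the way the offset is applied (added to the whole MAC before taking the last three octets) is an assumption, so check the result against what you actually see on the wire.

```python
# Fixed first three octets of the virtual MAC, as described in this post.
VRRP_PREFIX = "cc:03:d9"


def mx_virtual_mac(physical_mac: str, uplink_offset: int = 0) -> str:
    """Derive the VRRP virtual MAC from the primary MX's interface MAC.

    uplink_offset is hypothetical: 0 for the LAN side, and whatever the
    uplink offset is on the WAN side (assumed to be a simple integer add).
    """
    octets = physical_mac.lower().replace("-", ":").split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a valid MAC address: {physical_mac!r}")

    # Treat the MAC as one 48-bit number so the offset can carry across octets.
    value = int("".join(octets), 16) + uplink_offset

    # Keep only the last three octets (low 24 bits) and format them.
    low24 = value & 0xFFFFFF
    suffix = ":".join(f"{(low24 >> shift) & 0xFF:02x}" for shift in (16, 8, 0))
    return f"{VRRP_PREFIX}:{suffix}"


if __name__ == "__main__":
    # The example from this post.
    print(mx_virtual_mac("11:22:33:44:55:66"))  # cc:03:d9:44:55:66
```

The output is the address to add to the allowed list under Switch -> DHCP servers & ARP, alongside the physical MACs that are already there.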

 

Once we whitelisted these virtual MACs, the upgrade went fine and the switches stayed connected.

11 Replies
PhilipDAth
Kind of a big deal

When the issue happens, go to the local status page of an MS and see why it reports that it has gone offline.

Something else to check to make upgrades smoother: make sure the MX has only a single connection to your switches, to avoid spanning tree problems.

Network_user55
Here to help

Thanks. I didn't really see much on the local status page last time, but I will try again. Currently we are using stacked switches with two ports connected from the MX. That is, one connection goes to one switch in the stack and the other connection goes to another switch in the stack.

cmr
Kind of a big deal

Excellent news that you fixed it. I did post the update about the MAC address from the release notes, but I didn't realise it was related to this issue. That was a good piece of detective work 😎

If my answer solves your problem please click Accept as Solution so others can benefit from it.
DarrenOC
Kind of a big deal

That’s a strange one, and not something we’ve come across.

We have customers running 15.42 on their MXs, and their downstream networks are running fine.

Darren OConnor | doconnor@resalire.co.uk
https://www.linkedin.com/in/darrenoconnor/

I'm not an employee of Cisco/Meraki. My posts are based on Meraki best practice and what has worked for me in the field.
cmr
Kind of a big deal

@Network_user55 we have 9 switches behind MXs running 15.43 (and they were running 15.42 before that). The switches behind the 15.43 MXs are running 14.22 (eight 120, 210, 220, 225 models) and 14.27 (a 120 that was just upgraded from 14.26 today). These switches are all L2 only, but we had L3 switches (MS210s) behind MXs running 15.42 before those MXs were upgraded to 16.11.

 

What switch models are you using?

How are they interconnected?

How is the management traffic routed, and is the IP dynamically or statically assigned?

What license do you have on the MXs?

Are the MXs dual-connected to the switches? If so, try disabling one of the connections.

Network_user55
Here to help

Thanks for the reply @cmr. I will try to answer the questions here as well as I can.

 

Switch models: MS225-24P/LP, MS225-48P/LP. 

 

Interconnections: I have added a screenshot of the topology. Rather basic I would say. The switches in the middle are connected through stacking. The MX has two outgoing connections, one to one of the switches in the stack and another to another switch in the stack. 

 

Management traffic: The LAN IPs on the switches are assigned by DHCP. I am not sure what you mean by routing in this case. The management traffic between the MX and the switches is on the same network.

 

License on MX: Advanced security - expires in 2024. 

 

MXs and dual connections: See the answer above regarding interconnections. Do you mean the idea is to cut one of the two connections, even though they go to separate switches?

 

Thanks again!

[Attached screenshot: 2021-08-17 06_19_58-Window.jpg]

cmr
Kind of a big deal

Thanks @Network_user55, it looks like you have an HA pair of MXs. Can you please clarify: is each MX connected to two switches, or is one MX connected to one stacked switch and the other MX connected to a separate switch in the same stack?

If the former, please disconnect one cable from each MX and try again. If the latter, then that isn't the problem.

Network_user55
Here to help

@cmr  thanks.

The current setup is like the former. There are three switches in the stack, and two of them are connected to the MXs as follows:

 

1) SW1 -> MX 1 and MX 2. Both are in a forwarding state.

2) SW2 -> MX1 and MX 2. Both are in a blocking state.

 

So the suggestion would be to disconnect both links on SW2 (SW2 -> MX1 and MX2)? Wouldn't that remove HA from the setup? Also, why would that cause an issue after the upgrade but not before it?

Thanks again! 

cmr
Kind of a big deal

@Network_user55 I'd disable the port on switch 1 for MX2 and the port on switch 2 for MX1. That way you keep switch and MX redundancy without risking spanning tree not working when the units reboot. I have had too many problems caused by reboots on double-connected units to risk leaving all the ports enabled...

 

It might make no difference, and if the switch ports connecting to the MXs are set to trunk mode then you might be okay without making this change. It's still worth a try, even if you re-enable the ports after the upgrade.

Network_user55
Here to help

@cmr thank you, I will suggest this. The ports are already set to trunk mode, but it's worth a try.
