I have 1 MX84 and 12 MS120-24P switches that I'm setting up for a new install.
On September 22nd, our DSL Internet Link went down.
On September 24th, the DSL router was rebooted. Once the MX84 came online, Switch 1 flooded the network with broadcast traffic to all the switches, creating a broadcast storm. All the switches went offline and didn't come back up until I rebooted Switch 1. (Meraki Support is stating this is what happened.)
Has anyone run into this before? We're running 10.35 on the switches. Meraki had me do port mirrors from all the uplink ports on switches 1 & 7 to an access port on those same switches. They asked me to disconnect the WAN link to the MX84 for 24 hours and then connect it back to see if we can replicate this issue.
The MXs do not participate in spanning tree, so that may have been the cause of your issue. Just an idea, as I've had issues with multiple uplinks going from the MX to the LAN when spanning tree isn't working properly.
@NSGuru wrote: The MXs do not participate in spanning tree, so that may have been the cause of your issue. Just an idea, as I've had issues with multiple uplinks going from the MX to the LAN when spanning tree isn't working properly.
Meraki Support told me they don't support multiple links from one switch to the MX84. They said a single uplink from multiple switches is fine. Having just 1 switch with a connection to the MX84 creates a SPOF, right?
I have had issues like this with 9.x firmware, multiple times, but not with the 10.x firmware.
This is probably cold comfort, but I always strive for a design that is as loop-free as possible. In this case, if you used MS210 switches, you could stack them together. You could then use LACP to all the downstream switches. The only loop left in the network would then be through the MX84 itself. This is likely to be a lot more solid.
@PhilipDAth wrote: I have had issues like this with 9.x firmware, multiple times, but not with the 10.x firmware.
This is probably cold comfort, but I always strive for a design that is as loop-free as possible. In this case, if you used MS210 switches, you could stack them together. You could then use LACP to all the downstream switches. The only loop left in the network would then be through the MX84 itself. This is likely to be a lot more solid.
I have been suggesting stackable switches for some time. Here in our HQ, we use stackable 3850s for our access switches, going to 2 Catalyst 6807 switches via 10Gb links. Someone else here supports the remote offices where we're putting this Meraki gear, and he had the initial design discussion with them. We even discussed the same design you're recommending, with the 2 stacked switches as the distribution layer and 120s for access.
I tried replicating the issue this morning following the steps Meraki Support suggested, but I couldn't replicate it. I noticed that the uplinks from Switch 1 and 7 were both in forwarding mode so I connected a patch cable from switch 7 to switch 1. This caused switch 7 to put its connection to the MX84 in blocking mode since switch 1 is the root. I ran a test to ensure STP "failover" between Switch 1 and Switch 7 and back by doing the following:
I don't know if this will prevent that broadcast storm that Meraki said happened, but it will ensure that both switch 1 and switch 7 don't have their uplinks to the MX84 in forwarding mode at the same time.
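To illustrate why it matters that only one of the two uplinks forwards, here is a minimal Python sketch of the triangle described above (MX84, Switch 1, Switch 7). It assumes just the three links mentioned in this post and, as a simplification, treats the MX84 as a plain bridge between its LAN ports. The node names and the `has_loop` helper are made up for the example and are not anything Meraki provides.

```python
def has_loop(links, blocked=frozenset()):
    """Return True if the forwarding (non-blocked) links contain a loop.

    Uses a union-find structure: a link whose two ends are already
    connected through other forwarding links closes a loop.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in links:
        if frozenset((a, b)) in blocked:
            continue  # STP has this link in blocking mode
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True
        parent[root_a] = root_b
    return False


# Hypothetical topology: both switches uplinked to the MX84,
# plus the patch cable between Switch 7 and Switch 1.
LINKS = [("SW1", "MX84"), ("SW7", "MX84"), ("SW1", "SW7")]

print(has_loop(LINKS))  # every link forwarding
print(has_loop(LINKS, blocked={frozenset(("SW7", "MX84"))}))  # SW7 uplink blocked
```

With every link forwarding the check reports a loop; with Switch 7's uplink to the MX84 blocked, the remaining links form a loop-free tree.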
You know, it'd really be awesome if Meraki would give us visibility under the hood: CPU utilization, temperature sensors, etc.
You could have set an alert to notify you when CPU utilization went above 50 or 60%. That might have prevented your network from going down entirely if you could have gotten to it early.
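The kind of alert I have in mind could look something like the Python sketch below. It is purely hypothetical: as noted, the dashboard doesn't expose switch CPU, so the readings are a hard-coded list, and `first_alert_index`, the 60% threshold and the three-sample window are illustrative choices, not Meraki features.

```python
def first_alert_index(readings, threshold=60.0, consecutive=3):
    """Return the index of the sample at which an alert would fire, or None.

    The alert fires once `consecutive` readings in a row exceed `threshold`,
    so a single short spike does not trigger it.
    """
    run = 0
    for i, value in enumerate(readings):
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return i
    return None


# Made-up CPU percentages, one sample per polling interval.
samples = [22, 35, 71, 40, 65, 78, 93, 99]
print(first_alert_index(samples))
```

Requiring a few consecutive samples is a trade-off: it delays the alert slightly but cuts down on false alarms from momentary spikes.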
I wish Meraki would implement ALL features of the Cisco IOS platform in their switches.
It will happen down the road. I'm just thankful that Meraki gear is so easy to manage, and the out-of-the-box visibility has helped me out tremendously. Cisco's new DNA Center brings Meraki-like ease to Cisco IOS, but at a higher cost and with subscriptions.
Also, maybe look into the SNMP MIB; that may get you CPU/temperature info through SNMP. It's not as nice as it could be in the dashboard, but PRTG has an integration:
https://kb.paessler.com/en/topic/59986-help-monitoring-meraki-network
Hi,
I'm sorry, but it's an MX84 hardware issue that cannot be patched.
If you have the option of trying an MX100, it should be OK.
I spent many months with Meraki engineers and a product manager troubleshooting this, and that is what Meraki concluded.
We even tried with a hot spare.
I replaced the MX84 with an MX100 without touching the LAN or the architecture, and it's working perfectly.
We don't deploy any MX84s in this kind of environment.
Regards,
Stephane F
We swapped out our MX84s for MX100s, and it still happened with 2 MX100s in an HA configuration. Meraki no longer recommends a direct connection for heartbeats between the 2 MX devices. We now have 2 MX100 devices with the HA one powered off. This is ridiculous.