MX100, Strange DHCP events/storm after V15.42.1 upgrade

Spack
Getting noticed


We have an MX100 security appliance.  We have exclusively Meraki switches and WAPs.  Within 8 hours of upgrading to 15.42.1, our network started to bugger out.  Log inspection reveals all DHCP servers are releasing all DHCP addresses on all VLANs, regardless of where the DHCP server is.  We have 5 networks served with Meraki DHCP and one served with Microsoft DHCP (Server 2016, v1607, up to date on patching).  Does anyone know of anything that would cause all DHCP servers to flush themselves and re-ask everyone to get new IPs?

All networked devices with DHCP addresses lose connectivity, including office phones.  When the storm is over and all DHCP addresses have been released, the Meraki puts a little cherry on top and says "Oh look I found a rogue DHCP server!" except it is not rogue.  It is configured as an allowed DHCP server.  Here is a log cut.

[Screenshot: rogue 1]

I can't find anything strange in the logs that would cause this to occur.  I'd love to hear your thoughts!
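For anyone trying to line up the outages with the event log, here is a minimal sketch of one way to spot the "storm" minutes. It assumes the log has already been exported and parsed into `(timestamp, event_type)` pairs; the event names, the sample data, and the threshold are all hypothetical and not taken from the actual Meraki log format.

```python
from collections import Counter
from datetime import datetime


def find_bursts(events, threshold):
    """Return {minute: count} for every minute with at least `threshold` events."""
    # Truncate each timestamp to the minute and count events per minute.
    per_minute = Counter(ts.replace(second=0, microsecond=0) for ts, _ in events)
    return {minute: n for minute, n in sorted(per_minute.items()) if n >= threshold}


# Hypothetical sample: three events in one minute, one stray event later.
sample_events = [
    (datetime(2021, 5, 21, 10, 15, 2), "dhcp_release"),
    (datetime(2021, 5, 21, 10, 15, 9), "dhcp_release"),
    (datetime(2021, 5, 21, 10, 15, 41), "dhcp_release"),
    (datetime(2021, 5, 21, 11, 30, 0), "dhcp_lease"),
]

if __name__ == "__main__":
    for minute, count in find_bursts(sample_events, threshold=3).items():
        print(minute.isoformat(), count)
```

Comparing the flagged minutes against the times users reported drops would show whether every outage coincides with a burst of releases.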

 

PhilipDAth
Kind of a big deal

What is acting as the DHCP server?

 

What is your DHCP lease time?

For all VLANs except 2, the Meraki is the DHCP server. 

The Meraki lease time is 8 hours. 

For VLAN 2, a Windows Server 2016 machine is running DHCP. 

The Windows lease time is 8 days.

PhilipDAth
Kind of a big deal

The problem happened after 8 hours, and you have an 8-hour lease duration.

 

I suspect the power cycle caused devices to refresh their DHCP lease at the same time, making them all fall due at the same time, or perhaps the upgrade caused the issue date to be set to the same time making them all fall due at the same time.
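The timing argument can be sketched as follows. By default (RFC 2131) a client tries to renew at T1 = 50% of the lease and rebind at T2 = 87.5% of the lease, so clients that all picked up a lease right after a reboot reach those timers close together. The client names, the reboot time, and the 8-hour lease below are illustrative assumptions only, and real clients add some random fuzz around the timers.

```python
from datetime import datetime, timedelta


def renewal_times(lease_start, lease_seconds):
    """Return (T1, T2, expiry) for a lease using the RFC 2131 default timers."""
    t1 = lease_start + timedelta(seconds=lease_seconds * 0.5)    # renewal attempt
    t2 = lease_start + timedelta(seconds=lease_seconds * 0.875)  # rebinding attempt
    expiry = lease_start + timedelta(seconds=lease_seconds)      # lease runs out
    return t1, t2, expiry


# Hypothetical: clients that re-acquired leases seconds apart after a reboot.
reboot = datetime(2021, 5, 21, 8, 0, 0)
clients = {
    "phone-1": reboot + timedelta(seconds=5),
    "laptop-1": reboot + timedelta(seconds=20),
    "printer-1": reboot + timedelta(seconds=45),
}

if __name__ == "__main__":
    for name, start in clients.items():
        t1, t2, expiry = renewal_times(start, lease_seconds=8 * 3600)
        print(f"{name}: renew {t1:%H:%M:%S}, rebind {t2:%H:%M:%S}, expires {expiry:%H:%M:%S}")
```

Keep in mind that a normal renewal at T1 keeps the same address, so on its own it should not interrupt traffic.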

 

Either way, I don't think you have a problem.

It is happening about 3 times a (business) day, causing everything on the network to go dead for about 1 min.  Yesterday it happened 5 times during the business day.  Today it has happened 3 times.  Everything, including all my IP phones and people using their computers for Zoom calls... all get interrupted for about a min.

It is a problem and it is interfering with people's ability to do their jobs.

And it means that people have to talk to me.  I don't like it when people talk to me...because that means things are broken...


Thanks,


Thomas

 

PhilipDAth
Kind of a big deal

Is it possible to increase the lease time to 24 hours to see how the problem changes?

The other thing you could try to "reset" DHCP on the MX is to turn DHCP off, save, and then turn it back on again (and save).  It will keep all settings, but dump the DHCP database.

Gotcha.    That is a little more difficult.  But yes, I will keep that in mind.

cmr
Kind of a big deal

@Spack I'd also try turning off the options to control rogue DHCP servers (for testing).

Spack
Getting noticed

The only thing it is supposed to be doing is sending an email when it detects rogue DHCP. But I’ll turn it off.

cmr
Kind of a big deal

@Spack, I was referring to Dynamic ARP Inspection.  You may already have it disabled (default) here, but if not then the ports connected to the MXs should be set as trusted:

 

cmr_0-1621875293520.png

 

 

Spack
Getting noticed

It is in fact disabled.  I think we're just going to try rolling it back.  Meraki has not responded at all to my case opened Friday.  I'm not quite sure what to make of that.

cmr
Kind of a big deal

@Spack I always log a ticket and if urgent call Meraki, they are usually then very helpful.  The wait time for me has never been more than 10-15 minutes.

Spack
Getting noticed

We rolled it back.  No issues since then.  So it absolutely was the firmware update.

@cmr I noticed that they took all the urgency options out of the create a ticket screen. 

I ASSumed <cough> that meant they would triage the tickets better...  My bad.

 

That being said, they emailed me at 3PM my time today.  I replied and am waiting for their reply.

 

Absolutely.  I'll do that right now.

Spack
Getting noticed

Set to one week for all Meraki-managed DHCP services.

Lease duration was 12 hours.  Can't have 8-hour leases in the Meraki portal.  My bad.  Thanks for the suggestion.  At least the community responds.

Spack
Getting noticed

Still happening too. The only other recent change was the installation of an MR44 WAP.
