I had a very interesting problem today that I wanted to share.  I recently migrated a customer from a close-to-default Catalyst 3750X switch running L3 to an MS425 running L3.  The migration went well, with no real issues to speak of, until 2 weeks later.  We were performing a NetScaler (ADC) failover, and the failover failed: once the NetScaler (ADC) failed over, nothing could reach the ADC.

As you know, there isn't much of a CLI on the Meraki to really dig into, but after I dug into how failover was configured on the NetScaler, read a few Meraki and NetScaler articles, and opened a Meraki TAC case, I figured it out.  The default failover methodology for NetScaler uses "GARP" (Gratuitous ARP).  Each NetScaler in the cluster has its own discrete MAC address, and when the ADC fails over, the new active ADC sends out a Gratuitous ARP to let the Meraki know that it now owns all the IPs.


Here's where the problem lives:

The ADC, by default, sends its Gratuitous ARP with an operation code (opcode) of 1, which means it's a Gratuitous ARP "REQUEST".  Meraki MS switching at L3 will ONLY react to a Gratuitous ARP with an opcode of 2, which means it's a Gratuitous ARP "REPLY".  Opcode 1 and opcode 2 can be seen in a Wireshark capture.  The fix was one of 2 things: reconfigure the ADC to send a Gratuitous ARP "REPLY" instead of a REQUEST, or change the failover methodology to use vMAC (basically a floating virtual MAC address), like a firewall in an HA pair.
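To make the difference concrete, here is a rough Python sketch of the two Gratuitous ARP variants. In the ARP header the opcode field is 1 for a request and 2 for a reply, and a Gratuitous ARP is one where the sender IP and target IP are the same. The function names, the example MAC/IP, and the choice of target MAC for each variant are my own assumptions for illustration. This is not what the NetScaler or the Meraki actually runs, it just builds and decodes the 28-byte ARP payload.

```python
import socket
import struct

ARP_REQUEST = 1  # opcode for an ARP request
ARP_REPLY = 2    # opcode for an ARP reply

# htype, ptype, hlen, plen, opcode, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"


def build_garp(mac: str, ip: str, opcode: int) -> bytes:
    """Build the ARP payload (no Ethernet header) for a gratuitous ARP."""
    sender_mac = bytes.fromhex(mac.replace(":", ""))
    sender_ip = socket.inet_aton(ip)
    # Assumption: zero target MAC for the request form, broadcast for the
    # reply form. Real devices vary in what they put in this field.
    target_mac = b"\x00" * 6 if opcode == ARP_REQUEST else b"\xff" * 6
    # Gratuitous: the target IP is the sender's own IP.
    return struct.pack(
        ARP_FORMAT, 1, 0x0800, 6, 4, opcode,
        sender_mac, sender_ip, target_mac, sender_ip,
    )


def describe(payload: bytes) -> str:
    """Classify an ARP payload as request/reply, and whether it is gratuitous."""
    _, _, _, _, opcode, _, sender_ip, _, target_ip = struct.unpack(ARP_FORMAT, payload)
    kind = {ARP_REQUEST: "request", ARP_REPLY: "reply"}.get(opcode, "unknown")
    return f"gratuitous {kind}" if sender_ip == target_ip else kind


if __name__ == "__main__":
    for op in (ARP_REQUEST, ARP_REPLY):
        print(describe(build_garp("02:00:00:00:00:01", "192.0.2.10", op)))
```

In a capture, the same field shows up as `arp.opcode` in Wireshark, so filtering on it is a quick way to check which variant your ADC is sending during a failover.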


Here's the Citrix NetScaler document that outlines the "GARP" configuration.  Meraki has NO online documentation explaining how the MS switching line reacts to Gratuitous ARP.  I opened a ticket, and the TAC support person told me that the Meraki MS switching line only reacts to a REPLY and not a REQUEST.  We submitted a feature request to make the MS switch respond to both REPLY and REQUEST Gratuitous ARPs.


I hope this info helps someone out there.  Happy MeRaKiNg!

Great one, thanks for sharing!

Did you end up getting this resolved? I am especially curious if Meraki acted on your feature request and made it so the switch would respond to both the REPLY and REQUEST GARPs.

Great catch there!

I know that this has not been resolved, and I've also asked Meraki to do something about it, but I do not hold out much hope after all this time.  The reason given is that the GARP process can be used maliciously.  However, Cisco CAN do this!  Could they not work together to find a solution?  Microsoft says the solution is to talk to your switch vendor!

The problem is, there are so few complaints on this topic that they likely won't fix it.  Since there are several workarounds, no one seems to complain.  I did come across this a few more times since I made the original post, and each time I just changed the ADC configuration to work with the switching.  Cisco is about the big customers, the big clients (the whales).  The little people like us get heard, but we are simply patted on the head and asked to move along.

