We have a large network (200+ MXs) and provide a managed service to our customers, so we try to respond proactively to customer issues.
Every site has a primary internet provider as well as a cellular modem for failover (not load balanced).
We recently had to disable the VPN status alerts because any minor fluctuation in either the primary or the cellular link would trigger an alert email. We were getting thousands every day, with no meaningful disruption to the VPN.
Our current challenge is the "Uplink Status Change" alert. Same thing here: it seems to be mostly noise. A site switches to Internet2 and back to Internet1 within moments (usually 1-15 minutes). Again, the customer doesn't notice any meaningful disruption. This creates noise for the techs, and we miss the cases where things go down and don't come back.
Recently we discovered a couple of sites that had been running on cellular for 30+ days. (The failed WAN also disappears completely from the dashboard after some time, but that is a different issue.)
And finally, the client monitoring alerts. The MX doesn't seem to actively monitor the client devices; instead it only marks a client "Online" if that client has sent a packet to the MX recently. This is not particularly useful for devices that primarily communicate on the LAN (APs, switches, etc.), as they just show as "Offline" all the time.
Also, if the client legitimately reboots a machine, we get an alert immediately and another when it comes back up. This creates noise and causes us to miss issues when a machine goes down and doesn't come back.
This is frustrating for our clients, as we miss issues we are supposed to be managing, and frustrating for our techs, as they may be required to follow up on meaningless alerts, often after hours, which costs us overtime.
Ideally we would have some sort of threshold, e.g. if we are on WAN2 for 30+ minutes, notify a technician; if a device goes offline for 30+ minutes, notify a tech; and so on. This would be similar to the threshold for the "MX Offline" notification.
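To make the idea concrete, here is a rough sketch in Python of the kind of threshold logic I have in mind. It is not anything Meraki provides: the class name `ThresholdAlerter`, the key strings, and the `healthy` flag are all made up for illustration, and how the status samples would actually be collected (polling, webhooks, etc.) is deliberately left out. It only shows the "alert once after 30+ minutes, stay quiet on short flaps" behaviour.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ThresholdAlerter:
    """Hypothetical helper: alert once per key after `threshold` of bad samples."""
    threshold: timedelta = timedelta(minutes=30)
    _bad_since: dict = field(default_factory=dict)  # key -> time of first bad sample
    _alerted: set = field(default_factory=set)      # keys we have already alerted on

    def observe(self, key: str, healthy: bool, now: datetime) -> bool:
        """Record one status sample; return True only when a tech should be notified."""
        if healthy:
            # Recovery clears the timer, so a short failover or reboot never alerts.
            self._bad_since.pop(key, None)
            self._alerted.discard(key)
            return False
        started = self._bad_since.setdefault(key, now)
        if key not in self._alerted and now - started >= self.threshold:
            self._alerted.add(key)  # one alert per outage, not one per sample
            return True
        return False
```

For an uplink, "healthy" would mean the site is on its primary WAN; for a client device, it would mean the device is reachable. A site that fails over to cellular for 10 minutes and comes back would never produce an alert, while a site stuck on cellular would produce exactly one alert at the 30-minute mark.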
Does anyone have any tips on this, besides scrapping the Meraki monitoring altogether and investing in an enterprise monitoring system?