WAN uptime monitoring

Messy
Here to help

Hello,

I've been asked whether I can get WAN uptime stats for all our networks.

Ideally I'd get historical data, but if that's not possible, I'd at least like to set up something to record live info so we have data going forward.

I can't find any endpoints that look like they have what I need. Am I missing one?

Thanks

 

Mloraditch
Head in the Cloud

I think these two are what you want:

https://developer.cisco.com/meraki/api-v1/get-organization-uplinks-statuses/

https://developer.cisco.com/meraki/api-v1/get-device-loss-and-latency-history/

Mostly the first, but the second may provide additional data you want.

 

If you found this post helpful, please give it Kudos. If my answer solves your problem please click Accept as Solution so others can benefit from it.
Messy
Here to help

Sorry, only just saw your reply.

Get Organization Uplinks Statuses: for this one I would have to constantly poll every network every minute, forever.

Get Device Loss And Latency History: this one might be doable, but I'd have to call it with a 1-minute resolution, which will return a huge amount of data. I guess that isn't an issue.
Then I would have to compare the loss % from both MX devices at each time interval to see whether there was downtime (both MXs down).

Might have to give it a go, as I don't see another option.
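
The comparison described above could be sketched roughly as follows. This is only an illustration: `both_down` is a hypothetical helper, the input is assumed to be already reduced to `(timestamp, loss_percent)` pairs per MX, and treating 100% loss as "down" is an assumption rather than anything the API defines.

```python
def both_down(primary, spare, threshold=100.0):
    """Return the timestamps at which BOTH series report loss >= threshold.

    primary, spare: lists of (timestamp, loss_percent) pairs, one per MX.
    Samples with a missing (None) loss value are treated as unknown and
    are skipped rather than counted as downtime.
    """
    spare_by_ts = dict(spare)
    down = []
    for ts, loss in primary:
        other = spare_by_ts.get(ts)
        if loss is None or other is None:
            continue  # no data for this interval on at least one device
        if loss >= threshold and other >= threshold:
            down.append(ts)
    return down


# Made-up sample data at 1-minute resolution.
primary = [
    ("2025-01-22T10:00:00Z", 0.0),
    ("2025-01-22T10:01:00Z", 100.0),
    ("2025-01-22T10:02:00Z", 100.0),
]
spare = [
    ("2025-01-22T10:00:00Z", 100.0),
    ("2025-01-22T10:01:00Z", 100.0),
    ("2025-01-22T10:02:00Z", 2.5),
]
outage = both_down(primary, spare)
print(outage, "->", len(outage), "minute(s) with both MXs down")
```

Multiplying the number of returned timestamps by the sampling resolution gives an approximate downtime figure for the window.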

Mloraditch
Head in the Cloud

Yeah, it's not the easiest thing to do yourself. Depending on the size of your deployment, there may be third-party tools, like the ones the other commenter posted about, that could do it more cost-effectively.

Messy
Here to help

I just found this - 
/networks/{networkId}/insight/applications/{applicationId}/healthByTime

[
  {
    "startTs": "2018-02-11T00:00:00Z",
    "endTs": "2018-05-12T00:00:00Z",
    "wanGoodput": 20000000,
    "lanGoodput": 100000000,
    "wanLatencyMs": 10.1,
    "lanLatencyMs": 3.2,
    "wanLossPercent": 0.2,
    "lanLossPercent": 0,
    "responseDuration": 210,
    "sent": 1000,
    "recv": 5000,
    "numClients": 2
  }
]

In theory I could call this at a certain time resolution and then look at the WAN loss % to decide whether it was up or down. Seems like a very clunky way of doing it, though 😕

It's going to be a metric crap-ton of calls.

MartinS
Building a reputation

Yes, it's a lot of calls, and it's quite a lot of work to sort the data out and report on it cleanly. This is where I'm going to plug the Meraki Marketplace, as there are a number of us Ecosystem partners who can do this for you in a matter of minutes:

Cisco Meraki Marketplace

Declaration of interest - I work for www.highlight.net 

---
COO
Highlight - Service Observability Platform
www.highlight.net
Messy
Here to help

On further investigation, my idea above sucks.

I just noticed this in the dashboard...

[Screenshot: Messy_0-1737559955607.png]

So a downtime metric DOES exist!! And there's even a download button to get the info as a CSV. If I could just download those programmatically, that would be great.

I can't find a relevant API endpoint, though 😞

mlefebvre1
New here
Messy
Here to help

Unfortunately that endpoint doesn't seem to work. I did a quick test from the dev portal and I just get
"Failed to fetch"

At the moment I am using /organizations/{organizationId}/devices/uplinksLossAndLatency
from platform - devices - uplinks.

It returns this:


{
  "networkId": "N_24329156",
  "serial": "Q234-ABCD-5678",
  "uplink": "wan1",
  "ip": "1.2.3.4",
  "timeSeries": [
    {
      "ts": "2019-01-31T18:46:13Z",
      "lossPercent": 5.3,
      "latencyMs": 194.9
    }
  ]
}

Which is nice: I get all the data for every device and WAN link in every network, all in a single call... buuuuut
on checking the data, I noticed it was reporting 0% loss for a network that had been down for 3 hours 😞

So I'm not sure it's reliable.
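
For anyone wanting to record this data going forward, here is a minimal sketch of flattening a response of that shape into CSV rows. It uses only the field names visible in the sample above; the assumption that the endpoint returns a list of such objects, and the helper names `flatten` and `to_csv`, are mine rather than anything from the API docs.

```python
import csv
import io
import json

FIELDS = ["networkId", "serial", "uplink", "ts", "lossPercent", "latencyMs"]

# Sample payload built from the fields shown in the post. Wrapping it in
# a list is an assumption about the real response shape.
SAMPLE = """
[{"networkId": "N_24329156", "serial": "Q234-ABCD-5678", "uplink": "wan1",
  "ip": "1.2.3.4",
  "timeSeries": [{"ts": "2019-01-31T18:46:13Z",
                  "lossPercent": 5.3, "latencyMs": 194.9}]}]
"""


def flatten(entries):
    """Turn one entry per uplink into one row per (uplink, sample)."""
    rows = []
    for entry in entries:
        for sample in entry.get("timeSeries", []):
            rows.append({
                "networkId": entry["networkId"],
                "serial": entry["serial"],
                "uplink": entry["uplink"],
                "ts": sample["ts"],
                "lossPercent": sample.get("lossPercent"),
                "latencyMs": sample.get("latencyMs"),
            })
    return rows


def to_csv(rows):
    """Render the flattened rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = flatten(json.loads(SAMPLE))
print(to_csv(rows))
```

Appending each poll's rows to a file or database would give a history to report on, subject to the reliability caveat above.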

MartinS
Building a reputation

If the device is down, the dashboard can't get the data for the test. It's not safe to assume a device that the Meraki dashboard can't contact is experiencing 100% packet loss on any/all of its uplink tests, so you'll get whatever the last result before contact was lost.

sungod
Kind of a big deal

I use the getOrganizationDevicesAvailabilitiesChangeHistory endpoint; it does work.

It has two separate documentation entries, one of them in Early Access, so I'd guess the endpoint is being tweaked.

 

https://developer.cisco.com/meraki/api-v1/get-organization-devices-availabilities-change-history/

https://developer.cisco.com/meraki/api-v1/api-reference-early-access-api-platform-monitor-devices-av...

 

Using device availability is the option I'd recommend.

 

You do need to periodically grab the status 'now', then you can use the change history to calculate from that point.

 

This gets current status, updated every 5 minutes...

 

https://developer.cisco.com/meraki/api-v1/get-organization-devices-availabilities/
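
As a rough illustration of the calculation (a known status at one point in time, plus the changes that followed), here is a sketch. The `(timestamp, new_status)` event shape, the status strings, and the `offline_seconds` helper are all hypothetical simplifications, not the actual response format of either endpoint.

```python
from datetime import datetime, timezone


def offline_seconds(start, end, status_at_start, changes, up_statuses=("online",)):
    """Sum the seconds within [start, end] spent in a non-up status.

    status_at_start: status recorded by the periodic snapshot at `start`.
    changes: iterable of (timestamp, new_status) pairs from the change
             history; pairs outside the window are ignored.
    up_statuses: which status strings count as "up" (an assumption).
    """
    total = 0.0
    cursor, status = start, status_at_start
    for ts, new_status in sorted(changes):
        if ts < start or ts > end:
            continue
        if status not in up_statuses:
            total += (ts - cursor).total_seconds()
        cursor, status = ts, new_status
    if status not in up_statuses:
        total += (end - cursor).total_seconds()
    return total


def at(hour, minute):
    """Build a UTC timestamp on an arbitrary example day."""
    return datetime(2025, 1, 22, hour, minute, tzinfo=timezone.utc)


# Made-up change history for a single device over one hour.
changes = [
    (at(0, 10), "offline"),
    (at(0, 25), "online"),
    (at(0, 50), "offline"),
]
down = offline_seconds(at(0, 0), at(1, 0), "online", changes)
uptime_pct = 100.0 * (1 - down / 3600)
print(f"offline for {down:.0f} s, uptime {uptime_pct:.2f}%")
```

In this made-up hour the device is offline from 00:10 to 00:25 and again from 00:50 to 01:00, so the function reports 1500 seconds of downtime.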

 

 

If you only want to check site reachability [defined as an MX responding], you might find it simplest to just ping your sites every 60 seconds.
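
A minimal sketch of that ping approach, for illustration only. It assumes a Linux-style `ping` binary that accepts `-c` and `-W`; the host address and the helper names `ping_once`, `poll`, and `uptime_percent` are hypothetical. The probe and sleep functions are injectable so the loop can be exercised without touching the network.

```python
import subprocess
import time
from datetime import datetime, timezone


def ping_once(host, timeout_s=2):
    """Send one ICMP echo; True if the ping command exits with 0.

    Assumes Linux iputils flags (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def poll(hosts, rounds, interval_s=60, probe=ping_once, sleep=time.sleep):
    """Probe every host once per round; return (timestamp, host, up) records."""
    log = []
    for i in range(rounds):
        now = datetime.now(timezone.utc).isoformat()
        for host in hosts:
            log.append((now, host, probe(host)))
        if i < rounds - 1:
            sleep(interval_s)
    return log


def uptime_percent(log, host):
    """Share of successful probes for one host, or None if never probed."""
    results = [up for _, h, up in log if h == host]
    return 100.0 * sum(results) / len(results) if results else None


# Dry run with a fake probe and no real sleeping (198.51.100.1 is a
# documentation address, not a real site).
answers = iter([True, False, True, True])
log = poll(["198.51.100.1"], rounds=4,
           probe=lambda host: next(answers), sleep=lambda s: None)
print(uptime_percent(log, "198.51.100.1"))
```

In real use you would call `poll` with the default probe and persist the records somewhere durable instead of holding them in memory.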
