That results in a parsing and graphing mess in PRTG.
The issue in the last example seems to be the 'timespan' parameter: notice that it returns values for two time periods of 1 minute each. That's consistent with the dashboard, where the uplink 'Historical Data' graph seems to work in 1 minute intervals.
My first thought was that a 90 second timespan will occasionally return two values: one for the past 60 seconds and one for the 60 seconds before that. I have now tried requesting a timespan of 60 seconds instead, but I am seeing the same issue where it often returns nothing. That stops PRTG from plotting a smooth graph similar to the one on the dashboard. Interestingly, the issue applies to both WAN links: they either both return something or both return nothing.
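For reference, the kind of request being described can be sketched like this, using the public Meraki Dashboard API v1 lossAndLatencyHistory endpoint. The API key, device serial, probe IP and uplink name here are placeholders, and stdlib urllib stands in for whatever HTTP client you actually use:

```python
# Sketch of a per-device uplink loss/latency query against the Meraki
# Dashboard API v1. Serial, API key, probe IP and uplink are placeholders.
import json
import urllib.parse
import urllib.request

BASE = "https://api.meraki.com/api/v1"

def loss_latency_url(serial, timespan=60, ip="8.8.8.8", uplink="wan1"):
    """Build the lossAndLatencyHistory URL for one MX uplink."""
    qs = urllib.parse.urlencode({"timespan": timespan, "ip": ip, "uplink": uplink})
    return f"{BASE}/devices/{serial}/lossAndLatencyHistory?{qs}"

def fetch_loss_latency(api_key, serial, **kwargs):
    req = urllib.request.Request(
        loss_latency_url(serial, **kwargs),
        headers={"X-Cisco-Meraki-API-Key": api_key},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        # A 200 with an empty list is exactly the "returns nothing" case.
        return json.load(resp)
```

With timespan=60 this will sometimes come back as an empty list, which is the behaviour in question.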
Nothing else is querying this API so we are definitely not hitting the 5 calls per second limit.
I could play around with a longer timespan or with the t0 and t1 values, but I would like to monitor this in a way where I can spot latency spikes, ideally so it looks similar to the graph on the dashboard.
Does anyone know what pattern this follows?
It seems to clear the values every now and then, eventually replace them with the most current ones, and then add to those.
Some experiments I've done:
-- timespan=300 usually returns 4 values, but sometimes 5.
-- timespan=180 sometimes returns 1 value, other times 3, and occasionally 2, which makes no sense.
-- timespan=180 and above seem to ALWAYS return at least 1 value.
It's actually returning a 200 with an empty table; all I see in Postman is:
Was wondering if someone can test this with a timespan somewhere between 60 and 90 seconds and confirm.
Just before I left the office I tried requesting org-wide MX latency and loss instead of a single device, with a high timespan, and noticed that the timestamps in the responses came back out of order, and were sometimes within 1 second of each other, other times within 10 seconds. I'm trying to figure out whether there's a pattern here or whether the API is queueing/caching data and responding with it in a random fashion.
Looking at the OP's findings, I'd say the variable number of response samples is just a side effect of a too-short sampling window (timespan) for a set of events (60 second duration by default) that aren't synchronised with the sampling window. That's in line with seeing 4 or 5 samples with a 300 second timespan.
If you need consistent 60 second samples, try making requests every 2-3 minutes with t0 = (now - 300) - ((now - 300) mod 60) in seconds (to align t0 to hh:mm:00) and t1 = t0 + 300; add new samples to a list, discarding or overwriting duplicates.
That should allow plotting without gaps; the only cost is that the data lags by a couple of minutes, though you could reduce that by making the API call at a higher rate, say every minute.
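A minimal sketch of that aligned-polling idea: snap t0 back to a minute boundary five minutes in the past, then dedupe samples by their start timestamp. The "startTime" field name is an assumption about the response shape:

```python
# Aligned polling window plus dedup-by-timestamp, as described above.
import time

BUCKET = 60     # assumed sample duration in seconds
LOOKBACK = 300  # window length; tolerates a couple of minutes of lag

def aligned_window(now=None):
    """Return (t0, t1) with t0 aligned to hh:mm:00 and t1 = t0 + LOOKBACK."""
    now = int(time.time()) if now is None else int(now)
    t0 = (now - LOOKBACK) - (now - LOOKBACK) % BUCKET
    return t0, t0 + LOOKBACK

def merge(samples, new_samples, key="startTime"):
    """Overwrite by start timestamp, so repeated buckets keep only the latest copy."""
    for s in new_samples:
        samples[s[key]] = s
    return samples
```

Plot the stored samples sorted by timestamp; buckets that were missing in one poll get filled in by a later one, since successive 300 second windows overlap.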