Error: 429 Client Error: Too Many Requests

Adrian4
Head in the Cloud

Hello,

Recently I have been running into this unexpectedly. If I wait 5 or 10 minutes it goes away, but it kept happening, so I ran the API requests summary endpoint each time it happened and saw thousands of requests with a 429 response.

Since 429 is the too-many-requests error, I assume the first call gets a 429 and then it retries thousands of times immediately afterwards until it gives up, which is what gives me all those 429s?


I tried to dig deeper into where they were coming from....

[screenshot attached: Adrian4_0-1693553890708.png]
All the 429 requests were apparently getNetworkWirelessClientCountHistory?! That doesn't make any sense, as I only ran that once and it was successful.

Also, in those thousands of requests, they seem to alternate between SSID 4 and SSID 14 over and over. Not sure what that is about, as I am not jumping around SSIDs and the one I'm connected to isn't 4 or 14.


Can anyone shed some light?

27 Replies
sungod
Kind of a big deal

I assume you are using the Meraki Python library.

 

Are you telling the library to wait on the rate limit? This is important to avoid retries being made too fast.

Combined with a generous retry limit, this should ensure calls eventually complete OK.

 

        wait_on_rate_limit=True,
        maximum_retries=100

 

For instance...

    async with meraki.aio.AsyncDashboardAPI(
        api_key=API_KEY,
        base_url='https://api.meraki.com/api/v1/',
        print_console=False,
        output_log=False,
        suppress_logging=True,
        wait_on_rate_limit=True,
        maximum_retries=100
    ) as aiomeraki:

 

If you are using some different code to handle 429s, make sure it obeys the Retry-After header in the 429 response; see https://developer.cisco.com/meraki/api-v1/rate-limit/

 

Adrian4
Head in the Cloud

Thank you! To be honest, I had never come across 429 errors until a day or two ago 😛

PhilipDAth
Kind of a big deal

I haven't checked - but are you saying wait_on_rate_limit is not True by default?  I'm gobsmacked.

sungod
Kind of a big deal

Tbh I think it defaults to on, but decades of coding paranoia leads me to always set it explicitly in my scripts 😀

PhilipDAth
Kind of a big deal

I checked, and wait_on_rate_limit is on by default.

https://github.com/meraki/dashboard-api-python/blob/main/meraki/config.py#L19

and maximum_retries defaults to 2.

https://github.com/meraki/dashboard-api-python/blob/main/meraki/config.py#L37 

 

And for anyone finding this using Google, here are all the Meraki Python library defaults:

https://github.com/meraki/dashboard-api-python/blob/main/meraki/config.py 
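
E.g. if you'd rather be explicit than rely on the defaults, something like this when creating a synchronous dashboard object (API_KEY being your own key):

import meraki

dashboard = meraki.DashboardAPI(
    api_key=API_KEY,
    suppress_logging=True,
    wait_on_rate_limit=True,  # already the default
    maximum_retries=100       # default is only 2
)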

Adrian4
Head in the Cloud

The problem has returned 😞

Every so often I am getting 429 errors, and when I check the summary I see this: "429": 1307

So two things. One, when it gets a 429, it is clearly retrying over a thousand times in a tiny time frame, so I'm not sure I am using these variables properly:

wait_on_rate_limit=True,
maximum_retries=100

Am I supposed to pass them as part of the API headers or something?

Secondly, I must still be doing something to trigger the 429 rate limit in the first place. The limit is 5 requests per second, I think? After each and every request call I have put in a 0.3 second delay, which should mean I can't be sending more than 4 a second, so how is this happening?

To produce the delay I am using the time module and setting a global variable of

DELAY = 0.3

then after every request I put the line

time.sleep(DELAY)

 

# e.g. fetch the networks list, then pause before the next call
networks = requests.get(networks_url, headers=headers, verify=f"{root_path}/Cisco Umbrella Root CA.crt")
time.sleep(DELAY)

sungod
Kind of a big deal

Are you using the Meraki Python library?

 

 

Adrian4
Head in the Cloud

I'm using the "requests" HTTP library.

sungod
Kind of a big deal

Then you are not using the Meraki Python library.

 

These set the behaviour of the library; if you are not using it, they will not help you.

 

wait_on_rate_limit=True,
maximum_retries=100

 

I recommend using the library.

 

If you don't want to use it, you will need to do as suggested above and make sure your code obeys the Retry-After header in the 429 response; see https://developer.cisco.com/meraki/api-v1/rate-limit/

 

For instance, based on the example on the linked page...

response = requests.request("GET", url, headers=headers)

if response.status_code == 200:
    # Success logic
    pass
elif response.status_code == 429:
    # Back off for as long as the API asks before sending again
    time.sleep(int(response.headers["Retry-After"]))
else:
    # Handle other response codes
    pass
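
In practice you would wrap that in a loop so the call is actually retried after the back-off, e.g. something like this (the helper name and retry cap are just for illustration):

import time
import requests

def get_with_retry(url, headers, max_retries=10):
    # retry a GET, backing off for however long the 429 response asks
    for _ in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        time.sleep(int(response.headers.get("Retry-After", 1)))
    return response
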
Adrian4
Head in the Cloud

Hi,

Thanks for the reply! Makes sense.

However, the 429 error means I've already hit the rate limit, so adding a sleep timer to that code will reduce the amount of time I have to wait before I can send again (by preventing the retry spam), but I still have the issue of hitting the limit in the first place 😞

sungod
Kind of a big deal

The rate limit is just a fact of life, but if you use the back-off and retry mechanism your calls will eventually succeed.

 

For example, I have scripts using the Meraki Python library aio functions that could potentially make thousands*** of calls 'at once'. The rate limit and retry handling kicks in, things back off and retry, and throughput settles at the rate limit; the library handles it all for me.

 

The alternative is to set timers and not issue calls any faster than the limit, but that is unreliable: someone else may make calls on the org at the same time as you, and then rate limiting can still occur.

 

For some purposes you can use action batches, giving higher throughput...

https://developer.cisco.com/meraki/api-v1/action-batches-overview/
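
For example, something roughly like this with the synchronous library (dashboard, org_id and net_id are assumed to exist already; the resource path and body are just placeholders, check the linked overview for the exact format):

# sketch only: bundle several update operations into one action batch
batch = dashboard.organizations.createOrganizationActionBatch(
    org_id,
    actions=[
        {
            'resource': f'/networks/{net_id}/appliance/vlans/10',
            'operation': 'update',
            'body': {'name': 'Voice'}
        }
    ],
    confirmed=True,
    synchronous=False
)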

 

 

***if I am doing something that will generate high call volumes, I generally split it into smaller chunks to avoid just hammering on the org.

RaresPauna
Getting noticed

I am using the Meraki library, with a semaphore of 10 concurrent workers on top. Even so, when I check the number of calls, the 429s are in the hundreds. I don't know how to stop it. Once it starts getting them, it seems it never stops.

ls08
Here to help

Put your code in ChatGPT or any other AI and ask it to help the script continue through 429 errors. It can adjust the script to add some back-off timers in between calls. Keep trying until you have a script running without any interruptions. It will still error out and give you 429s, but the script will continue to run until completed.

I've done this with the Meraki and other APIs.

RaresPauna
Getting noticed

    async def getNetworkAppliancevlansMeraki(self, id):
        async with self.semaphore:
            try:
                async with meraki.aio.AsyncDashboardAPI(self.api_key, self.base_url, output_log=False, print_console=True, suppress_logging=True, wait_on_rate_limit=True, maximum_retries=10) as aiomeraki:
                    vlans = await aiomeraki.appliance.getNetworkApplianceVlans(id)
                    return vlans

            except meraki.AsyncAPIError as e:
                if e.status == 429:
                    retry_after = int(e.response_headers.get('Retry-After', 1))
                    log.info(f"Rate limit exceeded. Retrying after {retry_after} seconds.")
                    await asyncio.sleep(retry_after)
                else:
                    vlans = []
                    log.info(f"Couldn't get VLANs, error: {e}")
                    return vlans

Even if I see hundreds of 429 calls, my exception is never raised.
sungod
Kind of a big deal

The Meraki Python library call does the wait on rate limit itself.

 

You specify maximum retries of 10, so as long as the library call needs fewer retries than that, you will not get a 429 exception.

 

I.e. ideally you should not need to do the extra wait/retry in your exception handler.

 

At a high enough rate, I think the 429 mechanism breaks down, so if I know I'll be exceeding the rate limit I tend to add code to limit concurrent calls and/or add some additional back-off/retry that increases retry time exponentially.
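
For example, a rough sketch of that kind of extra back-off wrapper (the names and limits are just illustrative, it is not part of the library):

import asyncio
import random
import meraki

async def call_with_backoff(coro_factory, max_attempts=6):
    # retry an aio call with exponentially increasing delays (plus jitter)
    # on top of whatever the library itself already does
    for attempt in range(max_attempts):
        try:
            return await coro_factory()
        except meraki.AsyncAPIError as e:
            if e.status != 429 or attempt == max_attempts - 1:
                raise
            await asyncio.sleep((2 ** attempt) + random.random())

# e.g. vlans = await call_with_backoff(lambda: aiomeraki.appliance.getNetworkApplianceVlans(net_id))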

 

Oren
Meraki Employee

Your script may be using 10 calls per second, but there might be other integrations/scripts consuming more calls at the same time, resulting in the organization exceeding the rate limit.
You can use getOrganizationApiRequests to examine who/what else is making API calls to the organization.
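
For example, something along these lines with the synchronous library (org_id assumed known; double-check the endpoint docs for the exact field names):

from collections import Counter

# pull the last hour of API request logs and see which callers/paths got 429s
log_entries = dashboard.organizations.getOrganizationApiRequests(
    org_id, total_pages='all', timespan=3600
)
offenders = Counter(
    (entry.get('userAgent'), entry.get('path'))
    for entry in log_entries
    if entry.get('responseCode') == 429
)
print(offenders.most_common(10))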

RaresPauna
Getting noticed

Thanks for the input, that's how I'm already checking the number of calls by status code. All of them come from my script, which generates roughly 500 calls (200s + 429s) for an organization with 120 networks, when the number of calls should only be 120. My guess is that the Meraki library parameters wait_on_rate_limit and maximum_retries are causing many unnecessary duplicates. Could the problem be that I initialize meraki.aio for every call, and each instance is unaware of the other instances doing the same thing?

sungod
Kind of a big deal

Initialising aio per-call sounds likely to cause problems.

 

This sort of thing is how I generally do concurrent calls with aio, here on a list of networks; each instance of the called function then does a request on one network.

 

# process eligible networks concurrently
networkTasks = [processNetworkAppliances(aiomeraki, net) for net in applist]
for task in asyncio.as_completed(networkTasks):
    await task
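
processNetworkAppliances isn't shown above, but for illustration it would be something along these lines, one request per network (assuming each net dict has an 'id'):

async def processNetworkAppliances(aiomeraki, net):
    # one API call per network; wait_on_rate_limit / maximum_retries
    # on the aiomeraki object handle any 429s behind the scenes
    vlans = await aiomeraki.appliance.getNetworkApplianceVlans(net['id'])
    return vlans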

 

This and all other aio stuff happens inside a single async block...

 

async def main():

    # any set-up stuff here

    async with meraki.aio.AsyncDashboardAPI(
        api_key=API_KEY,
        base_url='https://api.meraki.com/api/v1/',
        print_console=False,
        output_log=False,
        suppress_logging=True,
        wait_on_rate_limit=True,
        maximum_retries=100
    ) as aiomeraki:

        # everything using aio goes in here
        ...

    # aio stuff is finished, do whatever happens next, if anything
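
The coroutine is then kicked off in the usual way (with asyncio imported at the top of the script):

if __name__ == '__main__':
    asyncio.run(main())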

 

RaresPauna
Getting noticed

I don't see any kind of semaphore in your script. Using asyncio.as_completed should start all of them at once, so if your organization has 30 networks you start 30 workers and hit the rate limit. Am I wrong with this logic?

sungod
Kind of a big deal

Never bothered to use semaphores. The wait/retry is generally enough.

 

But as I said in another post, if I know there'll be large numbers of concurrent calls, I just break the list into smaller chunks to avoid excessive load on the service.

 

It's a simple method and works at scale.

PhilipDAth
Kind of a big deal

That is not how it works. The Meraki asyncio library internally uses a rate limiter of 10. It will submit no more than 10 API calls at a time.

ls08
Here to help

I really wouldn't suggest changing the code to anyone. requests with Retry-After will give you identical results. I've built all my Python code using the Meraki library, and when I tried converting it to requests with the if statement I still got the same 429 error, with a Retry-After of 59 or 60 seconds, nothing greater or less, and I had a max retry of 10. I thought the Retry-After would give off random seconds, but it was consistently 59 or 60.

 

Because of this result, I'm adding time.sleep() with a random integer in between API calls and using this setting for my dashboard:

dashboard = meraki.DashboardAPI(API_KEY, suppress_logging=True,wait_on_rate_limit=True,maximum_retries=10)
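
i.e. something like this between calls (the range is arbitrary):

import random
import time

# pause for a random 1-5 seconds before the next dashboard call
time.sleep(random.randint(1, 5))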

 

 

Supposedly Meraki is looking into increasing the API call limit to 50, but this was in early alpha stages when I asked my rep.

Oren
Meraki Employee

Question to the group. When you hit 429, how do you investigate/troubleshoot/workaround it?

PhilipDAth
Kind of a big deal

I'm going to start a new reply. The Meraki asyncio library uses a rate limiter by default. The way this works is that it submits all 10 API calls immediately, and then as one request finishes it dispatches another.

 

When I have high levels of concurrency I have had more luck using a Throttler.  If you have 10 API requests pending, it spaces them out in 100ms intervals.  I use code like:

 

import asyncio, meraki.aio, throttler
...
async with meraki.aio.AsyncDashboardAPI(
    output_log=False,
    print_console=False,
    maximum_retries=100,
    wait_on_rate_limit=True
) as dashboard:
    # swap the SDK's internal concurrency semaphore for a throttler
    dashboard._session._concurrent_requests_semaphore = throttler.Throttler(rate_limit=4, period=1.0)
    ...
RaresPauna
Getting noticed

I had absolutely no idea about ._session._concurrent_requests_semaphore. So this by default submits 10 calls?

Where did you find it?

Thanks

PhilipDAth
Kind of a big deal

How did I find it?

 

Something I would have been doing would have been broken. I go and look at the source code for the Meraki SDK to understand how it works. I then consult Google. I then try 90 different approaches and fail. I then get it working and spend the next 10 iterations tidying it up so it works nicely.

 

That's how I found it.
