Downloads of each developer kit?

PhilipDAth
Kind of a big deal

I always have a dilemma when choosing between Python and node.js.

 

I kinda look at Python as something you use when you just want to "hack something out" and you don't care about performance very much.  I tend to use it more for batch-style operations, or things that you want to do in a clearly defined set of steps in a specific order (e.g. provisioning a new network).  In a simple step-by-step world, Python is easy to work with.
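For example, a provisioning run in Python reads top to bottom. A minimal, hypothetical sketch (the org ID, network name and serial are placeholders, not from any real deployment):

import meraki

# Hypothetical step-by-step provisioning sketch; IDs and serials are placeholders.
dashboard = meraki.DashboardAPI(suppress_logging=True)  # API key read from MERAKI_DASHBOARD_API_KEY

org_id = "123456"  # placeholder

# Step 1: create the network
network = dashboard.organizations.createOrganizationNetwork(
    org_id, name="Branch-01", productTypes=["appliance", "switch", "wireless"]
)

# Step 2: claim devices into it
dashboard.networks.claimNetworkDevices(network["id"], serials=["Q2XX-XXXX-XXXX"])

# Step 3: configure VLANs, SSIDs, etc. - each step simply runs in order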

Indeed, node.js often requires additional effort to make things happen in a specific order.

 

I like using node.js either for server-side processing (such as providing an API), or when you have a lot of jobs you can do in parallel that are IO bound, and you don't care about the order they are done in (such as doing a task on every single network).

 

The async support in Python has thrown a bit of a spanner in the works for my thinking.  It gives Python similar capabilities to node.js.  And Meraki gives far more attention and support to Python.  The Python SDK seems to get a lot more love.

Indeed, I think all POST API requests are still broken in the node.js library (I reported that quite some time ago, along with the fix ...), forcing the use of the mega proxy.  The node.js SDK is built on the request library, which is no longer in active development and which people are being encouraged to stop using.  So the node.js SDK is a poor cousin with little to no active development.


Out of curiosity, could someone post the download numbers (for say the last month) for each of the SDKs?  It would be interesting to see what the percentage split is between them.  For example, is the Python SDK making up 90% of the downloads?

15 REPLIES
PhilipDAth
Kind of a big deal

I have a need to download the list of networks for around 200 orgs and store them in a database for subsequent processing.

 

So I wrote just the Meraki side in both Python (using AIO) and node.js.

 

The Python version was the easiest to do by far.  I set the concurrent IO option (maximum_concurrent_requests) to 200.  It took 120s to run.

 

The node.js version took me hours to write, getting the promises all chained together nicely.  However, it took 6s to run, averaging 30 API calls per second.  On a faster machine it would be faster still.

 

So 120s versus 6s.

 

So I guess that kinda confirms my thoughts again.  Python is best if you just want to hack something and don't need to run it again or you don't care about performance.

If you care about performance then use node.js.

I think your conclusion is wrong here. I haven't used Meraki with node.js yet, but I think there are some differences in how the Meraki modules handle the rate limit in Python and node.js.

 

How did you configure the rateLimiter in node.js?

Are you using scopes in node.js?

 

The current async implementation in Python uses a simple concurrent request counter: if there are too many requests open, it will wait 0.3 seconds and check again (maybe we should lower that value).
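Roughly this idea (a simplified sketch, not the actual SDK source):

import asyncio

# Sketch: a shared counter caps concurrent requests; callers poll every
# 0.3 seconds until a slot frees up.
class ConcurrencyCounter:
    def __init__(self, limit, poll_interval=0.3):
        self.limit = limit
        self.poll_interval = poll_interval
        self.active = 0

    async def __aenter__(self):
        while self.active >= self.limit:
            await asyncio.sleep(self.poll_interval)  # wait and check again
        self.active += 1

    async def __aexit__(self, exc_type, exc, tb):
        self.active -= 1

An asyncio.Semaphore would wake waiters immediately instead of polling, which is why that 0.3 second value matters for throughput.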

 

The node.js meraki module has the option of a scope, so the concurrent counter is tracked per scope. I had a similar idea when I implemented the counter. My problem was that I wanted to do it automatically, but you don't have the "orgId" in each request, so I abandoned the idea. I never thought of letting the programmer add it via "scope".

At the moment you could achieve something similar by creating an AsyncDashboardAPI object per organization, roughly as sketched below.
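Something along these lines (a sketch; getOrganizations/getOrganizationNetworks are the real SDK methods, the structure is illustrative):

import asyncio
import meraki.aio

# Sketch: one AsyncDashboardAPI per organization, so each org gets its
# own concurrent-request counter - a poor man's "scope".
async def process_org(org):
    async with meraki.aio.AsyncDashboardAPI(
        suppress_logging=True,
        maximum_concurrent_requests=5,  # per-org limit
    ) as api:
        return await api.organizations.getOrganizationNetworks(org["id"])

async def main():
    async with meraki.aio.AsyncDashboardAPI(suppress_logging=True) as api:
        orgs = await api.organizations.getOrganizations()
    all_networks = await asyncio.gather(*(process_org(org) for org in orgs))

asyncio.run(main())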

 

You could also speed up your dependencies: just run

pip install aiohttp[speedups]

and see if the script runs faster now.

The node.js SDK has no rate limiter.  node.js is a greedy async engine; it starts executing stuff as soon as you request it.  Python is a lazy async engine; it doesn't start processing async requests until you call the event loop.

 

I'm not using scopes in node.js - I don't even know what they are.

 

It's hard to argue with it taking 120s in Python and 6s in node.js ...

I put this down to node.js using a greedy async engine.


@PhilipDAth wrote:

The node.js SDK has no rate limiter.  node.js is a greedy async engine, it starts executing stuff as soon as you request it. 


The meraki module in node.js uses "bottleneck" in the background -> https://www.npmjs.com/package/node-meraki

 


@PhilipDAth wrote:

Python is a lazy async engine, it doesn't start processing async requests until you call the event loop.


That purely depends on how you are using it. With a simple "await" it is like that, yes, but you could also use asyncio.ensure_future instead.
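A minimal sketch of the difference:

import asyncio

async def fetch(i):
    await asyncio.sleep(1)  # stand-in for an API call
    return i

async def lazy():
    # Plain awaits: each call only starts when the previous one finishes.
    for i in range(3):
        await fetch(i)  # ~3 seconds total

async def eager():
    # Scheduled up front: all three run concurrently, like node.js promises.
    tasks = [asyncio.ensure_future(fetch(i)) for i in range(3)]
    await asyncio.gather(*tasks)  # ~1 second total

asyncio.run(lazy())
asyncio.run(eager())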

> the meraki module in nodejs uses "bottleneck" in the background -> https://www.npmjs.com/package/node-meraki

 

The node.js getting started guide:

https://developer.cisco.com/meraki/api/#/node-js/getting-started

references this package instead:

https://www.npmjs.com/package/meraki

So I am hoping this is the official SDK rather than the other one.

 

When I search the GitHub repository, neither the word rateLimiter nor bottleneck appears.

https://github.com/meraki/meraki-node-sdk 

 

So no rate limiting is being done.

Ages ago someone filed an issue on GitHub about the node.js repo having incorrect URLs in it - and I see it still does.

 

It is almost like the node.js SDK has been abandoned by the team.

 

[attached screenshot: 1.PNG]

Oh, variable scopes.

 

Almost everything in my code is locally scoped to the function.  The only global variable I had was the Meraki SDK instantiation.

PS. I've repeatedly had this experience before as well - with node.js being either as fast as or substantially faster than Python when doing IO (pretty much any kind: HTTP, database, etc.).

This is the test Python code I used.  Do you see a way I could speed it up by more than a factor of 10?

 

import asyncio
import json

import meraki.aio

async def processNetworks(aiomeraki: meraki.aio.AsyncDashboardAPI, org):
    try:
        networks = await aiomeraki.organizations.getOrganizationNetworks(org["id"])
        for net in networks:
            print(net["name"])
    except meraki.AsyncAPIError as e:
        print(f"processNetworks: Meraki API error: {e}")
    except Exception as e:
        print(f"processNetworks: error: {e}")

async def main():
    async with meraki.aio.AsyncDashboardAPI(
        base_url="https://api-mp.meraki.com/api/v1",
        suppress_logging=True,
        maximum_concurrent_requests=200
    ) as aiomeraki:
        # Get the list of organizations the API key has access to
        orgs = await aiomeraki.organizations.getOrganizations()
        for org in orgs:
            print(org["id"] + " " + org["name"])

        # Fetch every org's networks concurrently
        orgTasks = [processNetworks(aiomeraki, org) for org in orgs]
        for task in asyncio.as_completed(orgTasks):
            await task

def lambda_handler(event, context):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

    return {
        'statusCode': 200,
        'body': json.dumps('Finished.')
    }

I've found the issue.

 

Believe it or not, it's an nginx configuration issue.

You are not hitting the API rate limit here. Instead you are hitting the rate limiter of the nginx proxy.

With the default configuration the Python SDK will wait 60 seconds before the next retry.

 

I've created 200 test orgs with 10 networks each.

When I set nginx_429_retry_wait_time & retry_4xx_error_wait_time to 1, the whole run took 22 seconds with maximum_concurrent_requests=3 and ~6 seconds with maximum_concurrent_requests=200, using your test methods.

 

@chengineer I know this parameter is set to 60 seconds as protection for the nginx proxy, but do you think it would be possible to decrease the default?

From a script performance perspective it is a lot faster to hit nginx & the shards with as many concurrent requests as possible and just handle the rate limits (nginx: 1 second; shard: whatever is returned in the Retry-After header) than to limit the concurrent requests to 3.

This is the sample code I used for my tests:


import asyncio
import os

import meraki.aio
import meraki as meraki_v1  # assumption: the v1 (1.0.0b) SDK, aliased as used below

api_key = os.environ["MERAKI_DASHBOARD_API_KEY"]  # assumption: key supplied via the environment
WAIT = 1  # retry wait time in seconds; set to 1 for the runs below

async def processNetworks(aiomeraki: meraki_v1.aio.AsyncDashboardAPI, org):
    try:
        networks = await aiomeraki.organizations.getOrganizationNetworks(org["id"])
        netTasks = [aiomeraki.networks.getNetwork(net["id"]) for net in networks]
        for task in asyncio.as_completed(netTasks):
            n = await task
            print(n["name"])
    except meraki_v1.AsyncAPIError as e:
        print(f"processNetworks: Meraki API error: {e}")
    except Exception as e:
        print(f"processNetworks: error: {e}")
    return

async def max10():
    async with meraki_v1.aio.AsyncDashboardAPI(
        api_key=api_key,
        base_url="https://api.meraki.com/api/v1",
        suppress_logging=True,
        maximum_concurrent_requests=10,
        nginx_429_retry_wait_time=WAIT,
        retry_4xx_error_wait_time=WAIT, maximum_retries=200
    ) as aiomeraki:
        # Get list of organizations to which API key has access
        orgs = await aiomeraki.organizations.getOrganizations()
        for org in orgs:
            print(org["id"] + " " + org["name"])

        orgTasks = [processNetworks(aiomeraki, org) for org in orgs]
        for task in asyncio.as_completed(orgTasks):
            await task


The speed results here are very interesting:

 

max3 took 159.34725640000002 seconds
max10 took 65.28876129999999 seconds
max50 took 39.83993439999999 seconds
max200 took 29.977361000000016 seconds

The wait time parameters were set to 1 second on each test run.
The numbers after "max" are the maximum_concurrent_requests values.

 

Also thanks for the tip on the bottleneck module.  I have started using it.  I use one per org I am accessing (so if I am accessing 200 orgs I use 200 bottlenecks).
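The same per-org idea can be sketched in Python with one asyncio.Semaphore per organization (illustrative only; this is not the bottleneck API):

import asyncio
from collections import defaultdict

# Illustrative: each org ID gets its own limiter, so a burst against one
# org can't starve the others.
org_limiters = defaultdict(lambda: asyncio.Semaphore(5))

async def get_networks(aiomeraki, org_id):
    async with org_limiters[org_id]:  # at most 5 requests in flight per org
        return await aiomeraki.organizations.getOrganizationNetworks(org_id)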

 

So now Python is only 5 times slower than node.js ... still a pretty big performance penalty.

Actually it is the same speed as node.js: your Python code also takes about 6 seconds.
I added additional code to get each network; that created the extra seconds (200x10 requests).
chengineer
Meraki Alumni (Retired)

So the request is to change both NGINX_429_RETRY_WAIT_TIME and RETRY_4XX_ERROR_WAIT_TIME to something lower than 60?

Solutions Architect @ Cisco Meraki | API & Developer Ecosystem

Exactly.

Maybe we should also change it to a time range.

E.g. 1 second up to random(concurrent_requests_counter/2)

 

The problem with a fixed value is that you are just postponing the next rate limit.

 

E.g. 100 requests are sent at once.

10 go through -> the other 90 are all retried 60s later -> 10 go through.....

 

So you are hitting the rate limit again and again and again.

 

With a range we would spread the requests over the time period and not hit the rate limit all the time (even with an increased concurrent_request limit).
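A sketch of that jittered wait (illustrative, not the SDK implementation):

import asyncio
import random

# Illustrative: instead of a fixed 60s wait after a 429, pick a random
# wait so retries spread out instead of all landing at once.
async def retry_wait(concurrent_requests_counter):
    upper = max(1.0, concurrent_requests_counter / 2)
    await asyncio.sleep(random.uniform(1.0, upper))  # 1s up to counter/2 seconds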

 

chengineer
Meraki Alumni (Retired)

Thanks for the suggestion, guys. These changes have been added to releases 0.110.1 & 1.0.0b4 as of today.

Solutions Architect @ Cisco Meraki | API & Developer Ecosystem