If you use the async IO version of the library to run the API call on all devices 'at once' (not really at once, there's rate limiting), you should see a significant speed-up.
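As a rough sketch of the pattern (not the library's actual API): the idea is to start the per-device calls concurrently and cap how many are in flight, so you stay under the rate limit. Here `fetch_device_stats`, the serial numbers and `max_concurrent` are made-up placeholders; in a real script the placeholder would be replaced by whatever awaitable call the async client gives you.

```python
import asyncio


async def fetch_device_stats(serial: str) -> dict:
    """Hypothetical stand-in for one async API call per device."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"serial": serial, "ok": True}


async def fetch_all(serials, max_concurrent=5):
    """Run the per-device call for every serial, a few at a time."""
    # The semaphore caps the number of calls in flight at once.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def limited(serial):
        async with semaphore:
            return await fetch_device_stats(serial)

    # gather() returns results in the same order as the input serials.
    return await asyncio.gather(*(limited(s) for s in serials))


if __name__ == "__main__":
    serials = [f"XXXX-{i:04d}" for i in range(20)]  # made-up serials
    results = asyncio.run(fetch_all(serials))
    print(len(results), "devices fetched")
```

The cap of 5 is an arbitrary value for illustration; tune it to whatever your organization's rate limit tolerates.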
I've got scripts getting stats this way across hundreds of networks and thousands of Meraki devices, and getting end-user device data usage for tens of thousands of clients.
It's way faster than iterating through them one by one. I do wrap the calls in some extra detect/back-off/retry, as the built-in rate limit detection and retry is not always enough.
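The extra detect/back-off/retry wrapper could look something like this. It's only a minimal sketch, not my actual code: `RateLimitError` is a hypothetical exception standing in for however your client reports an HTTP 429, and `call_with_backoff`, the retry count and the delays are illustrative assumptions.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Hypothetical exception standing in for an HTTP 429 response."""


async def call_with_backoff(make_call, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Await make_call(), retrying with exponential back-off on rate limiting.

    make_call is a zero-argument function returning a fresh awaitable
    each time, so the call can be re-issued after a failure.
    """
    for attempt in range(max_retries + 1):
        try:
            return await make_call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, let the caller see the error
            # Double the wait on each attempt, capped at max_delay,
            # plus random jitter so parallel tasks don't retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
```

You'd wrap each call like `await call_with_backoff(lambda: some_async_call(serial))`, where `some_async_call` is whatever coroutine function you're calling per device.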
Bear in mind that Python itself is not the fastest - the same scripts run a lot faster on a powerful server than on the MacBook Pro I develop on, even though the server has much slower Internet access. So another option is to develop using something that'll go faster (but maybe lacking the ease of Python).