MV API rather useless for mass use in retail analytics: Please implement this
I don't know how important this is to you guys out there, but from the talks I've had, I feel that many people are rather disappointed about what you can actually achieve with the MV cameras when it comes to people counting and people detection.
if you don't want to read all the complaining stuff, just skip the next block 🙂
---start of complaining---
Meraki continues to praise the analytics functions of the cameras. The truth, from my point of view, is that the analytics features are rather useless, especially for retailers. Don't get me wrong, I am not here just to vent my frustration. I write this post in the hope that some folks at Meraki @MeredithW @Saralyn read this and start thinking about my suggestions.
I work in retail and I have spoken to many other customers who do so as well. Everyone I spoke to stated that they are not going to replace their existing security cameras with ones from Meraki. BUT IF the Meraki ones offered a significant additional benefit in retail analytics, especially in people counting and people analytics, they would instantly go for it.
I was at a Meraki roadshow event in November 2019, where Meraki showed off some new MV model using a 3rd-party dashboard that had counters for people in/out by gender. When we tried to implement this afterwards, it turned out that gender detection simply doesn't work.
Then I went to a talk with Todd Nightingale and asked him whether Meraki is going to implement more analytics features into the camera software itself instead of leaving all the analytics up to the customers or 3rd parties. He stated that Meraki is never going to do so, because it's not legal and he would go to jail. I don't know how Axis does it, but anyway...
He recommended using the APIs to do the analytics ourselves, or using 3rd-party software.
Ok, wow, "challenge accepted" was my first thought, and so I started to write my own software. We have about 90 retail stores, and we wanted to achieve these goals for a proof of concept (and these are the goals EVERY retailer wants to achieve):
count the INCOMING customers
detect gender of customers
count OUTGOING customers
calculate, how long a customer stayed in the store
After some very brief research I ran into these problems:
count the incoming customers
would be possible by analyzing the camera's MQTT feed and calculating a person's direction of movement
if the MQTT receiver or the network goes down for some reason, you lose data for that time period
detect gender of customers
my approach was to analyze the MQTT feed, export all the individual frames containing incoming persons, and pass them to AWS or Azure face recognition (this worked well in manual tests)
you have to make a separate API call for every frame. The camera then exports each frame at maximum quality and uploads it to AWS storage, from where you can download it again for further processing
We have 90 stores with >1000 customers a day.
you will instantly run into the API call limit per second/minute
a waste of bandwidth, as I would only need the exact person rectangle from the MQTT feed
not to mention the privacy aspect of storing images of persons on AWS, where they are downloadable by everyone. Yes, the link is obfuscated, but security by obscurity is no security.
duration of stay
simply impossible, because the same person is a new person to the camera every time they enter a camera's field of view
Axis computes some kind of hash per person, so you're able to follow the same person across multiple cameras
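To make the first goal concrete: the direction-based counting from the MQTT feed could be sketched roughly like this. Everything here is an assumption about the feed — the field names (`oid` for the tracked object ID, `x0`/`x1` for the bounding box) merely resemble the MV detection messages and would need to be adjusted to the real payload schema.

```python
# Sketch: count entries/exits by tracking each detected person's centroid
# across a virtual vertical line. Field names (oid, x0, x1) are assumptions
# about the MQTT detection payload, not the confirmed schema.

LINE_X = 0.5  # normalized x-position of the counting line at the door

def count_crossings(detections):
    """detections: time-ordered iterable of dicts like
    {"oid": "12", "x0": 0.1, "x1": 0.2}. Returns (entries, exits)."""
    last_side = {}
    entries = exits = 0
    for det in detections:
        center = (det["x0"] + det["x1"]) / 2
        side = "left" if center < LINE_X else "right"
        prev = last_side.get(det["oid"])
        if prev == "left" and side == "right":
            entries += 1   # moved left -> right across the line: walked in
        elif prev == "right" and side == "left":
            exits += 1     # moved right -> left across the line: walked out
        last_side[det["oid"]] = side
    return entries, exits
```

This also shows the fragility mentioned above: the state lives entirely in the receiver, so any gap in the MQTT stream silently loses counts.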
So to conclude: I really don't believe it would be a big deal for Meraki to implement the following features, but it would be a HUGE deal for your retail customers:
--- end of complaining---
implement an API call where you can request the camera to export multiple frames instead of just a single frame per call
it should be possible to define multiple needed rectangles per requested frame, so that only these will get exported
the requested frame rectangles should be stored on the camera instead of being uploaded directly to AWS (or at least let the caller choose)
the requested frames should be bundled into an archive instead of exporting each one separately
make the API call async and return a job ID or status ID. The caller then polls and downloads the export from the camera once it's finished.
implement an API call to delete finished export archives manually before they expire automatically
if it's not possible to store the exports on the camera, encrypt them with a caller-provided password and THEN upload them to AWS
stream the MQTT feed to the dashboard and make it available for download for the last n days
or store it directly on the camera. The data is plain ASCII with a possible compression rate of ~95%
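To illustrate the proposed workflow (none of these endpoints exist today — this is purely a mock of what the feature request would look like from the caller's side): submit a batch of frames with crop rectangles, get back a job ID, poll until the export is complete, download one archive, then delete it.

```python
# Purely illustrative in-memory mock of the *proposed* async batch-export
# API. Every method name and payload shape here is hypothetical.
import uuid

class FakeCameraExportAPI:
    """Stand-in for the hypothetical on-camera export service."""
    def __init__(self):
        self.jobs = {}

    def create_export_job(self, frames):
        # frames: [{"ts": ..., "rects": [(x0, y0, x1, y1), ...]}, ...]
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {"status": "pending", "frames": frames}
        return job_id  # caller polls with this instead of blocking

    def poll(self, job_id):
        job = self.jobs[job_id]
        job["status"] = "complete"  # a real camera would report progress
        return job["status"]

    def download_archive(self, job_id):
        job = self.jobs[job_id]
        assert job["status"] == "complete"
        # would return one archive containing only the requested crops
        return b"ZIP:" + str(len(job["frames"])).encode()

    def delete_archive(self, job_id):
        # manual cleanup before the automatic expiry
        del self.jobs[job_id]
```

The point of the mock is the shape of the interaction: one call for N frames instead of N calls, and the bulky image data only moves once, when the caller fetches the finished archive.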
I think many retailers and 3rd-party analytics developers would really appreciate these features.
I also wrote these ideas to some folks at Cisco and Meraki directly last November, but I never got any update on this...
I typically have a camera focused tightly on the entrance and exit doors for people counting; these days we use the MV22X. It produces nice high-quality images that can also be used for security mug shots. We tend to use MV12Ns for monitoring specific areas in stores (to determine the number of people going to different places inside the store). We use MV32s for determining people's movements, but not typically for analytics.
I use an AWS Lambda script that downloads the data daily into our database. We are looking at moving to hourly so we can do live-updating dashboards.
To process gender and emotional state as well as number plates I have used MQTT. I have made that system freely available:
Note that I don't actually store the frame. I simply retrieve it in memory and then pass it directly to the end service (such as Amazon AWS Rekognition or the number plate recognition system). The script does include an option you can enable to store the frame locally if you like - but there is no requirement for this. Most of my customers don't store the frame (creates legal issues around privacy).
Note that I use node.js for this kind of processing because it is 100% async.
To calculate dwell time we use MRs and location analytics. We process this using an Amazon AWS Lambda script as well (in fact, we are 100% serverless).
I have a customer with their own data analytics team, and they also use both of the above but also integrate their POS data, advertising spend, weather data, and a few other bits and pieces for forecasting as well as comparing how things went compared to expectations.
The API limit - I 100% agree on this one. This is a major impediment. To get around this I create a separate organisation for every store for just the MV cameras. This gets you another 5 API calls per second and every org comes with 10 free Analytics licences. Funnily enough, Meraki has done themselves out of money because of the API limit. The last retailer I worked with had budgeted to buy the MV Analytics licences, but because of the API limit, we had to use separate orgs, removing the need to buy the licences.
I have griped to Meraki and the API team about this many times.
I sometimes run at maybe 200 concurrent API calls a second. I just wish Meraki wouldn't make me use separate orgs to get the job done. I'd much prefer to use one org and separate networks.
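Staying under a per-org budget like this is usually handled with a token bucket. A minimal sketch, assuming a limit of 5 calls per second per organisation (the figure mentioned above — verify against the current dashboard limits before relying on it):

```python
# Sketch: per-organisation token bucket so one process can throttle its
# dashboard calls to an assumed 5-per-second limit for each org.
import time
from collections import defaultdict

class OrgRateLimiter:
    def __init__(self, rate=5.0, burst=5):
        self.rate, self.burst = rate, burst
        self.tokens = defaultdict(lambda: float(burst))  # tokens per org
        self.updated = {}

    def acquire(self, org_id, now=None):
        """Return True if a call for org_id may go out now.
        Tokens refill continuously at `rate` per second, capped at `burst`."""
        now = time.monotonic() if now is None else now
        last = self.updated.get(org_id, now)
        self.tokens[org_id] = min(
            self.burst, self.tokens[org_id] + (now - last) * self.rate)
        self.updated[org_id] = now
        if self.tokens[org_id] >= 1.0:
            self.tokens[org_id] -= 1.0
            return True
        return False  # caller should back off and retry
```

Splitting stores into separate orgs, as described above, effectively multiplies the number of buckets — which is exactly why it works, and exactly why it's an awkward workaround.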
How do you utilize MQTT to detect gender? As the feed only gives you metadata, you need to export the image data anyway, right?
After I ran into the call-limit issue, my 2nd approach was to grab the frames directly from the camera. For that, I tried to grab the video snippets that contain the needed frames and then export all the individual frames I need with ffmpeg.
But all of that is not official API, and I had to do lots of reverse engineering of the dashboard<->camera communication for it. As it was a major pain in the a**, I have stopped working on it for now...
I subscribe to the MQTT messages and then wait till it says people are detected. I then use the reported time of that detection in the MQTT message and send a request to the snapshot API and ask for the frame at that time.
You tend to get bursts of people detected messages, so I limit my system to only getting one every 500ms at most.
I then retrieve that frame into memory and post it to the Amazon AWS Rekognition API, and get back the age, gender and emotional state.
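The flow described in the last three posts could be sketched like this (the MQTT subscription itself is left out). The snapshot endpoint shown follows the Dashboard API v1 camera snapshot call, and the Rekognition call uses the standard `detect_faces` operation; the 500 ms debounce matches the burst handling described above. Treat the exact request/response shapes as assumptions to verify against the current API docs, and note the real calls need credentials and a live camera.

```python
# Hedged sketch: throttle people-detection events to one snapshot per
# 500 ms, fetch the frame at the detection time via the dashboard
# snapshot endpoint, and pass the bytes straight to Rekognition
# (nothing written to disk).
import json
import urllib.request

MIN_INTERVAL = 0.5  # seconds between snapshot requests (debounce bursts)

def should_fetch(last_fetch_ts, now, min_interval=MIN_INTERVAL):
    """Detections arrive in bursts; only fetch a frame if at least
    min_interval seconds have passed since the last fetch."""
    return (now - last_fetch_ts) >= min_interval

def request_snapshot(api_key, serial, timestamp_iso):
    """Ask the dashboard for the frame at the detection time.
    Returns the temporary URL the frame can be fetched from."""
    req = urllib.request.Request(
        f"https://api.meraki.com/api/v1/devices/{serial}/camera/generateSnapshot",
        data=json.dumps({"timestamp": timestamp_iso}).encode(),
        headers={"X-Cisco-Meraki-API-Key": api_key,
                 "Content-Type": "application/json"},
        method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["url"]

def analyze_faces(rekognition_client, frame_bytes):
    """Send the in-memory frame to Rekognition (boto3 client).
    Returns (age range, gender, top emotion) per detected face."""
    resp = rekognition_client.detect_faces(
        Image={"Bytes": frame_bytes}, Attributes=["ALL"])
    return [(f["AgeRange"], f["Gender"]["Value"], f["Emotions"][0]["Type"])
            for f in resp["FaceDetails"]]
```

Keeping the frame in memory and handing it directly to the downstream service, as described above, avoids both the storage cost and most of the privacy exposure of persisting images.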
In our use case we wanted to batch-process the whole past day at night.
We recorded the MQTT feed throughout the day and then aggregated it after store closure to generate the list of frames that needed to be analyzed and to count the persons in/out.
We wanted to go this way because a person is typically in the FoV (and the feed) for multiple frames, and we only wanted to gender-analyze the one with the highest confidence level. So we collected all the frames a person appears in and picked the one with the highest confidence (there's a separate call for this on Azure). Then we used only that frame for the gender (and age) recognition.
But for that to work in real life, we would need to call the Meraki API tens of thousands of times (for a single store), which is just impossible. Even if there were no call limit, we wouldn't get it done overnight 🙂
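The aggregation step itself is cheap — it's the frame exports that blow the budget. A sketch of the "one best frame per person" selection, with field names that are guesses about the recorded feed:

```python
# Sketch of the nightly aggregation: group the day's recorded detections
# by tracked person ID and keep only the single frame with the highest
# detection confidence, so each person costs exactly one image-analysis
# call. Field names (oid, ts, confidence) are assumptions.

def best_frame_per_person(detections):
    """detections: iterable of dicts like
    {"oid": "12", "ts": 1700000000, "confidence": 0.87}.
    Returns {oid: detection-with-max-confidence}."""
    best = {}
    for det in detections:
        cur = best.get(det["oid"])
        if cur is None or det["confidence"] > cur["confidence"]:
            best[det["oid"]] = det
    return best
```

Even with this reduction to one frame per person, each selected frame still needs its own snapshot API call — which is the wall described above.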