Sensor saved the datacentre - again!

cmr
Kind of a big deal

This morning, for an as-yet-unknown reason, both of the independent air conditioning systems in our datacentre failed within a short time. The Meraki MT14 was the first to alert us, which gave the engineers time to get them back online before the servers etc. overheated.

Thank you Meraki (again)!

 

Now if only Meraki made AC units... 😎

21 Replies
Inderdeep
Kind of a big deal

Awesome!

Regards/Inder
Cisco IT Blogs awarded in 2020 & 2021
www.thenetworkdna.com
GIdenJoe
Kind of a big deal

Very cool.
Would be funny to use your MV in there to view your server melting away 😉

cmr
Kind of a big deal

MV is there; it acts as the gateway for the MT. Recovery wasn't as simple as I'd originally thought: the primary AC units all failed and we have a plethora of temporary units in place. Ambient temperatures went up to 46°C, with noise at 91 dBA, and though the temperature is now down to 18°C, we have had up to 79 percent relative humidity. Luckily nothing has failed so far (other than the AC units)...

Happy (UK) Father's Day!

BlakeRichardson
Kind of a big deal

Great story. I'd be interested in seeing a graph of the temperature increase over time; it always amazes me how quickly the temperature can climb in a data centre.

Were you onsite at the time, and were you able to restart the AC, or how did you remedy the situation?

Obviously data centres are sensitive, so if you can't share, that's not a problem.

cmr
Kind of a big deal

Here it is, and no, I wasn't onsite until later, when I brought additional temporary AC...

[Graph: datacentre temperature over time (cmr_0-1687124090472.png)]

You can see one set of AC fail at about 6am and the other fail at 11am or so. Some restarts and then bang, both gone for good! At about 4pm the best temp AC arrives, then more and more.

BlakeRichardson
Kind of a big deal

@cmr Thanks for sharing. 46°C is hot, talk about working under pressure! Heat, noise and trying to keep servers online.

Hopefully nothing was damaged and no one was hurt. This will be a good learning exercise, and I am sure you are already thinking about ways to mitigate this in the future.

As is often said, failure is the best way to learn.

cmr
Kind of a big deal

Thanks @BlakeRichardson, so far so good, and indeed, deep dive time...

PhilipDAth
Kind of a big deal

It feels good hearing stories like this.

MerryAki
Building a reputation

That's the use case for monitoring like this. Some time ago we had a similar scenario, which we then resolved using on-prem monitoring (checkmk/PRTG).
But still, it's important to check it regularly and to set thresholds.

Madhan_kumar_G
Getting noticed

WoW

cmr
Kind of a big deal

Another day and another MT saving a site's main patch room and internet streaming service.  Just ordered another 8x MTs to increase coverage as our AC units seem to be 'sub-optimal'... 🙄

Angela1
Meraki Employee

@cmr That's great news! "with noise at 91dBA..." I wonder how you use the ambient noise metric to diagnose issues in your data center? Seeing as temperature and humidity are the main / most direct metrics. Is ambient noise something you typically look at in your data centers as well? @BlakeRichardson as well given your above post

cmr
Kind of a big deal

@Angela1 it was more of a comment explaining how loud it got with all fans at 100%. It might make sense to set an alarm at ~80 dBA though, as standard levels are consistently about 76 dBA.

BlakeRichardson
Kind of a big deal

@Angela1 ambient noise is hard. We only have one MT14, and the small server room it's located in averages 63 dBA, which means I've had to set up the alert profile to only alert at levels above 70 dBA. There are only servers and switches in the room, but they are noisy.

Thanks Blake! Sounds like noise is not a key indicator for you. Have you been able to take any meaningful action based on the 70 dBA alert you've set?

Yes, 70 dBA is a relatively safe level to work in, so it doesn't require workers to wear hearing protection or minimise time spent in the space, which is useful to know for employee safety.

MerryAki
Building a reputation

I think I need another MT in an office space. I came to one of our branches today and the A/C was not sufficient. Next time I would check the dashboard to see what's going on. Today it was about 30°C / 86°F both outside and inside.

Unfortunately the MT14 which I would like to use in the office is not available. 

MT14 reports on temperature as well as humidity, ambient noise, TVOC, and PM2.5, so it'd definitely be able to help you with that!

PhilipDAth
Kind of a big deal

A special note around PM2.5 - you'll want to power the MT14 via USB-C to get full functionality.

SC-12
Meraki Employee

This is so great to hear. Anyone else have stories of where MT has saved the day? I'm always expecting mine to catch a water leak somewhere.

K2_Josh
Building a reputation

It would also be nice if Meraki could alert on switches (and firewalls!) with suddenly elevated temperatures before they reach whatever is considered 'critical'. Yes, these devices may run hot normally, but Meraki could let users set the threshold for when to send a temperature warning and when to clear the alert.

https://community.meraki.com/t5/Security-SD-WAN/MX-Temperature/m-p/81273
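
To illustrate the warn/clear idea, here is a minimal sketch in Python of a two-threshold (hysteresis) alert. The class name, the threshold values and the sample readings are all hypothetical, and this is not how the Meraki dashboard or API behaves; it is just one way the requested behaviour could look.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HysteresisAlert:
    """Hypothetical two-threshold alert: warn high, clear lower."""

    warn_at_c: float   # raise the alert at or above this temperature
    clear_at_c: float  # clear it only once readings drop to or below this
    active: bool = False

    def update(self, temp_c: float) -> Optional[str]:
        """Return "raised", "cleared", or None if the state did not change."""
        if not self.active and temp_c >= self.warn_at_c:
            self.active = True
            return "raised"
        if self.active and temp_c <= self.clear_at_c:
            self.active = False
            return "cleared"
        return None


# Made-up readings in degrees C, purely for illustration.
alert = HysteresisAlert(warn_at_c=60.0, clear_at_c=55.0)
events = [alert.update(t) for t in [50, 58, 61, 59, 57, 54, 56]]
print(events)
```

Keeping the clear threshold below the warning threshold means a device hovering around the warning level does not raise and clear the alert on every reading.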
