Share your network troubleshooting stories. Win swag!

CarolineS
Community Manager


As an IT professional, you probably spend a lot of time troubleshooting end-user performance issues. CEO can’t get on the network? Applications running slowly? Uh oh…

 

THIS MONTH’S CHALLENGE: Tell us about a time you used the tools in your Meraki dashboard to troubleshoot end-user performance issues. Pictures and humorous anecdotes are encouraged!

 

UPDATE: Congratulations to the winners, @geoffbernard and @PacketHauler! Read the announcement of the winners.

 

How to enter

Tell us your tales of network woes and troubleshooting in a comment on this blog post before 11am PST on Thursday (February 22).  (UPDATE: Submissions are closed!)

 

How to win

Voting begins at 11am PST on Thursday (February 22), and lasts until 11am PST the following Tuesday (February 27). (UPDATE: VOTING IS CLOSED! Say congrats to the winners!)

 

We will be selecting 2 winners:

  1. The Community Favorite: chosen by you, our Community members. Cast your vote by giving kudos to your favorite entries. The entry with the most kudos from community members who aren't Meraki employees will win!
  2. The Meraki Favorite: the entry with the most kudos from Meraki employees will win the Meraki Favorite prize. Feel free to solicit your contacts at Meraki to vote for you. 😉

Good Luck!

 

 

 

The Fine Print

  • Limit one entry per community member per Community Challenge contest.
  • Submission period: Tuesday, February 20, 2018 at 11am PST through Thursday, February 22, 2018 at 10:59am PST
  • Voting period: Thursday, February 22 at 11am PST through Tuesday, February 27 at 11am PST
  • Prize will be a selection of Meraki swag with value not exceeding USD50.00
  • Official terms, conditions, and eligibility information
28 Comments
ScottR
Conversationalist

I have a Meraki Dashboard?

Oh my...

PacketHauler
Here to help

I had a remote site that was complaining of slowness in certain parts of the branch office during the latter part of the day. I started to dig into client statistics on the dashboard and noticed a group of about 10 systems that were transferring gigabytes of data within a very short period of time, flooding some of the uplinks of the switches they were attached to. The kicker was that this traffic started once the site went to idle hours (after 6pm local time), and all of the systems were sending and receiving identical amounts of data. I ended up doing a packet capture on the switch ports of the hosts that were producing this traffic and found that they were sending and receiving large amounts of IPv6 multicast traffic. After seeing that only the HP desktops were generating this traffic, I went digging through the search engines for a possible answer. It turns out there was a known bug in certain driver versions of the onboard NIC of these machines: when the machines go idle, they end up being chatty with each other with IPv6 multicast traffic. The fix is to disable IPv6 or install a driver update. I sent out one of our help desk techs to assist on site with updating drivers on those specific desktop machines. Once that was completed, the problem was solved, and I have not seen it occur again.

 

Moral of the story: Remote packet captures are your friend with Meraki.
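
For anyone who wants to do the same kind of triage on capture data offline, here is a minimal sketch of totaling IPv6 multicast bytes per sending host. It assumes the capture has already been summarized into (source MAC, destination IP, bytes) records; the sample records and the function name are made up for illustration and are not a dashboard export format.

```python
import ipaddress
from collections import defaultdict

# Hypothetical (source MAC, destination IP, bytes) records summarized from a capture.
records = [
    ("aa:bb:cc:00:00:01", "ff02::1:3", 1_200_000),
    ("aa:bb:cc:00:00:01", "10.1.1.20", 4_000),
    ("aa:bb:cc:00:00:02", "ff02::fb", 1_200_000),
    ("aa:bb:cc:00:00:03", "224.0.0.251", 900),
]

def ipv6_multicast_bytes(rows):
    """Return total bytes each source sent to IPv6 multicast destinations."""
    totals = defaultdict(int)
    for src, dst, size in rows:
        ip = ipaddress.ip_address(dst)
        # Count only IPv6 multicast (ff00::/8); IPv4 multicast is ignored here.
        if ip.version == 6 and ip.is_multicast:
            totals[src] += size
    return dict(totals)

print(ipv6_multicast_bytes(records))
```

Hosts that show up with near-identical totals, as in the sample data, are the ones worth a closer look.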

richdonegan
Comes here often

We found that a customer of ours was set up in Bridge Mode with their ISP. We got visibility into their dashboard by getting temporary admin rights and diagnosed the problem pretty quickly. We told them to reach out to their ISP to correct it, and they were back up and running in no time. Not an exciting fix, but it definitely shows how valuable the dashboard can be: we were able to diagnose remotely in a matter of minutes, making the customer extremely happy and a long-time customer of ours!

Jalvey
Comes here often

Before Meraki:

Before MX84, MR52s, and MS250s

After Meraki:

After installing MX85s, MR52s and MS250s

 

Larrman113
Conversationalist
We just moved into a newly remodeled building with all new wiring, Meraki switches, and all the tools they offer. We had new Cisco phones too. We had one troubling connection that was difficult to diagnose, and it appeared that several connections might be involved, but everything was new: the wiring, the connection cables, the phones. I was able to use Meraki tools like Cable Test and Ping to narrow down the problem. It ended up being a bad PoE connection on the new Cisco phone, but without the tools that Meraki has built in, it might have taken more time and trial and error to determine the issue and get back up and running.
geoffbernard
Here to help

End users - the thorn in an engineer's side.

 

The Meraki dashboard is a (hmmm.. what adjective do I use because there are so many) comprehensive troubleshooting tool. Yes, the cloud management aspect means you don't have to go on-site. However, if you were on-site, what's the most useful troubleshooting tool? CLI? *yawn* What's that tell you? MAC addresses in use? VLAN/port configurations? How helpful is that? I'll tell you - not very.


How much bandwidth is a user taking? How long has it been since a device was online? Without external monitoring via SNMP or NetFlow, you can't find this info. Have you ever had to justify an expensive out-of-town trip only to find out you are no closer to finding the root cause of your network issues?

 

Here are 2 ways the Meraki dashboard has helped me - and it can help you too.

 

1 - Cable port testing

We've all seen this before right? (I took this picture at the last customer site I visited...)

dllhost_2018-02-20_14-56-16.png

Have you ever had to test one of these? 19 WAPs mounted in the ceiling of a warehouse. Customer claimed we had a bad switch port since 2 WAPs provided poor Internet access. But we tested all the WAPs and all worked correctly. What changed? Oh wait, customer provided the cable vendor.

Log in to the dashboard, click on the switch, select the port, then scroll to the bottom. I ran a cable test & found the cable run was over 350 feet. Hmmm... sounds like that's over the 100-meter max to me. "Mr customer - I think you need a new (shorter) cable! May I recommend a low-voltage contractor that DOES respect Ethernet standards?"
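
As a quick sanity check on those numbers, converting the measured run to meters shows how far past the limit it was. This is only an illustrative sketch: the function names are made up, and the 100-meter figure is the usual maximum channel length for twisted-pair Ethernet.

```python
FEET_TO_METERS = 0.3048  # exact, by definition of the international foot
MAX_RUN_METERS = 100.0   # usual maximum channel length for twisted-pair Ethernet

def feet_to_meters(feet: float) -> float:
    """Convert a cable length in feet to meters."""
    return feet * FEET_TO_METERS

def exceeds_max_run(feet: float, limit_m: float = MAX_RUN_METERS) -> bool:
    """Return True if a run of the given length in feet is over the limit."""
    return feet_to_meters(feet) > limit_m

run_m = feet_to_meters(350)
print(f"350 ft is about {run_m:.1f} m; over the limit: {exceeds_max_run(350)}")
```

A 350-foot run works out to roughly 106.7 m, so anything the cable test reports above about 328 feet is already out of spec.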

 

2 - Client bandwidth usage

We had a client that upgraded from 10 meg to 100 meg Internet. At first, things were fast. The client then reported backups were failing due to transfers timing out.

Pop in the dashboard and check the client graph.

Looks like an issue to me. Click on the Usage column to sort, then click on the top client. From there, click show details on the application graph.

BAM! We found the culprit. BitTorrent traffic from 8p-4a was saturating the circuit. This is where computer names are helpful. We made a call to the client with the affected PC name & were told they would take care of it. We asked about the owner of that PC so we could update our notes. "We'll take care of the user - rest assured there will be no more BT traffic from that PC." Again we asked whose PC it was... after a brief silence, our suspicions were confirmed. "It was the IT Director's PC."

 

The dashboard is not only convenient but a very helpful troubleshooting tool.

PhilipDAth
Kind of a big deal

Well, I had a customer with another brand of firewall I will not mention.  They had already decided to move to Meraki and the hardware was ready and available.

 

Their existing firewall started regularly crashing (several times an hour) and their Internet ground to a halt.  They had no idea why and no visibility. 

 

I was asked to urgently attend the site and do the Meraki MX deployment immediately.

 

10 minutes after I had plugged in the Meraki I had almost all the answers.  The circuit was being maxed out with OneDrive traffic.  I blacklisted the computer, with a message asking the person to contact me.  Internet performance returned to normal and the company was able to work again.

 

The IT manager (as in the person who asked me to come to site urgently to fix the issue) contacted me saying that they had a window pop up on their computer asking them to call me.  A short investigation later revealed that the IT manager's husband, at home, had dropped 25GB of "Game of Thrones" into their OneDrive.  This was rectified, and I removed the blacklist entry.

 

So the story finished "Happy ever after".

jeremyroe
Just browsing
We upgraded our ISP bandwidth and soon after started dropping packets like crazy. The ISP kept blaming our network, while all signs pointed to their equipment. After weeks of troubleshooting and escalating with support, we finally traced the problem to an obscure modem firmware bug that was triggered by remote connections to our line-of-business application. We had been in endless support phone call loops with the ISP, escalated to the manager, then the field supervisor and eventually the regional elite support team. The Meraki dashboard was invaluable for showing that the connectivity issue wasn't on our end and for determining which connections were causing the issue. The info it provided allowed us to convince the support team the issue was on their device and quickly replicate the problem. They kicked the issue up to their firmware team and replaced our modem with another model, solving our issue and getting us online.
JaiminPatel
Here to help

The company where I work has multiple offices as well as a number of teleworkers, which can make networking and security monitoring a challenge. The Cisco Meraki wireless networking solution lets the network team easily manage and troubleshoot the disparate networks from anywhere.

 

The old days of supporting our teleworkers’ personal home routers are gone. Let me say, what a relief!

 

Implementation has been smooth, and the integration with existing Cisco systems has given the client options for future equipment upgrades when they’re needed.

 

I am completely satisfied with the Cisco Meraki wireless networking implementation. Support has been great, and I have been able to reduce the amount of time I spend configuring and managing the network. And that allows me to spend more time working on performance tuning and monitoring our network.

Adam
Kind of a big deal

Our buildings regularly experienced network outages with our old Sonicwall platform.  The need to have redundant internet connections that could seamlessly failover and load balance became top priority.  We started migrating from Sonicwalls to the MX platform and also got secondary internet connections installed.  Implementation couldn't have been easier.  Now we have alerts to notify us if a WAN failover occurs so we can immediately work on any down circuits and also know that the site is still operating without even a hiccup.  

George_Thomas
Conversationalist

It was all a nice, good experience. Hope I can win some swag!

TMRoberts
Getting noticed

When I started where I am now, we had constant complaints of slow connections at branches and poor wifi coverage, and remote access was through an old, out-of-support Citrix farm. We revamped the entire network from edge to access. Main branches got MXs with MSs and MRs for wifi, smaller branches got MXs paired with one or more MRs, and small branches/VIPs got Z1s. Remote access now ran over VPN through Meraki.

 

After we had all this in place, we got a call from the main corp office that their internet, email, printing, etc. were all incredibly slow, even though they had a fiber connection.

 

Digging into the dashboard for that branch we found out that one of the managers had connected his personal iPad to the secure side of the wifi network (a company owner had shared the passcode rather than telling the mgr to call for guest wifi). He was streaming Netflix and effectively shutting down the connection for everyone else as he had it as high def as he could get it.

 

Needless to say when we tracked it down and blocked it, the owner wanted info, so we advised who and what device and asked him some questions too. He advised, yes even though there was guest wifi, he had personally shared the unrestricted passcode to the mgr who had told him it was for "research" purposes when he needed wifi access.

 

After that episode the owner stopped sharing the passcode out and we changed to certificate and passcode authentication as well ... And the mgr got an unofficial wrist slap, as we advised exactly what type of traffic was going to the iPad and the owner was NOT happy that even he himself couldn't work that day.

MerakiDave
Meraki Employee

@CarolineS I don't think I'm eligible to win some swag, but I feel compelled to share this anyway!  This is a bit of a long story, but hang in there with me, it's worth it. This goes back many years but is one of my favorite network troubleshooting stories.

 

I was working for a major publishing company, and in one of the remote printing plants, there were 3 RIP (Raster Image Processor) servers (Windows PCs) that took in PostScript files and rendered and generated the raster bitmap that got sent to other systems to burn the actual plate that gets mounted on the printing press drums. They were having all kinds of "network problems" for many weeks and spent many hours troubleshooting locally, and many failed jobs had to be re-sent through these servers, some of them multiple times, slightly impacting their ability to actually print the paper on time.

 

After another couple weeks of remote network troubleshooting and even code debugging, they identified a lucky engineer (me) to go on site, which for me was from the east coast to the west coast of the US. And I had a whole bag (literally) of network troubleshooting tools, cables, meters, sniffers, etc. I spent a whole day on site getting everything set up and documented, ran test jobs to establish a baseline, and got ready for that evening's production run.

 

As the run got going, I was running around confirming all looked normal, even though a few jobs had already failed. I took note that there were no jobs failing on RIP server 1, but there were multiple failures on servers 2 and 3. Seemed odd. The printing plant admin was using the console on RIP server 1 like usual, and since servers 2 and 3 were processing jobs, but nobody was actually sitting in front of those PCs, I started watching their CPU, memory, network and disk activity, etc. I noticed that while I was actively looking at these PCs, no more jobs were failing.

 

After I stepped away for a bit, jobs started failing again, but only on servers 2 and 3, so I went back to these PCs' perf meters. Jobs stopped failing like before. Then it all became so clear! It was the screensaver!  The local printing plant admin had personalized these 3 PCs (which had terrible video cards) with a graphics-intensive screensaver, which totally killed the performance to the point that jobs were failing. The reason nothing ever failed on server 1 was because that's the console the admin always used, so the screensaver never kicked in. And when failed jobs had to be sent multiple times, they eventually got through because they eventually load balanced to server 1. And no other printing plants around the country ever had this issue, because they all had their default screensavers.

 

So in the end, this could have been a Dilbert comic strip!  After a month of troubleshooting and frustration and escalations and all kinds of blame-game antics going on, they flew an engineer across the country to fix it... by turning off the screensaver!

 

Kamome
Building a reputation

A few days ago (this Monday, to be precise), Internet utilization on almost all of my customer's network got really high. Really, really high.

At first, the backbone team used NetFlow to find the cause. They exported the Internet gateway switch's NetFlow data into an ELK stack, but it didn't give useful results.

But then my supervisor checked the network of one customer site that has a Meraki MX device installed. That site routes all of its traffic (including Internet) to the datacenter's Meraki MX device via site-to-site VPN. He checked the Clients page in the dashboard, and voila! We could see the cause of the high traffic usage.

 

According to the dashboard's client application statistics, Adobe and Microsoft update URLs were eating up quite a lot of traffic. Because the Meraki dashboard shows the exact URLs, we could conclude that the high utilization was due to Adobe Acrobat Reader and Windows updates. (It was just after a 4-day holiday.) My boss was very impressed by Meraki's Traffic Analytics feature, so he told my supervisor to test and buy more Meraki devices 🙂 Plus, we got another selling point for pitching the Meraki solution to customers.

BlakeRichardson
Kind of a big deal

I had an incident two days ago where I noticed in our network client list an Axis camera connected to a subnet it shouldn't have been on. A quick look at the client > Switch > port > patch panel led to the device being connected in a highly sensitive area. I immediately shut down the connection and traced the device...... Only to find it was a Sony smart TV that Meraki for some reason thought was a camera.

 

Long story short, Meraki is awesome at being able to trace devices down to their physical connection with little hassle.

CHN
Conversationalist

Hi, 

 

Tools used for end-user performance issues in the Meraki dashboard:

The "Channel utilization live tool" is used to check the up/down status of the end user's connection, shown by colour: green for up and red for down.

We can also see historical data and a live report for each user, by application and by device as well.

Uberseehandel
Kind of a big deal

@BlakeRichardson

said - 

I had an incident two days ago where I noticed in our network client list an Axis camera connected to a subnet it shouldn't have been on. A quick look at the client > Switch > port > patch panel led to the device being connected in a highly sensitive area. I immediately shut down the connection and traced the device...... Only to find it was a Sony smart TV that Meraki for some reason thought was a camera.

Long story short, Meraki is awesome at being able to trace devices down to their physical connection with little hassle.

 

Are you sure it is a Sony Smart TV?

 

My experience has been that Meraki identifies the Android Smart TVs from Sony, but incorrectly identifies the sound bars/sound bases/home theatre units.

 

I have previously lodged a report of a sound base being mis-identified as an Axis camera, and after reading your post I checked how the dashboard identifies a sound base on the test network (Sony DD-WRT router); the adjacent Sony Bravia Android TV is correctly identified as Sony Android. The relationship between the two can be quite symbiotic. The devices are Chromecast capable, and it is easy to automate them so that when my phone comes within BLE range they are turned on. But being IoT-level kit, I am concerned about how secure they are, and how often security patches are issued (August 2017). Unfortunately, I have not yet managed to get a Chromecast client device on a secure network to access the Chromecast server-capable devices when on a different VLAN, despite enabling Bonjour and opening up the ports that Google claims are required.

 

But one of the benefits of using Sony TVs is that they make better 4K monitors than most of the other brands.

 

teabread90
Comes here often

Let's just say now all I do is whitelist this and whitelist that... 🙁

 

Darnit! Life used to be so hard... 

Any suggestions on how to kill all the free time caused by Meraki? I'm open to anything.

 

Also, how do you give kudos in here?

Maureen
Conversationalist

My client contacted me saying everything was running slow at their office.  In the past this general complaint would have made me sigh deeply because I work remotely, hours away from them. Now with the Cisco Meraki portal it's like getting my sight back.  I immediately logged onto the portal and started checking the firewall and switch health, then the stats.  In a short period of time I was able to see that the uplink on the firewall was logging a ton of lost packets.  Not only did it clue me in, but it gave me some solid information to take back to the ISP so we could resolve this quickly.  Thank you Cisco Meraki for giving me the tools and virtual eyes to speed up the troubleshooting.

Cyril
Conversationalist

Sometimes I had a huge spike on my internet connection and I didn't know why. My whole network was really slow, and I didn't have any statistics on the network. I only have an MS and a few APs. What I did was put the MS before my core switch, just after the firewall. And just like magic, the statistics began to appear 🙂

Just a few hours after I made this change, I had statistics on my network. I knew how much traffic was going to my FW and where it came from. I was able to find the reason for all these spikes. Turns out it was iOS updates, Windows updates and RDP to a Linux server using xrdp.

 

Moral: Having statistics on the Dashboard can save you a lot of investigation time, and if that's not enough, the online packet capture tool can help you too.

 

Thanks, Meraki, for developing such cool tools.

matthew4osu
Here to help
One of our largest customers is migrating from traditional Cisco to Cisco Meraki. Meraki has allowed us to teach the NOC a given skill set using the dashboard. The dashboard gives the NOC a nice collection of clearly defined tools to troubleshoot with before escalating to engineering. The customer is locked down to a template, so the dashboard warns the NOC about a configuration change before an outage would occur. The dashboard has given me my evenings and weekends back thanks to a simple and powerful tool. Thank you Meraki 🙂
CarolineS
Community Manager
@teabread90 wrote: Also, how do you give kudos in here? 

I’ll turn kudos on for this contest on Thursday 11am Pacific!

 

I’m curious for opinions though - do ya’ll prefer separate contest submission and voting periods, or should we just have voting on the whole time? Feel free to PM me your opinion (so we don’t pollute the contest entries here!).

ARiK_LeV
Conversationalist

I think this is great. I've used the dashboard for lots of troubleshooting. I don't want the prize, but I say kudos to CarolineS, who created this request. The examples given by the various community members are great and helpful.

Ichimoku
Conversationalist

I had an issue where end users were not getting a network connection, and the local IT team checked and found that those workstations were getting APIPA addresses. I used the Meraki Dashboard to easily check whether the ports were configured in the correct VLAN and whether that VLAN was allowed on the trunks. I found that the VLAN they were in was not yet allowed on the trunk; once I added it, everything worked fine 🙂
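
For anyone unfamiliar with APIPA: those are the self-assigned 169.254.0.0/16 addresses a workstation falls back to when it gets no DHCP answer. A minimal sketch of spotting them in a list of client addresses, using Python's standard ipaddress module (the sample addresses and the function name are made up for illustration):

```python
import ipaddress

# APIPA / IPv4 link-local range that hosts self-assign when DHCP fails.
APIPA_NET = ipaddress.ip_network("169.254.0.0/16")

def is_apipa(addr: str) -> bool:
    """Return True if the address falls inside 169.254.0.0/16."""
    return ipaddress.ip_address(addr) in APIPA_NET

# Hypothetical client addresses, e.g. copied from a client list.
clients = ["10.20.30.15", "169.254.12.200", "192.168.1.44", "169.254.7.3"]
no_dhcp = [ip for ip in clients if is_apipa(ip)]
print(no_dhcp)
```

Any client that lands in that list never reached a DHCP server, which is a hint to check the VLAN and trunk configuration on its switch port.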

Mr_IT_Guy
A model citizen

We have had issues in the past where our offices complain about dropped connections and some packet loss. Using the uplink statistics in Security Appliance > Configure > Traffic shaping, I was able to put monitors on the various hops to 8.8.8.8. While the list is sometimes long in some places, doing this allows us to see the hops along the way. From here, we work with ISPs to remedy trouble spots in their network! Our employees are happy and our ISPs are happy.

AndyC
Conversationalist
Nothing too technical from my experience, more of a funny moment. A doctor's surgery reported bandwidth issues, slow internet, etc. Only on Thursdays.... We decided to camp out on the Dashboard one Thursday and noticed (name of doctor's son)'s MacBook Air. After a quick look-up we could see the little ol' chap was streaming big time on the ADSL service whilst also WebGL gaming. His dad put a stop to that, so we didn't have to. The practice was happy after that.
BHC_RESORTS
Head in the Cloud

I (the CIO) came on board at BHC Resorts to revamp and overhaul everything. The last guy had a custom-made firewall built on pfSense, which is a great product, but his GUI was riddled with security holes, didn't use TLS (or even SSL most of the time), most of the features didn't work, and it was slower than the line speeds we had. And to top it off, take a look at the GUI homepage. Keep in mind the majority of the buttons on the left didn't work or 404'd.


Terrible Custom Firewall

No, I don't know why the Muppets Chef is there. Troubleshooting was, literally, non-existent. There was no logging, no DHCP lease view, absolutely zero visibility into the network. You could do basic port forwarding and that is about it. Fast forward to today:

 

meraki.jpg

I mean, there isn't even a comparison, right? We can view analytics and troubleshoot ports, and we have way more data than we use. And we love it!

VascoFCosta
Getting noticed

Thanks to the Meraki dashboard, I was able to prove to a chain store's "brass" that the wifi problems were occurring only on 2.4 GHz due to unmanageable interference (you can neither ask Store Security to switch off the alarms because of their motion detectors nor ask the store's customers to switch off their phones/personal hotspots when they enter a store).
The solution was to force the corporate end-user equipment (bar code readers, laptops, corporate mobile appliances) to prefer 5 GHz and to replace the devices that were 2.4 GHz-only.
Problem solved and a very happy customer.