VoIP disconnecting after 15 min.

StevePF
Getting noticed

Hi,

I was wondering if you could suggest where to look to dig a little further in my investigation. VoIP users (Cisco phones with UCS in the data centres) are reporting random call disconnects after about 15 min. So far, the reports come from sites associated with one specific template (fewer than 10).

We have 80 sites equipped with MX68s, grouped into 12 templates. They are configured as spokes with Auto VPN toward four MX250s in our data centre, all with the same configuration (I think I have rechecked it half a dozen times so far).

I have ruled out the hubs, the internet access (WANs, MOS), VPN disconnects, and any NGF functions.

For a while, the SD-WAN policy for VoIP was set to prefer WAN2, with failover based on the VoIP performance class. Then I changed it to Best for VoIP.


Any pointers would be appreciated.

Thanks

Steve

13 Replies
CoreyDavoll1
Getting noticed

That feels like a keepalive timeout. Back in the day I had to make special rules when traffic went through older firewalls, but I haven't had to do that in a while. You might want to run a packet capture and take a look there, or engage support.

alemabrahao
Kind of a big deal

The question is: when you changed the SD-WAN policy, did the problem persist or not?
I think the best option at this point is to perform a packet capture during a call to try to pinpoint what is causing the disconnection.

I am not a Cisco Meraki employee. My suggestions are based on documentation of Meraki best practices and day-to-day experience.

Please, if this post was useful, leave your kudos and mark it as solved.
StevePF
Getting noticed

No change with the SD-WAN policy; the trouble still happens. I am trying to do some packet captures, but the trouble is random and usually happens after 15 min. It's like winning the lottery if I can capture the right stuff. LOL
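Since catching the event live is hit-and-miss, one option is to first check from call records whether the dropped calls really cluster around 15 minutes. The following is only a minimal sketch, assuming call records can be exported as (start, end) timestamp pairs; the sample records and the `near_target` helper are hypothetical and do not come from any Cisco or Meraki tool.

```python
from datetime import datetime, timedelta

def near_target(records, target_s=900, tolerance_s=60):
    """Return (start, end, duration) for calls lasting about target_s seconds."""
    hits = []
    for start, end in records:
        duration = (end - start).total_seconds()
        if abs(duration - target_s) <= tolerance_s:
            hits.append((start, end, duration))
    return hits

# Hypothetical sample data standing in for an exported call log.
t0 = datetime(2024, 1, 15, 9, 0, 0)
records = [
    (t0, t0 + timedelta(seconds=905)),
    (t0 + timedelta(hours=1), t0 + timedelta(hours=1, seconds=130)),
    (t0 + timedelta(hours=2), t0 + timedelta(hours=2, seconds=880)),
]

suspects = near_target(records)
for start, end, duration in suspects:
    print(f"{start:%H:%M:%S} -> {end:%H:%M:%S} ({duration:.0f} s)")
```

If most drops land within a narrow band of the same duration, that points toward a per-call timer; if the durations are scattered, the trigger is more likely something outside the call itself.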

alemabrahao
Kind of a big deal

Have you had any recent changes in the environment, such as a firmware update?

StevePF
Getting noticed

I did upgrade to MX 18.107.2, but the trouble showed up both before and after the upgrade.

Also, no issues have been reported at the 70+ other sites.

I am trying to see if there are any missing packets at the DC. None that I can find so far.

But I noticed that some TCP 2000 packets were not classified as VoIP by the Meraki... Either way, I am very far from congestion.
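For context, TCP 2000 is the default SCCP (Skinny) signaling port on Cisco phones, so those packets would normally be expected to count as voice traffic. A rough sketch of a port-based check that could be run over exported flow data to list what should have been treated as voice is shown below. It assumes default ports only (SCCP on TCP 2000, SIP on 5060/5061, Cisco's default RTP range of UDP 16384-32767); the `classify_voice_flow` helper is hypothetical and is not how the MX classifies traffic.

```python
def classify_voice_flow(proto, dst_port):
    """Label a flow by protocol and destination port, using default ports only."""
    proto = proto.lower()
    if proto == "tcp" and dst_port == 2000:
        return "sccp-signaling"   # Cisco Skinny call control
    if proto in ("tcp", "udp") and dst_port in (5060, 5061):
        return "sip-signaling"    # SIP, or SIP over TLS on 5061
    if proto == "udp" and 16384 <= dst_port <= 32767:
        return "rtp-media"        # Cisco default RTP port range
    return "other"

# Hypothetical flows: (protocol, destination port)
flows = [("TCP", 2000), ("UDP", 5060), ("UDP", 20000), ("TCP", 443)]
labels = [classify_voice_flow(p, port) for p, port in flows]
print(labels)
```

Comparing such a list against what the dashboard reports as VoIP would show whether only the signaling, or also the media, is falling outside the VoIP policy.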

DarrenOC
Kind of a big deal

Hi @StevePF, do you know which call scenarios are failing? Are we talking internal calls on a single site, inter-site internal calls, external calls, etc.? I assume you're using SBCs with SIP terminated in your DCs?

Darren OConnor | doconnor@resalire.co.uk
https://www.linkedin.com/in/darrenoconnor/

I'm not an employee of Cisco/Meraki. My posts are based on Meraki best practice and what has worked for me in the field.
StevePF
Getting noticed

Hi Darren, we are talking about multiple sites, all coming back to the same hub. However, I have about 30 other spokes coming back to that hub with no trouble.

The affected calls seem to be external calls, which are terminated on common SIP trunks in our DC.

casey_1123
Conversationalist

We had a similar issue at a couple of sites with random VoIP and video call drops. The issue was due to SNORT crashing our MX. Upgrading to 17.10.9 fixed this. Not sure if 18.107.2 is affected by the same bug, but it might be worth asking Meraki support to check it out.

BlakeRichardson
Kind of a big deal

Just coming at this from a different angle: there isn't a call duration limit set on the PBX, is there?

If you found this post helpful, please give it Kudos. If my answer solves your problem, please click Accept as Solution so others can benefit from it.
StevePF
Getting noticed

We are using UCS with a SIP gateway. The only thing close to a timeout that I can think of would be some kind of TTL on a protocol. But if there were one, I would expect a bigger impact across all sites.

amabt
Building a reputation

Sounds like a NAT or keepalive issue. First order of business is to give the affected devices a reboot!

StevePF
Getting noticed

NAT and a reboot were the first things we looked at.

Crocker
A model citizen

We bumped into something like this at a subset of sites after upgrading from MX16.X to MX18.X earlier this year.

At the most problematic site, it turned out that the upstream ISP modem had its DHCPv6 server enabled, with a 15-minute lifetime. Every 15 minutes, that lease would renew, and that would cause the MX to drop 1-3 packets.

This was the same issue at the other sites that were affected, though the DHCPv6 server was handing out longer leases at those locations. Disabling that feature on the ISP modem was the resolution for this.
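A periodic trigger like a lease renewal should leave a signature: the drops line up on the wall clock at a fixed interval, regardless of when each call started. One hedged way to test for that is to fold the drop times onto the suspected period and measure how tightly they cluster, using the mean resultant length from circular statistics (close to 1 means aligned, close to 0 means scattered). The timestamps below are hypothetical, and `phase_alignment` is an illustrative helper, not part of any Meraki tooling.

```python
import math
from datetime import datetime

def phase_alignment(timestamps, period_s=900):
    """Mean resultant length (0..1) of event phases within period_s."""
    ref = timestamps[0].replace(hour=0, minute=0, second=0, microsecond=0)
    angles = [
        2 * math.pi * ((ts - ref).total_seconds() % period_s) / period_s
        for ts in timestamps
    ]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    return math.hypot(c, s)

day = (2024, 1, 15)
# Hypothetical drops that recur at the same offset in each 15-minute window.
aligned = phase_alignment([
    datetime(*day, 9, 3, 10), datetime(*day, 9, 18, 12),
    datetime(*day, 10, 3, 9), datetime(*day, 11, 33, 11),
])
# Hypothetical drops with no fixed offset.
scattered = phase_alignment([
    datetime(*day, 9, 1, 0), datetime(*day, 9, 20, 0),
    datetime(*day, 10, 40, 0), datetime(*day, 11, 59, 0),
])
print(f"aligned={aligned:.2f} scattered={scattered:.2f}")
```

A high score for a given site would suggest looking at periodic events on the WAN side, such as lease renewals, rather than at per-call timers. With longer leases, as at the other affected sites, the same check can be repeated with `period_s` set to the lease lifetime.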
