WAN failover - flow timeout?

Matt-Ignite
Here to help


Hi team Meraki,

 

I come before you again with another WAN failover problem. Although, to be fair, I think this is really a continuation of the same problem I've had for years, just with a new client.

 

The question: Is there any way (either from the front end GUI or via a support team modification through the back end) to configure a timeout for active flows to reset routing preference?

 

The backstory:

 

Meraki MX64

  • WAN1 - wired NBN internet link. 
  • WAN2 - 4G modem with static IP

 

Meraki is configured with WAN1 as the primary uplink, with failover to WAN2 as needed when WAN1 drops.

 

Problem:

  • Client has an onprem VoIP phone system.
    • Phone system maintains a long-running SIP connection for signaling back to the provider's environment
    • Phone system processes the actual voice calls on a different IP/port socket. This bounces up and down as calls are made, etc.
  • Client WAN1 has been unreliable, and dropping out. 
    • Aside - I suspect these dropouts are false-positives and are caused by link saturation which causes the Meraki DNS probes to time out. Either way, it doesn't affect the actual problem
    • When WAN1 fails, all traffic is mapped over to WAN2. Phones continue to work, etc.
    • WAN1 comes back up. Web browsing, etc, is mapped back to WAN1. Easy peasy - these flows start and stop all the time, so it's easy for them to time out (~5 mins from memory) in the Meraki and then be re-established next time on the primary WAN.
    • But the phones, specifically the signaling connection, stay on WAN2. Because this is a long-running connection with keepalives, it never drops. If it does drop (by restarting the VoIP server, for instance), it attempts to reconnect to the upstream WELL within the 5 min flow timeout. Because of this, it NEVER fails back from WAN2. It just stays on WAN2, chewing up expensive data and having all the unreliability of a 4G/cellular connection.
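
To make the stickiness concrete, here is a toy model of the behaviour described in the list above. It is only a sketch and not how the MX is actually implemented: the `FlowTable` class, the 300 second idle timeout (taken from the "~5 mins from memory" figure), the 30 second keepalive interval and the outage window are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

IDLE_TIMEOUT_S = 300  # assumption: the "~5 mins from memory" figure above
KEEPALIVE_S = 30      # assumption: SIP keepalive interval


@dataclass
class FlowTable:
    """Toy flow table: each flow is pinned to the uplink it started on."""
    wan1_up: bool = True
    flows: dict = field(default_factory=dict)  # flow id -> (uplink, last_seen)

    def packet(self, flow_id: str, now: int) -> str:
        """Return the uplink used for a packet of flow_id at time now (seconds)."""
        entry = self.flows.get(flow_id)
        if entry is not None:
            uplink, last_seen = entry
            expired = now - last_seen > IDLE_TIMEOUT_S
            # WAN2 is treated as always up in this toy model.
            uplink_down = uplink == "WAN1" and not self.wan1_up
            if not expired and not uplink_down:
                self.flows[flow_id] = (uplink, now)  # refresh the idle timer
                return uplink
        # New, expired, or orphaned flow: pick the preferred uplink that is up.
        uplink = "WAN1" if self.wan1_up else "WAN2"
        self.flows[flow_id] = (uplink, now)
        return uplink


def simulate(duration_s: int = 3600) -> list:
    """WAN1 is down from t=600 to t=900; the SIP flow sends a keepalive every 30 s."""
    table = FlowTable()
    log = []
    for now in range(0, duration_s, KEEPALIVE_S):
        table.wan1_up = not (600 <= now < 900)
        log.append((now, table.packet("sip-signaling", now)))
    return log
```

In this model the last entry returned by `simulate()` is still on WAN2 long after WAN1 has recovered, because every keepalive refreshes the idle timer. A flow that goes quiet for longer than the timeout (web browsing, say) lands back on WAN1 the next time it sends a packet.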

 

I understand that the software is designed like this so that it doesn't induce a second outage on the failback. For most things, this is fine. But in this scenario it means that some things stay on the secondary link forever. 

 

What I'd ideally love is a way to keep the default behaviour, but mark specific flows as needing to be reevaluated every so often (30 mins, say?). If after 30 mins the flow is reevaluated and better matches a different route, the flow preference should be hard-switched to the new interface/route. If that causes the connection to drop and need to be reestablished, so be it.
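
As a rough sketch of the policy I have in mind (purely hypothetical, nothing like this exists in the dashboard as far as I know): the `Flow` class, the `reevaluate` marker and the `route` function below are made-up names, and WAN2 is assumed to be always up to keep the example short.

```python
from dataclasses import dataclass

REEVAL_INTERVAL_S = 30 * 60  # the "30 mins, say?" interval from above


@dataclass
class Flow:
    uplink: str               # uplink the flow is currently pinned to
    pinned_at: int            # time (seconds) of the last pin or re-evaluation
    reevaluate: bool = False  # hypothetical per-flow "re-check me" marker


def preferred_uplink(wan1_up: bool) -> str:
    """WAN1 is primary; WAN2 is only preferred while WAN1 is down."""
    return "WAN1" if wan1_up else "WAN2"


def route(flow: Flow, now: int, wan1_up: bool) -> str:
    """Default sticky behaviour, plus a periodic hard switch for marked flows."""
    best = preferred_uplink(wan1_up)
    uplink_down = flow.uplink == "WAN1" and not wan1_up
    due = flow.reevaluate and now - flow.pinned_at >= REEVAL_INTERVAL_S
    if uplink_down or (due and flow.uplink != best):
        # Hard switch: the connection may drop and have to be re-established.
        flow.uplink = best
        flow.pinned_at = now
    elif due:
        flow.pinned_at = now  # already on the best uplink, restart the timer
    return flow.uplink
```

With this, a flow created as `Flow("WAN2", pinned_at=0, reevaluate=True)` stays on WAN2 for the first 30 minutes and is moved to WAN1 at the next packet after that, provided WAN1 is up. Unmarked flows keep the current sticky behaviour.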

 

Am I dreaming here, or has someone else already solved this problem with Meraki devices, long-running connections and WAN failover?

 

Cheers,

Matt

5 Replies
PhilipDAth
Kind of a big deal

I've seen this exact issue. I don't believe there is any way, via support or otherwise, to control the flow timeout.

That's what I thought/feared. I really wish there was more control over the flows in general. 

 

I'm considering going real low-tech, with a $9 manual timer on the power socket of the 4G modem. Set it to drop out for 10 mins every 24 hrs at night time, forcing the Meraki to fail everything back to the primary link. 

 

https://www.jaycar.com.au/24-hour-mechanical-timer-a-n-switch/p/MS6113

 

Don't get me wrong - I don't *LIKE* it, but it might be the cheapest and most reliable way out of the problem.

 

Cheers,

Matt

I've done something similar, but slightly higher-tech.  I've used smart PDUs before.  Basically a rack PDU with an Ethernet port.

You can both assign schedules (on the better ones) and browse to the PDU's web interface to turn things off and on remotely. Most will also let you see power consumption.

 

I've found some 4G modems in weak rural areas are just not reliable without getting a regular power cycle.
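
For anyone scripting the schedule rather than using a mechanical timer, the "off for 10 minutes every 24 hours" idea from earlier in the thread comes down to a time-window check like the sketch below. The function name and the 03:00 start are assumptions, and the actual call that switches the outlet is left out because it is vendor-specific.

```python
from datetime import datetime, timedelta


def modem_should_be_off(now: datetime, start_hour: int = 3, minutes: int = 10) -> bool:
    """True while `now` falls inside the daily power-off window.

    start_hour=3 is an assumed "night time" value; minutes=10 matches the
    10 minute dropout suggested above. Windows that cross midnight also work.
    """
    window_start = now.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    if now < window_start:
        # Today's window has not started yet, so check yesterday's window.
        window_start -= timedelta(days=1)
    return now - window_start < timedelta(minutes=minutes)
```

A small loop or cron job could call this once a minute and switch the modem's outlet accordingly, using whatever control interface the PDU provides.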

Hi,

 

I have used a rack PDU and also installed an APC Back-UPS. No issues found so far.

 

Ebeltoft
Conversationalist

I have the exact same issue with several customers.

Is it impossible for Cisco Meraki to solve this? 
