Hi
We've come across a very strange issue that I thought was worth sharing with the fine people in the community.
The scenario: an MX68 HA pair running 15.42.1, connected to two Cisco routers, each with a fairly decent FTTC internet link, with load balancing switched on. The site has a cloud VOIP solution (Telcoswitch).
The router connected to Internet 1 develops a problem and starts rebooting every 6-8 minutes. We raise a fault with our ISP and start the process to get the line/router checked.
All applications at the site are fine, except the VOIP phones. They can't get a stable registration, even though all the signs are that they're using the public IP of the working link.
We set Internet 1 to 'disabled' in the MX config, which we expected to force all traffic onto the working Internet 2, regardless of the up/down state of the flaky router on Internet 1. Again, all applications are fine, including auto-VPN, except the VOIP phones.
We re-template the site so that Internet 2 is active and Internet 1 is unused and still disabled. The VOIP problem persists.
BT Openreach fix a fault on Internet 1 and presumably reboot the router, which is then stable for several hours. The VOIP phones all recover, but are still using the public IP of the other line.
The router rebooting issue returns and the VOIP phones go down. BT replace the router, which has now been stable for several days. We have re-enabled Internet 1 in the Meraki config, but as 'ready'; Internet 2 is still the active link. The phones are fine.
Short version: the VOIP system's stability seems to be entirely dependent on the state of a router/link that is not actually being used to carry any of the traffic.
The only plausible culprit we can see is the Meraki MX. Our hypothesis is that there's some kind of SIP processing in the MX that doesn't respect the enabled/disabled/active/ready state of the available links, i.e. some kind of processing order issue.
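For anyone who wants to check the same thing on their own site, here is a rough sketch of one way to confirm which public IP the SIP registrar actually sees. It assumes you can capture a SIP response to a REGISTER (for example from a packet capture on the LAN side) and that the registrar adds the standard `received` parameter to the top Via header. The function names and the sample message are made up for illustration, and the addresses are documentation ranges, not our real IPs.

```python
import re

def via_received(sip_message: str):
    """Return (received_ip, rport) from the first Via header, or (None, None).

    Assumption: header folding is not handled; the Via header sits on one line.
    """
    for line in sip_message.splitlines():
        if line.lower().startswith(("via:", "v:")):
            received = re.search(r";\s*received=([^;\s,]+)", line, re.I)
            rport = re.search(r";\s*rport=(\d+)", line, re.I)
            return (
                received.group(1) if received else None,
                int(rport.group(1)) if rport else None,
            )
    return (None, None)

def seen_via_expected_link(sip_message: str, expected_public_ip: str) -> bool:
    """True if the registrar saw the request arrive from the expected public IP."""
    received_ip, _ = via_received(sip_message)
    return received_ip == expected_public_ip

# Hypothetical 200 OK to a REGISTER; all values are illustrative only.
SAMPLE_RESPONSE = (
    "SIP/2.0 200 OK\r\n"
    "Via: SIP/2.0/UDP 192.168.1.50:5060;branch=z9hG4bK776asdhds;"
    "received=203.0.113.20;rport=40312\r\n"
    "CSeq: 2 REGISTER\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

if __name__ == "__main__":
    # 203.0.113.20 stands in for the public IP of Internet 2.
    print(via_received(SAMPLE_RESPONSE))
    print(seen_via_expected_link(SAMPLE_RESPONSE, "203.0.113.20"))
```

If the `received` address flips between the two public IPs while Internet 1 is flapping, that would point towards the MX sending some SIP traffic out of the wrong uplink. If it stays on Internet 2 throughout, the problem is more likely in how existing flows or NAT mappings are handled.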
Whilst we've only seen this issue at one site, it makes me worry about all our sites that have this VOIP solution and multiple links, with redundancy effectively creating a wider single point of failure.
Anyone seen anything like this? Thanks for your time if you've read this far.
Andy