Hi @CaptainDan and @MikeG1,
Apologies for the late response here - for some reason I never received a notification about your posts.
I never resolved this.
In fact, shortly after posting this the client was so upset with the unreliability of the phones that they pulled all contracts with me and moved to a new provider. Honestly I can't blame them - I spent months trying to get this to work, and it simply wouldn't.
3CX blamed the Meraki device and how it managed its failover. Meraki blamed 3CX and said their device was working perfectly. Each party was able to produce logs that backed up their position, but at the end of the day the system as a whole simply didn't work.
I suspect Azure IaaS might have had a hand in it, specifically how it handles its networking internally. I found traces of different forum threads across the internet that suggested it was doing something very strange that normally no one cared about, but that this exact combination tripped over. I know people who are running AWS-hosted 3CX with Merakis, and it's fine. I know people who are running Azure-hosted 3CX with other routers/firewalls/UTMs, and they're fine. I know people who run 3CX on-prem on a NUC behind a Meraki, and it works fine. But there was something with this combo that simply did not work reliably.
From memory, I was also looking at a potential problem to do with the SIP INVITE Header field from the Yealink phone getting too long. I can't remember specifics, but I remember eventually finding one forum thread somewhere where a handful of people were having the same problem. I can't find it now, of course.
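For anyone chasing the same lead, here is a rough Python sketch for sizing up a SIP message pulled out of a packet capture. It assumes the issue is overall message size over UDP, which I can't confirm was the actual cause here. The 1300-byte figure comes from RFC 3261 (section 18.1.1), which says larger requests should use a transport such as TCP when the path MTU is unknown. The function name `sip_size_report` and the sample message are made up for illustration.

```python
# Rough sketch, not a diagnostic tool: report how big a raw SIP message is
# and which header line is the longest.
# Assumption: the message is being sent over UDP, where RFC 3261 (18.1.1)
# treats requests over 1300 bytes as too large when the path MTU is unknown.
UDP_SAFE_LIMIT = 1300  # bytes, threshold from RFC 3261


def sip_size_report(raw: bytes) -> dict:
    """Summarise the size of one raw SIP message (start line, headers, body)."""
    # Headers and body are separated by an empty line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    header_lines = lines[1:]  # lines[0] is the request/status line
    longest = max(header_lines, key=len, default=b"")
    name = longest.split(b":", 1)[0].decode("ascii", "replace").strip()
    return {
        "total_bytes": len(raw),
        "body_bytes": len(body),
        "longest_header": name,
        "longest_header_bytes": len(longest),
        "over_udp_limit": len(raw) > UDP_SAFE_LIMIT,
    }


if __name__ == "__main__":
    # Hypothetical sample message; addresses are documentation placeholders.
    sample = (
        b"INVITE sip:100@pbx.example.com SIP/2.0\r\n"
        b"Via: SIP/2.0/UDP 192.0.2.10:5060\r\n"
        b"Contact: <sip:200@192.0.2.10:5060>\r\n"
        b"Content-Length: 0\r\n\r\n"
    )
    print(sip_size_report(sample))
```

If `over_udp_limit` comes back true for the phone's INVITEs, switching the phone's transport to TCP would be one thing to try. That is only a suggestion, not something I got as far as testing.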
My next port of call was to deploy a Pi-based SBC to the site, to see if that made the problem any better. 3CX at the time didn't manage SBCs well, which is what had held me off: the SBCs themselves worked well, but there was no visibility from the PBX into the SBC status, and no remote manageability. (This has since got a lot better.) I had the SBC built and couriered to the site - the morning it arrived was the morning that I received the phone call about losing the client. Again - I completely understand where they were coming from.
NB - in my case the SBC would have only affected the single Yealink IP phone. The softphone apps all create their own tunnels back to the PBX. Given that the softphones were also giving me a lot of grief, deploying an SBC would have only, at best, helped shed some light on the problem rather than resolve it completely.
My original thread in the 3CX forums is here:
https://www.3cx.com/community/threads/yealink-t58-407-proxy-authentication-required.58528/
I've got a feeling that wasn't the end of it though - it wasn't as simple as "the Meraki was dropping inbound packets on the secondary when the primary came back up", otherwise I would have simply replaced the Meraki. Again, from memory (which really isn't good), I think packet captures on the inside of the network ruled that explanation out.
I'd be very interested to know if either of you gents were able to resolve this, or even if you managed to make any headway in troubleshooting it. I spent months and months on it, and came out even more confused than when I went in. I'd consider myself pretty adept at networking and IT in general, but I came up absolutely empty on this one.
Sorry!
Cheers,
Matt