Thanks for that insight. Here are more details for those reading this thread. I spent a couple of hours on a call with a knowledgeable Meraki tech, but the only resolution so far was to roll back the main (data center) MX68 and MX100.
The edge routers are a mix of MX64, Z1, and Z3 devices (all using Non-Meraki VPN).
Three weeks ago we updated all the edge (branch) routers to whatever the latest recommended firmware was. Some went to 15.42.1; others were older Z1s and went to 14.53.
The day after the update we get calls from branches that they're down. We look at the dashboard and the VPN shows as up, but we cannot pass traffic over it. Reboot the branch router (the one with the new firmware) and the VPN reconnects and data passes.
It works all day, then another site calls with the same issue, and the same fix works.
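For anyone who wants to catch this before the branch calls in, here is a minimal sketch of a check, assuming each branch has some host behind the tunnel that answers TCP on a known port. The branch names, addresses, and port below are made up for illustration, and this only tests reachability from wherever you run it. It does not talk to the Meraki dashboard at all.

```python
import socket

# Hypothetical branch names and inside-tunnel targets; replace with real ones.
BRANCHES = {
    "branch-01": ("10.1.1.1", 443),
    "branch-02": ("10.1.2.1", 443),
}


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


def find_stalled(branches, probe=tcp_probe):
    """Return the sorted names of branches whose probe target is unreachable."""
    return sorted(
        name for name, (host, port) in branches.items() if not probe(host, port)
    )


if __name__ == "__main__":
    for name in find_stalled(BRANCHES):
        print(f"{name}: no traffic across the tunnel, reboot candidate")
```

Run from the data center side on a schedule, something like this would flag the branches where the dashboard says the VPN is up but nothing actually passes.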
We assume it is because the data center routers (MX68 and MX100) are not updated, but we don't have a maintenance window to update them for two more weeks. We wait, and just reboot the problematic branches early in the morning to cut down on the calls. The problem persists, so we roll a few back to 14.53 and they become stable for the next two weeks.
Maintenance day comes and we update the data center routers along with all the branch routers we had rolled back.
By then it had been just over two weeks since the rollback. Some branch routers had not been generating calls as frequently, so they had been left untouched.
After maintenance was completed, the techs started testing connectivity to the branches. All were down and would not come online. We restarted all the branch routers, and one Z3 and one MX64 connected, but none of the other 20 locations would complete Phase 2. We rolled back an MX64 and still nothing.
Called Meraki Support and spoke to a tech who did a lot of packet captures and confirmed that it was in fact Phase 2 not negotiating. Tried RemoteID, still no connection. Tried rolling back one branch, no connection.
Rolled back the data center MX100 and all reconnected!
I would love to know more about the RemoteID and LocalID setup, if anyone can expand on that and whether it resolved a similar issue in your environment.
Meraki support requested another call, but it requires all the branches to be offline during testing, so it may be a few days before the client (who likes to work 24x7) will let me take it down at a reasonable hour so I don't have to be up from midnight to 4 AM troubleshooting 😞