@Bruce, large retail chain, with all the legacy debt of a 60-some-year-old company, etc. etc. This solution was purchased before the Cisco Borging, and I think it has mostly grown unwieldy since. We have a lot of active Meraki team involvement, but it seems no one there has enough gumption to tell us we're doing it wrong, so these things are now biting us quite regularly. I'm now the man in black bringing bad news to the table.
I appreciate you and @spadefist giving input on this, as again this is a bit of my first rodeo with Meraki. It's news to me that it can prepend at all, even simplistically, as I've not found anything that says it does any metrics in bgp, automagically or otherwise. This obviously means ebgp relationships, which I'm all for, but the environment is still a mix of legacy eigrp, a kludge of ospf on top for Meraki (as that's all they did originally), and bgp toward the mpls wan providers, and it's entirely discontiguous. I can't blame you for that, but I can blame the lack of metric control options, even if you hide them from the lay folk using these things.
Case in point, I've argued that if we're not doing regional hubs and not using concentrator mode, we should split the ha and terminate to each store MX separately, with dual tunnels that fail over separately to each hub, assuming we could make routing failover faster. Ideally we'd route between sites, but again, today we run ospf just for Meraki, with a squishy eigrp middle. Normally I'd use ospf external type 1, metric one site significantly higher, and run ospf between the sites so that one path or the other is preferred as standard, but that needs to be done as a matter of policy, ideally from the Meraki itself, or it gets messy in translation (got tags?). Complicate that further with eigrp in the middle of everything and it comes apart, but the rest of the engineers feel ospf is the better long-term option, so it may be feasible to overlay and migrate.
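To put numbers on the external type 1 idea, here's a rough sketch in Python. It only models the one standard bit of ospf behavior I'm relying on: an E1 route's cost is the external metric set at redistribution plus the internal cost to reach the advertising router, and the lowest total wins. The hub names and all the metric values are made up for illustration, and this says nothing about what the Meraki itself can or can't advertise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExternalRoute:
    hub: str            # hypothetical hub name, for illustration only
    ext_metric: int     # metric assigned when the route is redistributed
    internal_cost: int  # ospf cost from this router to the advertising hub

    @property
    def e1_cost(self) -> int:
        # External type 1: external metric and internal cost are added.
        return self.ext_metric + self.internal_cost


def best_e1(routes: list[ExternalRoute]) -> Optional[ExternalRoute]:
    """Return the lowest-cost E1 route, or None if nothing is advertised."""
    return min(routes, key=lambda r: r.e1_cost, default=None)


# Assumed values: hub-b is deliberately metriced far higher than hub-a.
routes = [
    ExternalRoute("hub-a", ext_metric=20, internal_cost=10),
    ExternalRoute("hub-b", ext_metric=1000, internal_cost=5),
]

print(best_e1(routes).hub)                                      # hub-a
print(best_e1([r for r in routes if r.hub != "hub-a"]).hub)     # hub-b
```

The second print is the failover case: once hub-a's advertisement is withdrawn, hub-b takes over no matter how high its metric is. That's all I'm after, just driven by policy rather than by accident.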
I've pushed for ebgp entirely, as I've built whole service providers and clos fabrics around ebgp with vxlan overlay/underlay and I'm not afraid of it, but everyone else is, so it's a bit of a non-starter. They might consider ibgp+ospf, but then we'd be talking local-pref injection to control traffic between sites, which it doesn't do. I've been through igp+bgp vs. native ebgp-everywhere and prefer the latter, but it's still a foreign concept to most. It would also be nice if these supported things like BFD for bgp fast failover, automagical or manual, for downstream peering.
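For anyone following along who doesn't live in bgp, here's a minimal Python sketch of why prepending and local-pref are the two knobs I keep asking about. It models only the first two steps of the usual best-path comparison (highest local-pref, then shortest as-path) and ignores everything after that. The ASNs are private-range placeholders and the hub names are invented; none of this reflects how the Meraki implements anything.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BgpPath:
    next_hop: str             # hypothetical hub name
    local_pref: int           # higher is preferred
    as_path: tuple[int, ...]  # shorter is preferred when local-pref ties


def prepend(path: BgpPath, asn: int, times: int) -> BgpPath:
    """Return a copy of the path with asn prepended `times` times."""
    return replace(path, as_path=(asn,) * times + path.as_path)


def best_path(paths: list[BgpPath]) -> BgpPath:
    # Simplified: highest local-pref first, then shortest as-path.
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))


hub_a = BgpPath("hub-a", local_pref=100, as_path=(65001,))
hub_b = prepend(BgpPath("hub-b", local_pref=100, as_path=(65002,)), 65002, 3)

# Equal local-pref: the prepended (longer) path loses.
print(best_path([hub_a, hub_b]).next_hop)          # hub-a

# Raising local-pref on hub-b overrides the prepending entirely.
hub_b_preferred = replace(hub_b, local_pref=200)
print(best_path([hub_a, hub_b_preferred]).next_hop)  # hub-b
```

Prepending is something the advertising side does to look worse; local-pref is something the receiving side sets to decide for itself. With ibgp+ospf I'd want the second one, which is the part I'm told isn't there.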
I just don't get much control in the routing protocol over how the Merakis can signal a flip in preferred paths between sites to downstream routers, compared with DMVPN, Viptela, or even the Fortinet products I've worked with over the years. It's annoying to me, thus the Fisher-Price comment.
This is not typically my domain and I try to stay out of it, but after enough rude overnight wakeup calls of my own about these failing in ugly ways, they need to do something.
Oddly, the biggest issue we've seen has been a few firewall failovers that leave hung connections for the Meraki tunnels passing through our firewalls. There seems to be no proper DPD-style checking on the tunnels to restart sessions, nor any BFD-style tunnel health checking that would at least restart the tunnel. This is another huge annoyance: with DPD and a hung connection, a new inbound establishment should at least correct this, but instead we still end up with ~1200 retail stores stuck until we forcibly flush connections on the firewall, which under today's load causes the Merakis to trip out, break, go dual-active, etc. Yada yada, round and round.
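What I mean by DPD-style checking is nothing fancy. Here's a generic Python sketch of the idea: keep track of when the peer last answered a probe, and once enough consecutive probe intervals pass in silence, declare the peer dead so the session can be torn down and rebuilt. The class name, the 10-second interval, and the three-miss threshold are all my own assumptions for illustration, not anything taken from Meraki or from a specific vendor's defaults.

```python
from typing import Iterable, Optional


class PeerMonitor:
    """Toy dead-peer detector: dead after max_missed silent intervals."""

    def __init__(self, interval: float = 10.0, max_missed: int = 3) -> None:
        self.interval = interval      # seconds between probes (assumed)
        self.max_missed = max_missed  # silent intervals tolerated (assumed)
        self.last_heard = 0.0

    def heard_from_peer(self, now: float) -> None:
        self.last_heard = now

    def is_dead(self, now: float) -> bool:
        return now - self.last_heard >= self.interval * self.max_missed


def first_dead_time(
    monitor: PeerMonitor, reply_times: Iterable[float], ticks: Iterable[float]
) -> Optional[float]:
    """Walk the probe ticks; return when the peer is first declared dead."""
    replies = set(reply_times)
    for now in ticks:
        if now in replies:
            monitor.heard_from_peer(now)
        elif monitor.is_dead(now):
            return now  # this is where a real device would restart the tunnel
    return None


# Peer answers at t=0, 10, 20, then a firewall failover silently eats replies.
print(first_dead_time(PeerMonitor(), [0, 10, 20], range(0, 100, 10)))  # 50
```

With those assumed timers the hung session is detected 30 seconds after the last good reply and gets restarted on its own, instead of sitting there until someone flushes the firewall by hand.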
It seems some minor features could alleviate a lot of this, but I'm just your average network monkey here. It would be nice to see better engagement from the Meraki folks, but again this is not normally my domain, and no one (except me, as a consulting architect) seems to want to give the customer some hard truths.