Spoke/Hub architecture with multiple Subnets/VLANs at each spoke. Firewall/Group Policy questions
Wondering if I'm overcomplicating things here, or if this is a square peg/round hole situation.
We're in the process of migrating all of our satellite locations (roughly 150 of them) from ISR4321/Cat2960 hardware to MX67+MS120 (in most cases). I'm struggling to recreate the functionality of our SVI ACLs in the Meraki ecosystem.
In a nutshell, at any given satellite site the MX will have 2 WAN connections in an SD-WAN configuration. Each MX will have a full-tunnel AutoVPN tunnel established with an MX250 in concentrator mode at our Primary datacenter, and another tunnel established to our Secondary datacenter in an active-active configuration. Each satellite site has 10 separate VLANs. The MX owns these.
For the sake of this post, an example would look something like this:
With the ISR4321/Cat2960 hardware, the VLANs belonged to the Cat2960. To control traffic in/out of the VLANs, we applied an inbound/outbound ACL to the SVI of said VLAN.
In short, the goal is to control what traffic is allowed between the VLANs at the satellite, as well as what traffic is allowed between the VLANs at the satellite and the larger corporate network.
Is the best way to recreate this in Meraki to use group policies and assign them to the VLANs at the MX? Can the same thing be accomplished with L3 firewall rules? I'd much prefer to do this in a stateful manner and get away from stateless rulebases, as well as take advantage of network objects/object groups.
This appears to be the correct answer. Meraki's network objects/object groups are the saving grace.
Previously, technicians had attempted to use the Layer 3 firewall rules but got hung up on the fact that those rules don't affect S2S VPN traffic, and thought they would have to create all the rules twice. After some thinking, reading, and testing, we ended up with the following (high-level) approach:
Enable Policy Objects (Beta). There's a gotcha here: if you have too many objects in a group (or too many objects in general, I'm not certain), they won't all display in the Dashboard UI. The full list can be retrieved via the API, though, so it's no big deal.
Create an object for each subnet, done via API scripting
As a side note, it would be nice if each object didn't need a 'name'. If no name is provided, the subnet address should be assumed as the name rather than requiring us to specify one.
Create an object-group for each 'segment' and fill with individual objects belonging to that segment. For example the ~150 enterprise workstation subnets from my first post go into the EnterpriseWorkstations object group. Add objects to object-groups in order to control deployment
Build S2S VPN firewall rules using the object-groups as the sources/destinations. This vastly reduces the number of individual rules, making them easier to visualize/review. Create 'default deny' rules for each object group where applicable (for the Phones or Printers object groups, as an example), factoring in the implicit allow at the bottom of the S2S VPN firewall ruleset.
On the LAN side, the ruleset is standard across the board. A PowerShell script polls each 'network' for local LANs and splats in the ruleset for each local LAN. For example, don't let the Phones subnet talk to the Printers subnet, etc.
Nix the Layer 3 SVI Group Policies at each site as deployment rolls forward
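For anyone wanting to picture the object and object-group steps above, here is a minimal Python sketch of the data preparation only. It makes no API calls. The site names, the `build_objects` helper, and the payload field names (`name`, `category`, `type`, `cidr`) are my assumptions, loosely modeled on how I understand the Dashboard API policy-object payload to look, so check them against the current API docs before relying on them.

```python
import ipaddress
from collections import defaultdict


def build_objects(site_subnets):
    """Turn {site: {segment: cidr}} into object payloads and per-segment groups.

    Returns (objects, groups), where groups maps each segment name to the
    list of object names that belong in that segment's object-group.
    """
    objects = []
    groups = defaultdict(list)
    for site, segments in sorted(site_subnets.items()):
        for segment, cidr in sorted(segments.items()):
            # strict=True raises ValueError if host bits are set (e.g. 10.201.10.5/24)
            network = ipaddress.ip_network(cidr, strict=True)
            name = f"{site}-{segment}"  # hypothetical naming convention
            objects.append({
                "name": name,
                "category": "network",  # assumed field values, verify against the API
                "type": "cidr",
                "cidr": str(network),
            })
            groups[segment].append(name)
    return objects, dict(groups)


# Hypothetical inventory; real data would come from your IPAM or a spreadsheet.
sites = {
    "site201": {"EnterpriseWorkstations": "10.201.10.0/24", "Phones": "10.201.20.0/24"},
    "site202": {"EnterpriseWorkstations": "10.202.10.0/24", "Phones": "10.202.20.0/24"},
}
objects, groups = build_objects(sites)
```

Validating every subnet before anything is sent keeps a typo in one site's row from ending up in an org-wide group, and holding a site's objects out of the groups until it is cut over is one way to control deployment.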
This seems to be working just fine at the five locations I've migrated to the deployment described above. Unsure what performance constraints I may see (particularly at the VPN Concentrators) as this moves forward. Will follow-up if I stumble across anything.
@Crocker, as @rymiles stated, you can use the Layer 3 firewall rules on the MX. The Layer 3 outbound firewall rules apply to inter-VLAN traffic too (read Outbound as leaving a VLAN, unless it's going to a VPN tunnel, in which case the Site-to-site outbound firewall, on the Site-to-site VPN page, applies). These all support the firewall objects (although note that the Site-to-site outbound firewall is an organisation-wide configuration, not per network).
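To make the inter-VLAN part concrete, here is a rough Python sketch of generating a standard set of deny rules from one site's local VLANs. The `lan_deny_rules` helper, the segment names, and the rule field names (`comment`, `policy`, `protocol`, `srcCidr`, `destCidr`, and so on) are assumptions modeled on my understanding of the L3 firewall rule format, not something confirmed in this thread. Because a stateful deny in one direction does not stop sessions initiated from the other side, the sketch expects each blocked pair to be listed in both directions if that is what you want.

```python
def lan_deny_rules(local_vlans, blocked_pairs):
    """Build deny rules for one site.

    local_vlans:   {segment_name: cidr} for the VLANs that exist at this site.
    blocked_pairs: iterable of (source_segment, destination_segment) tuples.
    Pairs that mention a segment the site does not have are skipped.
    """
    rules = []
    for src, dst in blocked_pairs:
        if src not in local_vlans or dst not in local_vlans:
            continue  # this site has no such VLAN, nothing to block
        rules.append({
            "comment": f"Deny {src} to {dst}",
            "policy": "deny",
            "protocol": "any",
            "srcCidr": local_vlans[src],
            "srcPort": "Any",
            "destCidr": local_vlans[dst],
            "destPort": "Any",
        })
    return rules


# Hypothetical site with three local VLANs and a standard block list.
site_vlans = {
    "Phones": "10.201.20.0/24",
    "Printers": "10.201.30.0/24",
    "Guest": "10.201.40.0/24",
}
standard_blocks = [("Phones", "Printers"), ("Printers", "Phones"), ("Guest", "Vendor")]
site_rules = lan_deny_rules(site_vlans, standard_blocks)
```

The ("Guest", "Vendor") pair is skipped for this site because it has no Vendor VLAN, which is how one block list can be reused across sites that do not all carry every segment.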
I'd also look at scripting your deployment (e.g. with Python) using the APIs, if you haven't already. With that number of sites, a little bit of up-front effort will likely go a long way in easing your configuration and eliminating much of the chance of human error.
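As a rough illustration of what that scripting could look like, the sketch below only builds the HTTP request for updating one network's L3 firewall rules and does not send it. The base URL, the `/networks/{networkId}/appliance/firewall/l3FirewallRules` path, and the bearer-token header reflect my understanding of Dashboard API v1 and should be treated as assumptions to verify. `build_l3_rules_request` and the network ID are made up for the example.

```python
import json
import urllib.request

BASE_URL = "https://api.meraki.com/api/v1"  # assumed Dashboard API v1 base URL


def build_l3_rules_request(network_id, rules, api_key):
    """Return an unsent PUT request that would replace a network's L3 rules."""
    body = json.dumps({"rules": rules}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/networks/{network_id}/appliance/firewall/l3FirewallRules",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical network ID and a placeholder key; nothing is sent here.
example_rules = [{
    "comment": "Deny Phones to Printers",
    "policy": "deny",
    "protocol": "any",
    "srcCidr": "10.201.20.0/24",
    "srcPort": "Any",
    "destCidr": "10.201.30.0/24",
    "destPort": "Any",
}]
request = build_l3_rules_request("N_1234", example_rules, "REPLACE_ME")
```

Keeping request construction separate from sending makes it easy to dump and review what would be pushed to all ~150 networks before calling `urllib.request.urlopen(request)` for real.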
One other note: if you plan to use templates for the branch/spoke sites, the firewall rules can use objects that reference the underlying site-specific subnets. Per your example, the template firewall rules could use objects for workstations, phones, printers, guest, and vendor. Dashboard works out that those objects actually refer to site-unique subnets underneath, like 10.201.x.x, 10.202.x.x, etc.
Templates aren't right for everyone, as they only allow a certain amount of variability. When you have sites that are very similar, templates can be great. You'll need to evaluate whether they work for your deployment.