First time posting here and I am hoping to get some more insight on the strange issues we've been seeing. I've looked through some of the posts and can relate to some of the STP weirdness you all are having.
I'll try to provide as much detail and info up front.
At this remote site (we have others doing the same exact thing) we run an EMR application. When clients are on their default VLAN77 they are having application hangs/crashes and slowness. If I manually swap a port to use VLAN1 instead of VLAN77, the issues go away and the application runs normally.
The application is installed on the PCs but is hosted elsewhere, so its traffic does route out to the Internet.
Appreciate any help, and if any other info is needed just let me know.
|Main Site|Connection Info|Remote Site|
|---|---|---|
|Switch: Cisco 2960XR|100Mb ELAN - Dark Fiber|Switch: Meraki MS250-48LP|
| |Transit/Native VLAN: 30| |
| |Allowed VLANs: All| |
|Port: Gi 1/0/46 - Trunk| |Port: 48 - Trunk|
|Core Switch: Cisco 3750X| |VLAN77 - 10.150.77.0/24|
|VLAN1 - 10.150.70.0/23| |VLAN120 - 10.150.120.0/23 (VoIP)|
|VLAN77 - 10.150.77.0/24| | |
|VLAN120 - 10.150.120.0/23| |Default Route: 0.0.0.0 - 10.150.77.1|
| | |4 - Meraki MR33 APs|
| | |Cisco VoIP Phones & PCs|
The remote site is a Meraki setup, which comes back to the main campus via dark fiber to our Cisco stack which then will flow through our ASA out to the Internet.
We have other remote sites set up the exact same way, but with different VLANs:
VLAN 75 - 10.150.75.0/24 - Meraki MS250
VLAN 79 - 10.150.79.0/24 - Meraki MS250
All I have to do is manually set someone's port to VLAN1 instead of one of the VLANs created for the remote side, and everything works without issue.
Are you stretching VLAN77 across the sites? You seem to have the same IP range at both the main site and the remote site.
What IP address do the clients get when you change them to VLAN1 and where do they get it from?
What is the latency between the main site and remote site?
You seem to be using dark fibre but are connected to a copper port (48)?
Are you trying to do this all at L2? Personally, I would turn on L3 on the MS250 and set the default route to whatever IP is on the other side of the transport VLAN 30. And don't put 77 and 120 on both sides of the link.
So at the remote site, if I set someone to VLAN1 on the port they will get an IP from the 10.150.70.0/23 subnet. Normally their port would be set to VLAN77 and get a 10.150.77.0/24 IP.
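For anyone following along, here is a small sketch of how the subnets quoted in this thread map addresses to VLANs, using Python's standard `ipaddress` module. The `VLANS` dictionary and both function names are made up for illustration; only the subnets themselves come from the posts above.

```python
import ipaddress
from itertools import combinations

# Subnets as listed in the thread; the dict layout is illustrative only.
VLANS = {
    "VLAN1": ipaddress.ip_network("10.150.70.0/23"),
    "VLAN77": ipaddress.ip_network("10.150.77.0/24"),
    "VLAN120": ipaddress.ip_network("10.150.120.0/23"),
}


def vlan_for(ip):
    """Return the name of the VLAN whose subnet contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, net in VLANS.items():
        if addr in net:
            return name
    return None


def overlapping_pairs(vlans):
    """Return pairs of VLAN names whose subnets overlap."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(vlans.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    print(vlan_for("10.150.77.30"))   # a remote client on its normal port
    print(vlan_for("10.150.71.200"))  # upper half of the /23 used by VLAN1
    print(overlapping_pairs(VLANS))
```

The /23 on VLAN1 spans 10.150.70.0 through 10.150.71.255, so it does not collide with 10.150.77.0/24; the address ranges themselves are distinct.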
All VLANs are set up on, and stem off, our core switch at the main location.
Between the sites it is Spectrum fiber, but yes it is an RJ45 ethernet handoff from their equipment to our switches.
This was a new setup for me, so I asked Meraki how it should be set up and just assumed it was right 🙂
L3 is on, here's a screenshot of the routes:
Remote location: Switch management IP: 10.150.30.41 - Native VLAN30
Main location: VLAN30 - VLAN Int 10.150.30.254
VLAN77 is basically the same setup.
Remote location: 10.150.77.1 - Switch IP/GW for clients
Main location: VLAN77 - VLAN Int 10.150.77.254 - GW and default route of the remote Switch.
Hopefully I'm explaining this correctly.
Okay, it looks like you have a VLAN 77 with subnet and interface on the ms250, but you don't have an interface for VLAN 1.
On the main site what are the IP settings for the VLAN interfaces and what does the routing table look like? You seem to have the same subnet there and maybe that is what stops the 10.150.77.0/24 subnet routing properly
Core Switch IP - 10.150.70.1 (which is also the default gateway for 70.x clients)
ip address 10.150.70.1 255.255.254.0
description *** South Campus VLAN ***
ip address 10.150.77.254 255.255.255.0
default route routes to our ASA firewall:
ip route 0.0.0.0 0.0.0.0 10.150.70.2
Correct, there is not a VLAN1 IP or Interface on the Meraki.
Are there devices in VLAN 77 at the main site? If so, and you don't want to make too many changes, I'd simply delete the VLAN 77 interface on the MS250.
If not and you want to properly use routing then you need unique subnets on each site and a single shared subnet between them that is used for the interconnect over the dark fibre.
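As a rough illustration of the "unique subnets per site plus one shared interconnect" rule, the sketch below flags any subnet that has L3 interfaces at more than one site, other than the transit subnet. The interface addresses are the ones quoted in this thread; the /24 mask on VLAN 30, the site labels, and the function name are assumptions made for the example.

```python
import ipaddress
from collections import defaultdict

# (site, interface address) pairs taken from the thread.
# The /24 on VLAN 30 is an assumption for this sketch.
INTERFACES = [
    ("main", "10.150.70.1/23"),     # core switch, VLAN 1
    ("main", "10.150.30.254/24"),   # core switch, VLAN 30
    ("main", "10.150.77.254/24"),   # core switch, VLAN 77
    ("remote", "10.150.30.41/24"),  # MS250 management, VLAN 30
    ("remote", "10.150.77.1/24"),   # MS250, VLAN 77
]

# The one subnet that is meant to exist on both sides of the link.
TRANSIT = ipaddress.ip_network("10.150.30.0/24")


def subnets_on_multiple_sites(interfaces, transit):
    """Return non-transit subnets that have L3 interfaces at more than one site."""
    sites_by_net = defaultdict(set)
    for site, cidr in interfaces:
        net = ipaddress.ip_interface(cidr).network
        sites_by_net[net].add(site)
    return sorted(
        str(net)
        for net, sites in sites_by_net.items()
        if len(sites) > 1 and net != transit
    )


if __name__ == "__main__":
    print(subnets_on_multiple_sites(INTERFACES, TRANSIT))
```

With the addresses above, the only subnet flagged is 10.150.77.0/24, which is the one with a routed interface on both the core and the MS250.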
If your latency site to site is 1ms or so then I'd go with the first option for now to keep it simple.
Latency is 1ms between sites. And no, only clients in the .77.x range are at the remote location. That was the whole point in me separating out remote sites mainly to have them on their own subnets.
I thought that was how I had it set up, with the unique VLAN 77 at the remote site and VLAN 1 at the main site, which is the main IP range 70.x/23.
And the vlan 30 is the one we have for switch management in between.
Is this not the ideal setup?
The only difference when I change a port from vlan 77 to 1 at that site, is the trace route will show the switch IP for the GW there, and then hit our normal GW.
So it would go: 10.150.77.30 -> 10.150.77.1 -> 10.150.70.1 -> then out the FW to the Internet.
If I change someone to Vlan 1 it goes: 10.150.70.x -> 10.150.70.1 -> out FW to the Internet.
I'm confused now. In the first post you have VLAN77 at the main site as well, and later you have an IP in the 10.150.77.0/24 subnet at the main site. If these do exist at the main site, get rid of them, as that would confuse the routing. Is it possible to draw a diagram with the main site, link and remote site, listing each subnet and which one(s) go over the link?
OK, so you have L3 interfaces in the 10.150.77.0/24 VLAN on two devices: the core switch at your main location and your Meraki at the remote location?
Meraki L3 interface at branch is 10.150.77.1
Cisco interface (Assuming L3) at main is 10.150.77.254
I personally would not like to see broadcast domains, i.e., same subnet, go out over WAN links. You could have a packet storm going over your WAN link when the poor performance happens. Normally a router would not pass that traffic on but if the VLAN goes over the link the router is going to pass all the broadcast traffic unless you explicitly tell it not to. I see that you have the Meraki set not to route multicast but is the Cisco on the other side also set the same?
If you're going to keep this setup having the VLAN go over the WAN link, I don't see the need to have L3 set up on the Meraki. You could do the opposite of what I said before, make it L2 all the way back to your core Cisco at the main location.
Also I see the VLANS are configured to relay DHCP to something back at the main site. What information is DHCP handing out for this VLAN? Is the clients' default gateway .254 or .1? If you're going to continue to run L3 then the local Meraki (.1) should probably be the default gateway for clients. The Meraki switch has a static route back to the core, so you're good there.
It's hard to put a handle on exactly what's going on here. But it seems non-optimal. A packet capture while you're having the problem would be telling.
Thanks for the replies. Here's a quick diagram.
VLAN 30 is a switch management vlan which in this case I also used as the transit VLAN between sites and MGMT Interface on the Meraki Switch.
VLAN 77 was created on the Main Core Switch which has the 10.150.77.254 IP on the Interface.
VLAN 77 is the subnet I want the remote site to have and use for clients.
VLAN 120 is our VoIP vlan across all of our sites which is 10.150.120.0/23
If I left anything out or you need other info just let me know.
As far as DHCP, yes we have our DHCP server located back at the main campus on VLAN1 10.150.70.x.
Clients are getting the switch IP 10.150.77.1 as their default gateway at the remote site. And on the switch, the default route is set to 10.150.77.254 which is the Core Switch IP.
@ShawnHolcombe you should not have a client's gateway set to a different address than that subnet's next hop.
Let's say a client needs to access 18.104.22.168 and is on 10.150.77.100
First it goes to 10.150.77.1
That sends it to 10.150.77.254
It goes out to the ASA and the return packet gets to 10.150.77.254
10.150.77.254 can see the Mac address for 10.150.77.100 so sends it back directly, bypassing 10.150.77.1
The client now updates its routing table to say that 18.104.22.168 is available through 10.150.77.254.
As you can see, every single IP address the client talks to outside of the 10.150.77.0/24 subnet will have its own individual entry in the client's routing table. When you disconnect/reboot, the process starts all over again.
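To make the asymmetry concrete, here is a toy model of the hops inside VLAN 77 for the outbound packet and for the reply. It is only a sketch of the path described above, not a packet simulator; the function names are invented for the example and the addresses are the ones from this thread.

```python
import ipaddress

VLAN77 = ipaddress.ip_network("10.150.77.0/24")


def forward_hops(client, gateway, gateway_next_hop=None):
    """Hops an outbound packet crosses inside VLAN 77.

    gateway_next_hop is set only when the client's gateway is not the
    device that routes out of the VLAN and has to hand the packet on.
    """
    hops = [client, gateway]
    if gateway_next_hop is not None:
        hops.append(gateway_next_hop)
    return hops


def return_hops(client, core):
    """Hops for the reply: the core is on-link with the client, so it delivers directly."""
    assert ipaddress.ip_address(client) in VLAN77
    assert ipaddress.ip_address(core) in VLAN77
    return [core, client]


def is_symmetric(fwd, ret):
    """True when the reply retraces the outbound hops in reverse."""
    return fwd == list(reversed(ret))


if __name__ == "__main__":
    client = "10.150.77.100"
    # Original layout: client gateway is the Meraki (.1), which forwards to the core (.254).
    fwd = forward_hops(client, "10.150.77.1", "10.150.77.254")
    ret = return_hops(client, "10.150.77.254")
    print(fwd, ret, is_symmetric(fwd, ret))
```

In the original layout the outbound path has an extra hop through 10.150.77.1 that the reply never takes. If the client's gateway is the same device that routes out of the VLAN, the two paths mirror each other.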
Hey there, I see what you are saying I think. But what changes should I need to make on the Meraki switch as far as VLANS or routing is concerned?
I assume the switch having its management IP of 10.150.30.41 is fine. But does it not need the 10.150.77.1 interface in routing? Are you saying clients should use 10.150.77.254 as their GW, not the .1?
Not sure if this is what you meant or how to fix the issue, but in order to leave DHCP alone I just swapped the IPs of the Meraki and Core switch for VLAN 77.
Now clients will still get the 10.150.77.1 gateway, but that is now directly the core switch IP, bypassing the Meraki .254.
@ShawnHolcombe that is a good non-disruptive way to test the 'Layer 2 VLAN stretched over the WAN link' option. If it solves the problem then you could leave it that way. If you are still having issues, then you'll need to remove VLAN 77 from the core, return the .1 IP to the MS250 and make the MS250's default route the core switch's IP in the 10.150.30.0/24 subnet.
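For the fallback (fully routed) option, the routing tables would look roughly like the sketch below, with a simple longest-prefix-match lookup to show where traffic goes. Assumptions for the example: the core's VLAN 30 address is 10.150.30.254 as posted earlier, the core gets a static route for 10.150.77.0/24 via the MS250's 10.150.30.41 (not stated in the thread, but needed for return traffic), and 8.8.8.8 stands in for any Internet destination. The table layout and function name are hypothetical.

```python
import ipaddress

# Hypothetical route tables for the routed design; "connected" marks on-link subnets.
MS250_ROUTES = [
    ("10.150.77.0/24", "connected"),
    ("10.150.30.0/24", "connected"),
    ("0.0.0.0/0", "10.150.30.254"),  # default route to the core over the transit VLAN
]

CORE_ROUTES = [
    ("10.150.70.0/23", "connected"),
    ("10.150.30.0/24", "connected"),
    ("10.150.77.0/24", "10.150.30.41"),  # assumed static route back to the remote site
    ("0.0.0.0/0", "10.150.70.2"),        # default route to the ASA, as posted
]


def next_hop(routes, dest):
    """Return the next hop for dest using longest-prefix match, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


if __name__ == "__main__":
    print(next_hop(MS250_ROUTES, "8.8.8.8"))       # outbound from the remote site
    print(next_hop(CORE_ROUTES, "10.150.77.100"))  # reply heading back to a client
```

In this layout both directions cross the link through the 10.150.30.0/24 transit subnet, so the reply retraces the outbound path.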
Well half a day has gone by and so far the results are looking good! I appreciate all the suggestions and help. If anything changes I'll be sure to update the post.