SLOW INTERNET ON CORE SWITCH MS425

thierryncho1986
Here to help

Hello,

We have slow internet connections on all VLANs created on the MS425 core switches.

I need help.

alemabrahao
Kind of a big deal

What tests have you already performed? Have you performed any recent upgrades? Do you have a firewall after the switch?

If you haven't yet, I suggest you open a support case.

I am not a Cisco Meraki employee. My suggestions are based on documentation of Meraki best practices and day-to-day experience.

Please, if this post was useful, leave your kudos and mark it as solved.
thierryncho1986
Here to help

The problem has been going on for seven months.
We opened a ticket with Meraki support.
Packet captures were taken, and support noted the slowness on the VLANs from the core switches.
The following solution was proposed:

Our analysis of the captured traffic reveals several network-related issues that are likely contributing to the performance problems. Specifically, we observed the following:

TCP Retransmissions: The analysis of the traffic to both destination IPs shows significant TCP retransmissions. Retransmissions occur when a sender doesn't receive an acknowledgment for sent data, indicating packet loss or delays. This is a primary cause of slowdowns, as data needs to be re-sent, consuming more bandwidth and increasing transfer time.

Increased Latency/Time Deltas: We noted noticeable time delays between packets. This increased latency slows down communication, as devices must wait longer for responses, impacting the overall transfer rate.

TCP Window Size Fluctuations: The TCP window sizes vary, which can be a sign of congestion control mechanisms being triggered. While window size changes are normal, erratic or consistently small windows can further contribute to performance bottlenecks.

These observations strongly suggest that network-level issues are impeding the Nutanix update process. To resolve this issue, we recommend the following troubleshooting steps:

Investigate Network Congestion:

Uplink Utilization: Monitor the bandwidth usage of the switches' uplink(s) to identify any potential bottlenecks.

Network Path Analysis: Use traceroute or similar tools to map the network path between the Nutanix servers/clients and the update servers to pinpoint potential congestion points.

Check Physical Layer: Ensure all network cables are properly connected and in good working order.

Quality of Service (QoS): Review your QoS configuration on other network devices to ensure that Nutanix update traffic is being prioritized appropriately.

Isolate Network Segments: If possible, try to isolate different network segments to determine if the issue is localized to a specific area.

Switch Restart (Troubleshooting Step): As a troubleshooting step, consider performing a controlled restart of the core switches (MS425) during a maintenance window. This can sometimes resolve transient software or hardware issues that might be contributing to the problem.
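
The "Uplink Utilization" step boils down to comparing measured throughput against link capacity. A minimal sketch, assuming you can read two samples of an interface byte counter (for example via SNMP or the dashboard); the function name and the sample figures are hypothetical, and counter wrap-around is not handled:

```python
def utilization_percent(bytes_start: int, bytes_end: int,
                        interval_s: float, link_bps: float) -> float:
    """Average link utilization over the sampling interval, in percent."""
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    # Counters are in bytes; link speed is in bits per second.
    bits_sent = (bytes_end - bytes_start) * 8
    return 100.0 * bits_sent / (interval_s * link_bps)


# Hypothetical sample: 7.5 GB transferred in 60 s on a 10 Gbit/s uplink.
print(utilization_percent(0, 7_500_000_000, 60, 10e9))
```

Sustained values close to 100% on an uplink would point to congestion; low values suggest looking elsewhere.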

Mloraditch
Kind of a big deal

Which of these steps have you done and what were the results?

If you found this post helpful, please give it Kudos. If my answer solves your problem please click Accept as Solution so others can benefit from it.
thierryncho1986
Here to help

The problem has been going on for seven months.
We opened a ticket with Meraki support.
Packet captures were taken, and support noted the slowness on the VLANs from the core switches.
The following solution was proposed:

Our analysis of the captured traffic reveals several network-related issues that are likely contributing to the performance problems. Specifically, we observed the following:

TCP Retransmissions: The analysis of the traffic to both destination IPs shows significant TCP retransmissions. Retransmissions occur when a sender doesn't receive an acknowledgment for sent data, indicating packet loss or delays. This is a primary cause of slowdowns, as data needs to be re-sent, consuming more bandwidth and increasing transfer time.

Increased Latency/Time Deltas: We noted noticeable time delays between packets. This increased latency slows down communication, as devices must wait longer for responses, impacting the overall transfer rate.

TCP Window Size Fluctuations: The TCP window sizes vary, which can be a sign of congestion control mechanisms being triggered. While window size changes are normal, erratic or consistently small windows can further contribute to performance bottlenecks.

These observations strongly suggest that network-level issues are impeding the Nutanix update process. To resolve this issue, we recommend the following troubleshooting steps:

Investigate Network Congestion:

Uplink Utilization: Monitor the bandwidth usage of the switches' uplink(s) to identify any potential bottlenecks.

Network Path Analysis: Use traceroute or similar tools to map the network path between the Nutanix servers/clients and the update servers (Akamai) to pinpoint potential congestion points.

Check Physical Layer: Ensure all network cables are properly connected and in good working order.

Quality of Service (QoS): Review your QoS configuration on other network devices to ensure that Nutanix update traffic is being prioritized appropriately.

Isolate Network Segments: If possible, try to isolate different network segments to determine if the issue is localized to a specific area.

Switch Restart (Troubleshooting Step): As a troubleshooting step, consider performing a controlled restart of the core switches (MS425) during a maintenance window. This can sometimes resolve transient software or hardware issues that might be contributing to the problem.

DarrenOC
Kind of a big deal

Hi @thierryncho1986 

 

What device do you have upstream of your core switch?

 

What are your connection speeds like if you hook directly into your ISP router?

 

Is your core configured correctly as your STP Root?

Darren OConnor | doconnor@resalire.co.uk
https://www.linkedin.com/in/darrenoconnor/

I'm not an employee of Cisco/Meraki. My posts are based on Meraki best practice and what has worked for me in the field.
thierryncho1986
Here to help

We have a firewall.

Directly on the ISP link we get 46 Mbps, but when connected through the MS425 we get only 20 to 30 Mbps.

The MS425 is the STP root.

When you open a web page, you have to reload it several times before it loads.
There is also a lot of MAC address flapping.

Brash
Kind of a big deal

What you have described is a relatively complex networking issue which most likely requires a deeper investigation than the back and forth of posts in the community board.

 

What Meraki support have described isn't inaccurate. TCP retransmits can definitely lead to performance degradation. However, it is a symptom of the issue, not the cause.

If you know the issue is specifically when the MS425 is involved, I suggest following the path of a specific test and capture at all points. It should become pretty clear that there is either:

 - Packets missing from one of the captures (e.g. dropped packets)

 - Large delays in packets seen in one of the captures

 

Based on that, you should be able to trace the point of where the issue is. The next step would be to investigate whether it's congestion/oversubscription, faulty hardware, MTU mismatch etc.
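
The comparison described above can be sketched roughly as follows. This is only an illustration: it assumes each capture has already been reduced to a mapping of packet identifier (for example the IP ID or TCP sequence number) to timestamp, and that the capture points have synchronized clocks. The function name, threshold, and sample data are hypothetical.

```python
def compare_captures(upstream, downstream, max_delay_s=0.05):
    """Compare two captures of the same flow taken at different points.

    Each capture maps a packet identifier to the time it was seen,
    in seconds. Returns (missing_ids, delayed_ids).
    """
    # Packets seen upstream that never show up downstream were dropped.
    missing = sorted(pid for pid in upstream if pid not in downstream)
    # Packets that do show up, but later than the allowed delay.
    delayed = sorted(
        pid for pid, t_up in upstream.items()
        if pid in downstream and downstream[pid] - t_up > max_delay_s
    )
    return missing, delayed


# Hypothetical data: packet 3 never arrives, packet 4 is held up.
before_switch = {1: 0.000, 2: 0.001, 3: 0.002, 4: 0.003}
after_switch = {1: 0.0005, 2: 0.0015, 4: 0.2030}
print(compare_captures(before_switch, after_switch))  # ([3], [4])
```

Running this for each adjacent pair of capture points shows at which hop packets first go missing or start arriving late.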

 

Additionally, you've identified a lot of MAC flapping. I suggest investigating which ports the MAC addresses are flapping between, and the cause of that.
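
One way to narrow down the flapping is to group MAC-learning events by address and list the ports each address moved between. A minimal sketch, assuming the events have been exported as (timestamp, MAC, port) tuples sorted by time; that tuple format, the function name, and the sample values are assumptions, not a Meraki export format:

```python
from collections import defaultdict


def find_flapping_macs(events, min_moves=3):
    """Return {mac: [ports in order seen]} for MACs that changed port
    at least `min_moves` times.

    `events` is an iterable of (timestamp, mac, port), sorted by time.
    """
    history = defaultdict(list)
    for _ts, mac, port in events:
        ports = history[mac]
        if not ports or ports[-1] != port:
            ports.append(port)  # record only actual port changes
    return {mac: ports for mac, ports in history.items()
            if len(ports) - 1 >= min_moves}


# Hypothetical events: one MAC bounces between ports 1 and 2.
events = [
    (1, "aa:bb:cc:00:00:01", "1"),
    (2, "aa:bb:cc:00:00:02", "5"),
    (3, "aa:bb:cc:00:00:01", "2"),
    (4, "aa:bb:cc:00:00:01", "1"),
    (5, "aa:bb:cc:00:00:01", "2"),
    (6, "aa:bb:cc:00:00:02", "5"),
]
print(find_flapping_macs(events))
```

A MAC that alternates between the same two ports often points at a loop or a misconfigured link aggregation between those ports.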

RWelch
Kind of a big deal

A couple of quick things to check regarding the MS425-16:


1. MTU size (specifically pertaining to the MS425-16) under Switching > Configure > Switch Settings:

MTUConfiguration.png

 

MS425 Overview and Specifications 


Note:
The maximum MTU on MS425 is 9416 bytes. Meraki Dashboard already includes 22 Bytes for Ethernet Headers and Frame Check Sequence (FCS). Therefore, use MTU value 9394 when configuring from the Dashboard.

 

2. STP bridge priority, also under Switching > Configure > Switch Settings, set appropriately for your network.  Typically the MS425-16 would have a lower bridge priority value than the access layer switches.

STP_Bridge_Priority.png

 

If both of these settings are OK then you'd want to run other checks as mentioned by @alemabrahao and @DarrenOC .

 

I have also found the MS425-16 to perform (function) better on MS17.2.2 firmware than on previous MS17.x releases.  Running the latest MS firmware would be the third item to check quickly.  If you aren't running MS17.2.2, you can upgrade the firmware and evaluate whether things improve; you can always roll back to the previous firmware within 14 days (without Meraki Support).

thierryncho1986
Here to help

HELLO

The MS425 is configured for L3 routing, but its bridge priority parameter is 8192.

cmr
Kind of a big deal

@thierryncho1986 that will be fine; the main thing is that it has the lowest value and is the root.  If you go to the summary page for the MS425 and look down the left-hand side, do you see 'This switch' or a MAC address, like below?

cmr_0-1753386334552.png
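
To illustrate why 8192 is fine as long as it is the lowest value: spanning tree elects the bridge with the lowest priority as root, and uses the lowest MAC address to break ties. A small sketch of that rule; the switch names and MAC addresses are made up, and 32768 is the standard default priority:

```python
def elect_root(bridges):
    """Return the name of the root bridge.

    `bridges` maps a name to (priority, mac). The bridge with the
    lowest priority wins; the lowest MAC address breaks ties.
    """
    def bridge_id(item):
        _name, (priority, mac) = item
        return priority, int(mac.replace(":", ""), 16)

    return min(bridges.items(), key=bridge_id)[0]


# Hypothetical network: core at 8192, access switches at the 32768 default.
bridges = {
    "core-ms425": (8192, "00:18:0a:00:00:10"),
    "access-1": (32768, "00:18:0a:00:00:01"),
    "access-2": (32768, "00:18:0a:00:00:02"),
}
print(elect_root(bridges))  # core-ms425
```

If any other switch in the network were set to a value below 8192, it would become root instead of the core.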

 

If my answer solves your problem please click Accept as Solution so others can benefit from it.
thierryncho1986
Here to help

Yes, I see this option; it is the same as the one on my switch.

cmr
Kind of a big deal

Does it say 'This switch' or list a MAC address?

thierryncho1986
Here to help

Hello,

For DNS, should I keep my internal DNS servers or use Google's?

Mloraditch
Kind of a big deal

As a troubleshooting step, you can certainly test to see if one vs the other causes any difference in speed. That could point to something being problematic on your internal servers.
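
A rough way to run that comparison is to time the same lookup against each resolver. The sketch below uses only the Python standard library and sends a minimal A-record query over UDP. It is an illustration rather than a full resolver: it does not validate the response or retry, and the function names are hypothetical.

```python
import socket
import struct
import time


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def time_lookup(server: str, hostname: str, timeout_s: float = 2.0) -> float:
    """Return the seconds `server` took to answer; raises on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        start = time.perf_counter()
        sock.sendto(build_query(hostname), (server, 53))
        sock.recvfrom(512)  # any reply counts; the content is not checked
        return time.perf_counter() - start
```

For example, compare `time_lookup("8.8.8.8", "example.com")` (Google Public DNS) with the same call against your internal DNS server's address, repeating each a few times so that caching does not skew the result.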

thierryncho1986
Here to help

thierryncho1986_0-1754669183523.jpeg

 

thierryncho1986_1-1754669215591.jpeg

 

RWelch
Kind of a big deal

Is the MS425-16 configured for L3 routing or L2 switching?

 

If you select the MS425-16 switch using the new switch view version, you can select Device Health to see whether System Resources gives you insight into any issues or problems.

DeviceHealth_SystemResources.png

thierryncho1986
Here to help

thierryncho1986_0-1754669333107.jpeg

thierryncho1986_1-1754669352519.jpeg

 

RWelch
Kind of a big deal

I would definitely change your MTU size.  You can see the blue highlighted note in the document specifically pertaining to the MS425.

Switch Settings 

Note: Some Devices Datasheet mentions MTU values including 22 Bytes for Ethernet Headers and Frame Check Sequence (FCS). Therefore, make sure you check and enter the correct value according to your switch.

Cisco Meraki Examples:

  • The maximum MTU on MS425 is 9416 bytes (i.e.: 9394 + 22 = 9416). Therefore, enter MTU size 9394.
  • The maximum MTU on MS390/C9300/C9300X/C9300L switches is 9198 bytes (i.e.: 9176 + 22= 9198). Therefore, enter MTU size 9176.
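
The arithmetic in the two examples above can be written out as a short sketch; the function and constant names are only illustrative:

```python
ETHERNET_OVERHEAD = 22  # bytes for Ethernet headers and FCS, per the note above


def dashboard_mtu(max_frame_bytes: int) -> int:
    """Value to enter in the dashboard for a given maximum frame size."""
    return max_frame_bytes - ETHERNET_OVERHEAD


print(dashboard_mtu(9416))  # MS425 -> 9394
print(dashboard_mtu(9198))  # MS390/C9300/C9300X/C9300L -> 9176
```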
thierryncho1986
Here to help

Finally, I chose MTU 9416.

PhilipDAth
Kind of a big deal

If you take a notebook and plug directly into your firewall - does it get the expected performance?

thierryncho1986
Here to help

When I connect to the firewall, there is no slowness,
but when I connect to the core switch there is a lot of slowness.

RWelch
Kind of a big deal

What does your topology page show (any loops)?  Can you share a pic of the topology?

What does your routing table show?  Can you share a pic of your routing table?

Can you share how you have L3 configured?  What is the next hop?  Are you using a transit VLAN?  And part of the L3 setup would include verifying your management IP addressing is set correctly.  What VLAN is your management VLAN?
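
One quick sanity check on the next hop is that it sits inside the transit VLAN's subnet and is not the switch's own interface address. A minimal sketch using Python's `ipaddress` module; the addresses are hypothetical placeholders, not values from this network:

```python
import ipaddress


def next_hop_is_reachable(svi_cidr: str, next_hop: str) -> bool:
    """True if the next hop sits in the same subnet as the switch's
    transit interface and is not the interface's own address."""
    interface = ipaddress.ip_interface(svi_cidr)
    hop = ipaddress.ip_address(next_hop)
    return hop in interface.network and hop != interface.ip


# Hypothetical addressing: transit VLAN 10.255.0.0/30, firewall at .1.
print(next_hop_is_reachable("10.255.0.2/30", "10.255.0.1"))   # True
print(next_hop_is_reachable("10.255.0.2/30", "192.168.1.1"))  # False
```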

 

Are your SFP or SFP+ or CAT6 uplink and downlinks the same (all 1Gig, all 10Gig, etc)?

 

Your post indicates MS425 core switches.  Are they separate core switches or stacked?  If separate, does the second or third one have a HIGHER STP bridge priority value than the main root bridge?

thierryncho1986
Here to help

Attached is the topology.

The switch is in L3 mode and the core switches are stacked. The SFP cables are 10G; they interconnect the core switches, the distribution switches, and the firewall.

thierryncho1986_0-1754665689462.jpeg

 
