I am running into an issue that is affecting multiple MS120-48FP switches and I can't wrap my head around why it is happening.
Layout
I have an MS120-48FP (FW v12.28) switch connected to an MX65W (FW v15.42). The switch is trunked using copper SFP modules on ports 51 & 52, which are connected to ports 3 & 4 on the MX65W. These are not aggregate ports because the MX65W doesn't support aggregation, and the switch is automatically blocking the connection on port 52 due to STP (expected). I have 7 PoE phones with HP Thin Clients daisy-chained to them, with roughly 25 clients actually connected to the MS. These devices are set up with 802.1x, so we use a Multi-Domain access policy for the clients.
Issue
I installed the switch and everything worked great for about a month. I then had to make a configuration change, and when the change was applied, I saw the switch's latency jump to 1000+ ms, then it dropped about 20 pings, then it came back up and the latency went back down. I made another change, just enabling a disabled port, and it happened again. The interesting thing was that the switch itself stopped responding to ICMP pings, but devices connected to the switch were still pinging. This time, however, the change caused the switch to do a "soft reboot": the switch's fan sped up, the LED went orange, and the PoE phones lost power, but the switchports were still showing their status lights as active. When the switch came back up, it was still showing extremely high latency and up to 70%-80% packet loss. It stayed like this until the switch was physically rebooted. I even went down to a single uplink on an integrated copper port (46), made a change, and it happened again!
Support Response
I was working with support and they deemed it a power supply failure because, on the back end, they saw that the switch had soft rebooted due to a power failure. They issued an RMA and I thought it was fine. However, I received another switch, set it up, could make changes, and everything looked good, until a week later when I tried to make another change (disabling unused ports) and exactly the same thing happened on the new switch! Support is claiming it is another power supply issue, but I am starting to wonder if it is something else.
Has anyone else seen this before with switches?
Thanks!
I have not seen this myself, but it seems you can reproduce it. I would ask/demand the issue be escalated. Since you can easily reproduce the issue, that is a perfectly reasonable request.
@KRobert I would also upgrade the firmware. We had issues with 12.28 (in stacks) and found 12.33 much better. We now run 14.x on all of our switches and it seems fine.
@KRobert does it still cause issues for devices connected to the switch, or just some timeouts in pings to the management IP?
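One way to answer that is to log pings to the switch's management IP and to a device behind the switch side by side while a change is pushed. Below is a minimal sketch in Python, not a tested tool: the function names are made up for illustration, and `ping_once` assumes a Linux-style `ping` (`-c` for count, `-W` for timeout in seconds), so the flags would need adjusting on Windows or macOS.

```python
import re
import subprocess
from typing import Dict, List, Optional


def parse_rtt(output: str) -> Optional[float]:
    """Pull the round-trip time in ms out of one ping reply line, if present."""
    match = re.search(r"time[=<]([\d.]+)\s*ms", output)
    return float(match.group(1)) if match else None


def ping_once(host: str, timeout_s: int = 1) -> Optional[float]:
    """Send a single ICMP echo; return the RTT in ms, or None if it was lost.

    Assumption: a Linux-style ping binary is on the PATH.
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            capture_output=True,
            text=True,
            timeout=timeout_s + 2,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    return parse_rtt(result.stdout) if result.returncode == 0 else None


def summarize(samples: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize a run of samples, where None marks a lost ping."""
    replies = [s for s in samples if s is not None]
    sent = len(samples)
    loss_pct = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    return {
        "sent": sent,
        "loss_pct": loss_pct,
        "avg_ms": sum(replies) / len(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
    }
```

Calling `ping_once` once a second against each address, collecting the results in two lists, and passing each list to `summarize` after the change gives a loss percentage and latency figure per target. If only the management IP shows loss while the downstream device stays clean, that points at the switch's management plane rather than at forwarding.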
This 100% sounds like normal spanning tree to me.
Change to having only a single connection to the MX65 and the issue should go away.
Thanks @KRobert, great update. We do have some MS120s but have only used them in areas like an AV closet, as we always go with redundant power for anything critical (which means an MS210 at minimum). I guess that's why we didn't see these issues.
My bet is that the CPU/RAM in the MS12x switches is not as highly specified as the MS2xx+ switches, but I'd have thought that the developers can do something with process scheduling to avoid the spikes that cause the issues.