Self-healing CRC errors on MS350

Solved
Dirk-WB
New here

Hi all,

we run a Meraki network with 52 MS350 access switches and 2 MS425 cores that provide the LACP uplink to our data center.

From time to time (every 2-3 months) we see a lot of CRC errors on all MS350 uplink ports connected to Core1. The switches connected to Core2 do not show errors, so the network keeps running. Core1 itself does not show any issue.

Most of the time we reboot Core1 and everything is fine again. Yesterday it happened during the night, and by late morning everything was OK again without any action on our side.

When the issue occurs we see a weird topology view, so RSTP seems to be recalculating multiple times. All switches are up to date, and we have opened a case with Meraki every time it happened, without any solution so far.

Has anyone seen this kind of behavior? We did not make any changes to the network when it happened, and we cannot reproduce it. We assume it happens when Core1 takes the master role of the stack, but we cannot force that for a test.

Cheers

Dirk

JacekJ
Building a reputation

What you will hear from support is that as soon as you reboot the switch, they lose all the logs stored on it, which would be of more value than what you can see on the dashboard or via syslog.
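This will not preserve the internal switch logs (only support can pull those), but archiving whatever counters you can see before a reboot at least gives the case a timeline. Below is a minimal Python sketch of that idea. How you read the counters (dashboard export, API, SNMP) is left open, and the port labels and the helper names `crc_snapshot`, `crc_growth` and `save_snapshot` are hypothetical, not part of any Meraki tooling.

```python
import json
from datetime import datetime, timezone


def crc_snapshot(port_counters, taken_at=None):
    """Wrap per-port CRC counters with a UTC timestamp for archiving.

    port_counters: dict mapping a port label (hypothetical, e.g. "sw01/49")
    to its cumulative CRC error count, collected by whatever means you have.
    """
    taken_at = taken_at or datetime.now(timezone.utc)
    return {"taken_at": taken_at.isoformat(), "crc_errors": dict(port_counters)}


def crc_growth(before, after):
    """Return the ports whose CRC counter increased between two snapshots."""
    old = before["crc_errors"]
    return {
        port: count - old.get(port, 0)
        for port, count in after["crc_errors"].items()
        if count > old.get(port, 0)
    }


def save_snapshot(snapshot, path):
    """Write a snapshot to disk as JSON so it survives a reboot of the switch."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle, indent=2)
```

Taking one snapshot while the errors are occurring and another after they stop, then attaching both files to the support case, shows exactly which ports were affected and when.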

If Meraki can't find any configuration issue causing this (LACP not configured properly on both sides and so on), then to be honest, I would demand a replacement of Core1 since the issue seems to be around that device.

I think I saw something like this on one MS425: the CRC errors were seen by the switch connected to it, but not on the MS425 side. We ended up replacing it, if I recall correctly.

Dirk-WB
New here

Hi JacekJ,

thanks for your answer. I also see a replacement as a possible solution.

Are you a Meraki official? Can I post your ideas to the case?

Actually, we haven't rebooted Core1 this time, as the CRC errors went away for an unknown reason, but support hasn't found anything on Core1 yet. We haven't captured logs locally yet.

I don't think it's a config problem, as it was/is running fine for weeks.

JacekJ
Building a reputation

Sorry, but no, I'm only a customer.

I don't think my comments are of any value for the support team, since this is just my guesstimating based on experience 😉

You know, this could be a faulty SFP module, or fiber connection, but if this sudden CRC problem only happens on specific ports and more than one at a time, from time to time, without any specific reason, then the problem most likely is in the switch - where else could it be?

If I'm not missing anything in this story, then I would use the above argument and ask for a replacement, especially when this is a recurring problem and they can't find the root cause.
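To make the argument above concrete for the case, here is a rough Python sketch of the same reasoning: if nearly all CRC errors sit on links to one core, and several links are affected at once, that core is the common factor. The function name `suspect_core`, the thresholds and the input format are my own assumptions for illustration. It is only a heuristic and does not rule out a shared cause outside the switch.

```python
from collections import defaultdict


def suspect_core(links, min_share=0.9, min_links=2):
    """Point at a core if the CRC errors concentrate on its links.

    links: iterable of (access_switch, core, crc_errors) tuples, one per uplink.
    Returns the core name if it carries at least `min_share` of all CRC errors
    and at least `min_links` of its links are affected, otherwise None.
    """
    errors_by_core = defaultdict(int)
    bad_links_by_core = defaultdict(int)
    for _switch, core, crc in links:
        errors_by_core[core] += crc
        if crc > 0:
            bad_links_by_core[core] += 1

    total = sum(errors_by_core.values())
    if total == 0:
        return None  # no errors anywhere, nothing to localize

    core, errors = max(errors_by_core.items(), key=lambda item: item[1])
    if errors / total >= min_share and bad_links_by_core[core] >= min_links:
        return core
    return None
```

A single bad link still looks like an SFP or fiber problem, which is why the sketch returns `None` unless at least two links to the same core are affected.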

Fingers crossed!

Dirk-WB
New here

Thanks again, those are our thoughts as well. We will go that way.
