SFP disables on MS225 when it restarts

NonProfit
Conversationalist


I have several MS225 switches connected to a Cisco 9300 over multimode fiber with SC connections. Each switch uses a GLC-SX-MMD 1Gb transceiver, and the 9300 uses the same 1Gb GLC-SX-MMD in its TenGigabit interface. Whenever we perform a firmware update on the Merakis, or when they shut down because of a long power outage, I have to reset the TenGig ports on the 9300 before the two switches (9300 and MS225) will communicate again.

The logs on the 9300 show:

Dec 26 18:54:24: %UDLD-4-UDLD_PORT_DISABLED: UDLD disabled interface Te1/1/4, aggressive mode failure detected
Dec 26 18:54:24: %PM-4-ERR_DISABLE: udld error detected on Te1/1/4, putting Te1/1/4 in err-disable state.

 

Any ideas? Do I have a bad SFP or cable? We have tried swapping in newer fiber cables and SFPs, but the problem still occurs after every upgrade or power outage. Otherwise, once the links come up after multiple port resets, they stay up until the next restart.

Thoughts?

5 Replies
PhilipDAth
Kind of a big deal

I don't know the answer.  I have two thoughts.

 

Try enabling SFP enforcement on the MS225 under switch port settings.


If that doesn't do it then disable UDLD on the 9300 port.
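
For reference, disabling UDLD on a single Catalyst port might look roughly like the sketch below. The interface name is taken from the log lines in the original post. `no udld port` is the usual interface-level form, but how it interacts with a global UDLD setting on fiber ports can differ by platform and release, so treat that part as an assumption and check the 9300 command reference.

```
! Sketch only: Te1/1/4 is the interface named in the log above
configure terminal
 interface TenGigabitEthernet1/1/4
  no udld port
  ! bounce the port to clear an existing err-disable state
  shutdown
  no shutdown
 end
! confirm the resulting UDLD state on the port
show udld TenGigabitEthernet1/1/4
```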

redsector
Head in the Cloud

I have this issue sometimes when the MS225s are in a stack and connected with an aggregate link to classic Cisco switches. I configured errdisable auto-recovery on the classic Cisco switches, and after two minutes (120 sec) the port recovers and everything works well.

 

Port-config:

 

interface TenGigabitEthernet2/0/18
 description to_ switch_18
 switchport trunk allowed vlan 1,8,12,16,20,24,26,28,30,50,52,86,90,300,301
 switchport mode trunk
 udld port aggressive
 channel-group 11 mode active
 spanning-tree guard loop

 

Global config for port recovery:

 

port-channel load-balance src-dst-ip
errdisable recovery cause udld
errdisable recovery cause bpduguard
errdisable recovery cause security-violation
errdisable recovery cause channel-misconfig
errdisable recovery cause pagp-flap
errdisable recovery cause dtp-flap
errdisable recovery cause link-flap
errdisable recovery cause sfp-config-mismatch
errdisable recovery cause gbic-invalid
errdisable recovery cause l2ptguard
errdisable recovery cause psecure-violation
errdisable recovery cause port-mode-failure
errdisable recovery cause dhcp-rate-limit
errdisable recovery cause pppoe-ia-rate-limit
errdisable recovery cause mac-limit
errdisable recovery cause vmps
errdisable recovery cause storm-control
errdisable recovery cause inline-power
errdisable recovery cause arp-inspection
errdisable recovery cause link-monitor-failure
errdisable recovery cause oam-remote-failure
errdisable recovery cause loopback
errdisable recovery cause psp
errdisable recovery interval 120
license boot level ipbasek9
diagnostic bootup level minimal
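
Of the global lines above, only two matter for automatic recovery from a UDLD err-disable; the other `errdisable recovery cause` lines cover different causes, and the load-balance, license, and diagnostic lines are unrelated. A trimmed-down sketch using the same 120-second interval, with the show commands commonly used to confirm it:

```
! Global configuration: auto-recover ports that UDLD has err-disabled
errdisable recovery cause udld
errdisable recovery interval 120

! Privileged EXEC: check which causes have recovery enabled,
! and which ports are currently err-disabled
show errdisable recovery
show interfaces status err-disabled
```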

GIdenJoe
Kind of a big deal

Catalyst switches enforce UDLD differently from Meraki MS switches.

It is possible that when the port comes up after your reboot, the Meraki switch waits too long before it starts transmitting its own UDLD hellos, and the Catalyst has already gone through its timeouts. Alternatively, the MS may be transmitting UDLD hellos but not using the correct identifier. A SPAN capture on the Catalyst side (once when it operates normally and once after your MS has rebooted) could show more.

According to the Cisco documentation, the default UDLD message interval is 15 seconds, and the link is deemed unidirectional and shut down after 3x this interval, so 45 seconds.

So if you can't live with the 30-second wait before errdisable recovery, I could suggest the following workaround: put the Catalyst switch in normal mode instead of aggressive, and put the MS in enforce mode.

As far as I can read from the Meraki documentation, an MS switch keeps checking for UDLD messages after the expiry. The Catalyst side would then not block the link; the MS would block all other traffic but keep listening for UDLD, so it can re-enable the port as soon as it receives a UDLD message.
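
On the Catalyst side, the change from aggressive to normal mode could be sketched as below. The interface name is borrowed from the log in the original post and is only an example. The MS side is set per port in the Meraki dashboard, so there is no matching CLI for it here.

```
! Sketch only: move the uplink from aggressive to normal UDLD mode
configure terminal
 interface TenGigabitEthernet1/1/4
  no udld port aggressive
  udld port
 end
! check the operational UDLD mode and neighbor state
show udld TenGigabitEthernet1/1/4
```

If the port is already err-disabled when the change is made, it still needs a shutdown/no shutdown or an errdisable recovery cycle before it comes back up.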

NonProfit
Conversationalist

Hello redsector,

 

This has been happening with Meraki MS225s both in a stack and as single switches. It happens whenever we update the firmware or reboot the switch. I increased the UDLD errdisable interval and that didn't help. I then disabled UDLD on the 9300, and now the link stays up through firmware updates and reboots. I will open a ticket with TAC for the 9300 to see what their thoughts are.

BjornS
New here

Hello,

 

Any news from TAC?

We have the same issue going from an MS225 to a C9500 stack over an aggregated link on the TenGig fibers. Both ports of the aggregate go into err-disable.

We also see it on an MX100 uplink port to a C2950 over a UTP link, where the port goes into err-disable.
