LACP faulted between Win2019 server and MS225 Stack

RIGeek
Comes here often

I've been dealing with this issue for a while and have opened tickets with both Microsoft and Meraki, but there's no resolution yet.

 

I have a Windows Server 2019 Hyper-V host with 4 NICs, which I've combined into two teams of two. On the Meraki side, one link in each aggregate has been disabled, and on the Windows side the teamed adapters show as faulted.

 

The packet captures on the Meraki side show the switch doing what it should, but the Windows host appears not to be sending LACP packets on one member of each team.

 

Anyone else run into this before? 

 

Team settings on the Windows host are:

 

Name : MGMT-NIC-Team01
Members : {MGMT-NIC02, MGMT-NIC01}
TeamNics : MGMT-NIC-Team01
TeamingMode : Lacp
LoadBalancingAlgorithm : Dynamic
LacpTimer : Fast
Status : Degraded

 

Name : VM-NIC-Team01
Members : {VM-NIC01, VM-NIC02}
TeamNics : VM-NIC-Team01
TeamingMode : Lacp
LoadBalancingAlgorithm : Dynamic
LacpTimer : Slow
Status : Degraded

PhilipDAth
Kind of a big deal

Are the physical interfaces (when up) all reporting the same speed and duplex?

 

Is this going into a single Meraki switch or a stack?

RIGeek
Comes here often

All NICs are linked at 1 Gb/s full duplex, and this is a stack. The teams are as follows:

 

Team1 = NIC1+2

Team2 = NIC3+4

 

NIC1+2 connect to SwitchStackMember1 and SwitchStackMember2 port 1

NIC3+4 connect to SwitchStackMember1 and SwitchStackMember2 port 2

 

The stack members are cabled stack port 1 to 2 and then 2 back to 1. The stack shows healthy as well.

 

I've confirmed that the cabling isn't mixed up in any way, both by physically tracing the wires and by disabling each adapter in turn on the Windows side and each port on the Meraki side.

 

As you can also see in my original post, LACP negotiation is set to slow on one team and fast on the other. I couldn't find which setting Meraki prefers, so I set each team differently, hoping that one would connect.

GIdenJoe
Kind of a big deal

Your LACP timers are wrong.
You should always use slow timers for LACP on Cisco products.

RIGeek
Comes here often

One of the teams is set to slow, the other is set to fast. Neither will link.

GIdenJoe
Kind of a big deal

OK, then you'll need to verify whether the NIC team is sending LACP packets back to the switch.
Please run a packet capture on the Meraki switch on any port of the bundle.
The capture filter you'll need is:
ether host 01:80:c2:00:00:02

Normally you should see LACP packets from both source MAC addresses (switch and server).
If one side is not transmitting any, then the problem lies there.
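
As a rough illustration of that check, here is a minimal Python sketch that takes raw Ethernet frames (for example, exported from the capture) and counts LACPDUs per source MAC. The function names and the example MAC addresses are made up for illustration, and the sketch assumes untagged frames; it is not part of any Meraki or Windows tooling.

```python
from collections import Counter

# Constants for LACP on the wire.
SLOW_PROTOCOLS_DST = bytes.fromhex("0180c2000002")  # same address as the capture filter
SLOW_PROTOCOLS_ETHERTYPE = 0x8809                   # "Slow Protocols" EtherType
LACP_SUBTYPE = 0x01                                 # subtype 1 = LACP, 2 = Marker


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as aa:bb:cc:dd:ee:ff."""
    return ":".join(f"{b:02x}" for b in raw)


def lacp_source(frame: bytes):
    """Return the source MAC of an untagged LACPDU, or None for anything else."""
    if len(frame) < 15:
        return None
    dst, src = frame[0:6], frame[6:12]
    ethertype = int.from_bytes(frame[12:14], "big")
    if dst != SLOW_PROTOCOLS_DST or ethertype != SLOW_PROTOCOLS_ETHERTYPE:
        return None
    if frame[14] != LACP_SUBTYPE:
        return None
    return format_mac(src)


def count_lacp_senders(frames):
    """Count LACPDUs per source MAC across an iterable of raw frames."""
    return Counter(src for src in map(lacp_source, frames) if src is not None)


if __name__ == "__main__":
    # Hypothetical frames: only the "switch" MAC is sending LACPDUs on this port.
    switch_frame = bytes.fromhex("0180c2000002" "020000000001" "8809" "01") + bytes(109)
    other_frame = bytes.fromhex("ffffffffffff" "020000000002" "0806") + bytes(46)
    senders = count_lacp_senders([switch_frame, switch_frame, other_frame])
    for mac, n in senders.items():
        print(f"{mac} sent {n} LACPDU(s)")
```

If only one source MAC shows up for a given switch port, the other end of that link is the one staying silent.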

cmr
Kind of a big deal

@RIGeek are you using Microsoft's teaming or the NIC vendor's software?

 

We are totally VMware now but used to use Broadcom NICs and their BASP software always worked well for teaming.

 

For those who don't know, NIC teaming in Windows was first done by Madge Networks back in the mid 1990s and the code was bought by Microsoft.  It's about all that is left of the $1Bn company that was Madge Networks!

RIGeek
Comes here often

This is an HPE server using the built-in Server 2019 teaming. I looked for a third-party teaming app for this server but found none.

RIGeek
Comes here often

So for each team, I can see bidirectional LACP packets on one port, but on the other port the Windows host is not sending any LACP packets. I've opened a case with Microsoft and they still claim this is a Meraki issue. I don't think it is, but has anyone else here had a similar issue?

GIdenJoe
Kind of a big deal

Every member port of a bundle must send LACP information with the same actor System ID.
If a member fails to do so, it will not be added to the bundle.

LACP-Meraki.png

In the picture above, I took an LACP packet capture between a Meraki core switch stack (where two different stack members are downlinked in the bundle) and an access switch, a single switch whose ports 49 and 50 are uplinked and bundled in a port-channel.
On each side, both ports send LACPDUs with the same actor System ID but a different actor port ID for their own system, along with the partner information learned from the other side.

If you capture on both switch ports leading to the server and cannot see any LACPDUs from the server on the second port, you have 100% proof that the Windows host is misbehaving.
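
To make the System ID comparison concrete, here is a small Python sketch that pulls the actor fields out of a raw LACPDU and checks whether two ports advertise the same system and key. It assumes untagged frames and the standard LACPDU layout (Actor TLV directly after the subtype and version bytes). The names `parse_actor`, `could_share_aggregate`, and `build_example_lacpdu`, and all MAC addresses, are hypothetical; the comparison is a simplification of the real aggregation selection logic.

```python
import struct
from typing import NamedTuple


class ActorInfo(NamedTuple):
    system_priority: int
    system_id: str       # actor system MAC, shared by all ports of one system
    key: int
    port_priority: int
    port: int
    short_timeout: bool  # True = fast timer requested, False = slow


def parse_actor(frame: bytes) -> ActorInfo:
    """Extract the Actor TLV fields from an untagged LACPDU frame."""
    if len(frame) < 36 or frame[12:14] != b"\x88\x09" or frame[14] != 0x01:
        raise ValueError("not an untagged LACPDU")
    if frame[16] != 0x01 or frame[17] != 20:
        raise ValueError("unexpected Actor TLV header")
    sys_prio, system, key, port_prio, port, state = struct.unpack(
        "!H6sHHHB", frame[18:33]
    )
    system_id = ":".join(f"{b:02x}" for b in system)
    # Bit 1 of the actor state byte is LACP_Timeout (1 = short/fast).
    return ActorInfo(sys_prio, system_id, key, port_prio, port, bool(state & 0x02))


def could_share_aggregate(actors) -> bool:
    """Simplified check: all ports advertise the same system priority, ID, and key."""
    return len({(a.system_priority, a.system_id, a.key) for a in actors}) == 1


def build_example_lacpdu(src_mac, system_mac, key, port, state=0x3D,
                         system_priority=32768, port_priority=32768) -> bytes:
    """Hypothetical helper that builds a frame for the demo (partner fields zeroed)."""
    header = bytes.fromhex("0180c2000002") + src_mac + b"\x88\x09"
    actor = struct.pack("!BBBBH6sHHHB3x", 0x01, 0x01, 0x01, 20,
                        system_priority, system_mac, key, port_priority, port, state)
    return header + actor + bytes(124 - len(header) - len(actor))


if __name__ == "__main__":
    system = bytes.fromhex("020000000010")  # made-up system MAC
    port1 = parse_actor(build_example_lacpdu(bytes.fromhex("020000000011"), system, 1, 1))
    port2 = parse_actor(build_example_lacpdu(bytes.fromhex("020000000012"), system, 1, 2))
    print(port1)
    print(port2)
    print("same system and key:", could_share_aggregate([port1, port2]))
```

Running this against one LACPDU from each member port shows at a glance whether both ports present the same system to the switch, and which timer each one is asking for.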

I hope this helps.

RIGeek
Comes here often

I completely understand and agree that this is the Windows host. I'm just hoping that someone else in this group has experienced a similar issue. I can see in the packet captures that the Meraki devices are doing what they need to.

PhilipDAth
Kind of a big deal

It sounds like something on the Windows side has a bug.

 

Are you using the latest NIC driver?

Are your NICs using the latest firmware?
