MS-350 stack member drops pings

Comes here often

MS-350 stack member drops pings

I replaced our Cisco core/access switches with Meraki MS-350s (core) and MS-225s (access). The MS-225s are a 5 stack and a 6 stack of 48-port switches. The MS-350s are a 6 stack using 10GbE fiber uplinks to the access stacks and to NetApp/vCenter. The copper and fiber port channels to the NetApp had major issues and would frequently go from designated > disabled and back from disabled > designated, which caused major network outages. I have since removed all port channels to the NetApp/vCenter, which has resolved that particular issue.

I have noticed that a switch in the core stack drops 66% of pings - just one switch, all the others are fine. The core stack is on a /28 subnet with all switches using the same DNS and default route. I replaced the switch that was dropping pings, and now a different switch in the stack drops pings but the original one responds fine.

I also see in the logs that the core stack ports are doing the designated > disabled, disabled > designated dance all day and night - even switch ports that are disabled. I also see a disabled port being flagged as backup > designated, designated > backup. All these issues started when I replaced the Cisco core; it was fine before that. Has anyone experienced similar issues?

38 REPLIES
Meraki Employee

Re: MS-350 stack member drops pings

What version of code are you running?

Comes here often

Re: MS-350 stack member drops pings

9.32

Kind of a big deal

Re: MS-350 stack member drops pings

There are several bugs relating to stacks of more than 4 or 5 switches.  Is there any chance you could try dropping back to smaller stacks of 4 switches?

Comes here often

Re: MS-350 stack member drops pings

I could probably work that out - looking at the installation guide it claims you can stack up to 8 switches - is that not really true at this point? My access stacks have 5 and 6 physical switches each....

Kind of a big deal

Re: MS-350 stack member drops pings

Correct, that is not really true at this point.

 

So far I have been limiting stacks to being 4 high and had zero issues.  5 might be ok ... not sure.

 

 

Either that or you could try the "beta" 10.x software.  I have not tried this.

Comes here often

Re: MS-350 stack member drops pings

I've had Meraki support look at this several times... they never mentioned the 4 stack issue. Good times. Thanks for the information, it certainly would explain the craziness we are experiencing.

Here to help

Re: MS-350 stack member drops pings

This is critical functionality. They should tell us these kinds of things before we put the switches into production, or show an alert when we try to put 5 or 6 switches in a stack. This is not nice from Meraki.

Kind of a big deal

Re: MS-350 stack member drops pings

It was related to stacking bugs. It may be fixed in the really new code but I would prefer not to rely on that.

But here is a question for you: what is to be gained by having a massive stack of, say, 8 switches versus 2 stacks of 4 switches?
Here to help

Re: MS-350 stack member drops pings

I agree on splitting, 2 x 4 is better than 8, but if it supports 8 on paper they should support it in deployment; otherwise, without enough information, this could end up affecting service.

Comes here often

Re: MS-350 stack member drops pings

We have 6 because we have fibered-up NetApp and vCenter services that needed 20Gb port channels, so when I remove 2 switches I lose 8 10GbE ports from the stack. They tout these as enterprise switches; if it truly is a buggy stack connectivity issue they should say so. In the general information there is small print about supporting a 160Gb stack - so that would be 4 switches.

Comes here often

Re: MS-350 stack member drops pings

I have always stacked up 6 or more switches for large IDFs or smaller distribution centers - not with Meraki though, always with Cisco. In my current case I needed a higher density of both fiber and copper port channels for high-bandwidth services. We had 6 Cisco switches stacked up using port channels with no issues for several years. Obviously, had Meraki simply stated a 4 switch maximum I would have gone a different route, but they didn't, so I did not anticipate any issue with a small stack of 6 switches. I still can't get Meraki to admit there's an issue with using more than 6 switches, so now we're moving everything back to the old Cisco gear, as we're all getting tired of network crashes for no apparent reason and with no evidence in the event log to support what happened.

Comes here often

Re: MS-350 stack member drops pings

STP considerations also - the port channel 10GbE interfaces would be on different physical stacks, introducing unnecessary complications into what should be a straightforward design. There are several reasons I'm not really gonna go into. The main thing is I have a stack well within the specs that is behaving like a 1st round prototype, and now I get to spend yet another Saturday moving our services back to the equipment I've spent a considerable amount of time and effort supposedly upgrading.

Comes here often

Re: MS-350 stack member drops pings

I've had Meraki support look at the setup several times - not a word on the 4 switch issue. This is a small, uncomplicated setup, and it should have been so easy to move these services to a new stack. I've never seen a stack behave this poorly. I don't want to move back to the Cisco gear, but Meraki won't say there's an issue with using 6 switches in a stack and I'm unwilling to make a production network into a lab to test what they should have already resolved. I also have an access stack of 5 and a stack of 6 switches for the building clients on a couple of floors - why would I split a basic stack into 2 x 3 when it should work just fine? Keep in mind we replaced the exact same number of stacks that were performing great for many years - very annoying.

Comes here often

Re: MS-350 stack member drops pings

Also, if I were going to use dual stacks I would use switches that support RPVST and have true load balancing and redundancy - we didn't go that route so that we could simplify..... apparently that was a bad decision!

 

Thanks for all the replies - Meraki support still hasn't confirmed the 4 switch limit issue. Regardless, I have to stabilize the network, so we'll be going back to the Cisco gear for now.

Kind of a big deal

Re: MS-350 stack member drops pings

Did you work with a Cisco Meraki partner on the design for this network before it got deployed, or did you do the design yourself based on what you already had?

I think your issues relate to the design chosen (model of switches chosen, quantities and stack design).

 

If you were my customer, I would have recommended using the classic collapsed core/distribution layer, and a separate access layer. This design is documented here (although Meraki calls it aggregation rather than distribution, but same thing).

https://meraki.cisco.com/lib/pdf/meraki_campus_deployment_guide.pdf

 

For the collapsed core/distribution I would have used a stacked pair of MS425 switches (which only have 10Gbe ports).

https://meraki.cisco.com/products/switches/ms425-16

From what you describe, I would then plug all the storage and servers into this.  If you had a lot of servers/storage I would deploy another separate pair of MS425's for a dedicated server access layer, but it doesn't sound like you have enough to make this worthwhile.

 

Then for the access layer I would have used MS225's, and formed 10Gbe Etherchannels back to the core switches.

https://meraki.cisco.com/products/switches/ms225-48

I would have limited the stack sizes to 4 but preferred a smaller stack size of 3 where possible, with each stack having its own 10GbE EtherChannel uplink.  I would have used a limit of 4 because my prior experience tells me this is rock solid reliable, and because I would like to limit the oversubscription rate of the upstream links:

  • a stack of 3 x 48 ports with dual 10GbE uplinks is 144Gb into 20Gb for roughly a 7:1 oversubscription
  • a stack of 4 x 48 ports with dual 10GbE uplinks is 192Gb into 20Gb for roughly a 10:1 oversubscription
  • a stack of 6 x 48 ports with dual 10GbE uplinks is 288Gb into 20Gb for roughly a 14:1 oversubscription

You can see two stacks of 3 switches will deliver twice the performance out of the access layer as a single stack of 6 switches.  You can see how this design discourages you from stacking high, because you limit performance - and the cost is the same - so why would you?
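For anyone who wants to rerun the oversubscription numbers above for their own stack sizes, here is a minimal sketch. It assumes 1Gb access ports (which is what the 144Gb/192Gb/288Gb figures imply) and dual 10Gb uplinks per stack; the function name and its parameters are made up for illustration.

```python
def oversubscription(switches, ports_per_switch=48, port_gbps=1,
                     uplinks=2, uplink_gbps=10):
    """Return (access Gb, uplink Gb, ratio) for one stack.

    Assumes every access port runs at port_gbps and the stack has
    `uplinks` uplinks of uplink_gbps each (hypothetical defaults).
    """
    access = switches * ports_per_switch * port_gbps
    uplink = uplinks * uplink_gbps
    return access, uplink, access / uplink


# Compare the three stack sizes discussed in the post.
for n in (3, 4, 6):
    access, uplink, ratio = oversubscription(n)
    print(f"{n} x 48 ports: {access}Gb into {uplink}Gb = {ratio:.1f}:1")
```

The unrounded ratios come out at 7.2, 9.6 and 14.4, which is where the approximate 7:1, 10:1 and 14:1 figures come from.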

 

I would also have given you a guarantee that the deployment would be rock solid reliable with no performance issues.

Kind of a big deal

Re: MS-350 stack member drops pings

ps. You also don't need RPVST in this design - because every link forwards traffic (so you don't need VLAN load balancing), and the design is hierarchical (so the network looks the same to every VLAN).

Meraki Employee

Re: MS-350 stack member drops pings

MS 9.32 + MS350 do not have any defects related to stack size. Based on the symptoms, I suggest an upgrade to 10.9 which is our latest beta as it includes multiple stability enhancements. If there is an open support case, can you send me a direct message with the support case number?

Comes here often

Re: MS-350 stack member drops pings

Meraki doesn't support RPVST - that was my point about why I would not have chosen Meraki for a dual core.

Comes here often

Re: MS-350 stack member drops pings

We're not having bandwidth issues - I do have the 225 stacks 20Gb port-channelled to the core. We don't have all-fiber NetApp connectivity, so 425s are not an option.

Comes here often

Re: MS-350 stack member drops pings

Sure, you can second-guess the design all day - believe me, we're all sorry now that we went the Meraki route instead of Cisco. I'm not asking much from these switches: a small number of 20Gb port channels for redundancy, not bandwidth. We went over our goal at length with Meraki - very simple network design - if I wanted to complicate things I would have gone a different route. I see switch ports reporting disabled > designated on ports that have been physically disabled. I guarantee I could put in Cisco 37xx or 38xx with the same design and it would work fine. Maybe I've overlooked something; if I did, it hasn't been found yet. My next step will be to physically verify each port and where it's connected - I'm not confident this will yield any results, but apparently it will have to be done.  I admit my background is large-scale datacenter design, but it's not like I just threw this together because it's a small simple network.

Kind of a big deal

Re: MS-350 stack member drops pings

Meraki supports standards-based RSTP.  RPVST is a Cisco Enterprise proprietary protocol, so you won't see anything supporting that outside of the Cisco Enterprise line up.

Comes here often

Re: MS-350 stack member drops pings

Exactly - that was my point.  Sorry - when I say go a different route I mean Cisco.

Comes here often

Re: MS-350 stack member drops pings

I did drop back to 4 switches in the 350-24 stack this morning - I'll comment when I see what happens next. Thanks

Kind of a big deal

Re: MS-350 stack member drops pings

That's a good plan.  Then you can quickly eliminate stack size as being the issue or not.

 

After that I would be tempted to try @Kapil's suggestion of using firmware 10.9.

Kind of a big deal

Re: MS-350 stack member drops pings

ps, with regard to your comment about the "160Gb" stacking size - this is very much a marketing number used by Cisco Meraki and Cisco Enterprise.

 

In this case, each switch has a pair of 40 Gb/s full duplex ports.  So each stack port has an aggregate of 80Gb/s, so the pair of ports has an aggregate of 160Gb/s.
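Spelled out as a quick sketch, the marketing arithmetic looks like this. The function name is made up for illustration, and the defaults simply restate the figures above (two stack ports, 40 Gb/s each, counted in both directions because they are full duplex).

```python
def stack_bandwidth_gbps(stack_ports=2, port_gbps=40):
    """Return (aggregate per stack port, aggregate per switch) in Gb/s.

    "Aggregate" counts both directions of a full-duplex port,
    which is how the headline stacking number is reached.
    """
    per_port = port_gbps * 2          # transmit + receive
    per_switch = per_port * stack_ports
    return per_port, per_switch


per_port, per_switch = stack_bandwidth_gbps()
print(f"{per_port}Gb/s per stack port, {per_switch}Gb/s per switch")
```

So the headline figure is a per-switch number and does not by itself say anything about how many switches a stack can hold.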

Kind of a big deal

Re: MS-350 stack member drops pings

I was thinking about your comment "would frequently go into designated > disabled and back to disabled > designated".

 

This almost sounds like the LACP channel had not formed properly, as the individual LACP members should not do this.  It also sounds like something is not right with the port connectivity.

 

Also, what kind of 10GbE connectivity are you using to the NetApp (and to vCenter)?  TwinAx?  10GBASE-SR?  Perhaps there is a more basic connectivity issue we have overlooked when considering only more complex potential faults.

Kind of a big deal

Re: MS-350 stack member drops pings

Sorry to pepper you with so many questions.  Back to the original title "MS-350 STACK MEMBER DROPS PINGS" - now that you have changed the stack to only having four members, does this issue still occur (high ping loss from a stack member)?

Comes here often

Re: MS-350 stack member drops pings


PhilipDAth wrote:

Sorry to pepper you with so many questions.  Back to the original title "MS-350 STACK MEMBER DROPS PINGS" - now that you have changed the stack to only having four members, does this issue still occur (high ping loss from a stack member)?


No worries,

 

The access switches are 10GbE fiber to the 350-24s. I still have the dropped ping issue on a single switch in the stack, and I'm still seeing switch ports that are physically disabled reporting events of designated > disabled and disabled > designated. I am not seeing these events on the 4 stack, however, only on the access switches, which are 5 and 6 stacks. When the stack was 6 deep even the Meraki aggregates were having issues - I split all port channels between access and core, which did appear to resolve the issue. I agree it appears LACP is having issues. I can understand issues between Cisco and Meraki LACP if MST isn't enabled on the Cisco side, but there just isn't anything I can do about a Meraki LACP issue other than split apart and re-create the aggregate.  Meraki has informed me there is not a bug using up to 8 switches in a stack. I really don't want to split the access stacks up, but if the core stack doesn't have any issues today and this week I guess I'll have to consider doing that....

Kind of a big deal

Re: MS-350 stack member drops pings

LACP and MST are not related to each other.

 

Cisco Meraki and Cisco Enterprise switches should form LACP channels without issue.

 

Once LACP is configured (on either side) spanning tree then runs on the aggregation interface, and not the individual link members.

Comes here often

Re: MS-350 stack member drops pings


PhilipDAth wrote:

LACP and MST are not related to each other.

 

Cisco Meraki and Cisco Enterprise switches should form LACP channels without issue.

 

Once LACP is configured (on either side) spanning tree then runs on the aggregation interface, and not the individual link members.


MST (Multiple Spanning Tree) is pretty much what Meraki supports, so they're related when you have a mixed Cisco and Meraki network. Otherwise I would have to have VLAN 1 across my entire network and not allow the Meraki core to be the STP root bridge for the network, which I want it to be. PVST+ will also force Meraki to standard STP mode, which I also don't want.  I understand LACP and MST are not related logically, but I have seen issues with a PVST+ stack port-channelled to a Meraki stack.

Comes here often

Re: MS-350 stack member drops pings

Is there an admin that can remove this thread - it's not relevant and apparently nobody else is having the same issues I'm seeing. I'll repost a different question with a more granular topic.

Meraki Employee

Re: MS-350 stack member drops pings

@BadOscar I think this is a relevant topic. Has there been a support case opened where support has reproduced this with you? 

Comes here often

Re: MS-350 stack member drops pings


DCooper wrote:

@BadOscar I think this is a relevant topic. Has there been a support case opened where support has reproduced this with you? 


There is - last I heard it was getting lab'd up several weeks ago.  We've been moving our critical services back to a Cisco stack - this weekend we are moving the rest of our NetApp/VMware stuff so that the network doesn't keep crashing.  Once that's done we'll see if things stabilize, as before moving to the Meraki core we weren't having these issues. I had the access stacks connected to the old Cisco core for months without any issues, so I don't think it's the stack size, and removing 2 switches from the Meraki core did nothing but move the ping drop issue to a new switch. My question did not get answered, so it seems that nobody else is experiencing these problems, and it's pointless to waste someone's time trying to figure out an issue by going through this thread.

Meraki Employee

Re: MS-350 stack member drops pings

@BadOscar I am not convinced you're the only customer with these issues. Let me see how I can help on the backend; can you PM me the case number? Also, most likely what you're seeing when losing pings is ARP entries disappearing. When this happens, take a look at your ARP tables and see if the entry for the client you're trying to ping has disappeared. If it has, try to ping the device from another L3 segment, and if that is successful see if the ARP entry re-appears.

Comes here often

Re: MS-350 stack member drops pings


DCooper wrote:

@BadOscar I am not convinced you're the only customer with these issues. Let me see how I can help on the backend; can you PM me the case number? Also, most likely what you're seeing when losing pings is ARP entries disappearing. When this happens, take a look at your ARP tables and see if the entry for the client you're trying to ping has disappeared. If it has, try to ping the device from another L3 segment, and if that is successful see if the ARP entry re-appears.


@DCooper - I did that and the ARP entry for my workstation is there. I'm not pinging a client, I'm pinging a stacked switch member. All the switches in the access stacks reply as expected; there are 2 switches in the core stack that either don't respond at all or have 66% packet loss.... like clockwork. Other switches in the same stack and subnet respond fine, and these results are consistent across the 3 VLANs I've tested from. I've been checking my routes and interfaces to make sure I didn't fat-finger something, but it looks fine - and really, if it were that, none of the switches on the subnet would respond. Thanks
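For comparing loss across stack members from several VLANs, a rough sketch like the one below can pull the loss figure out of saved ping output and flag the members that stand out. It assumes the summary line format printed by common Linux and macOS ping builds ("N packets transmitted, M received" or "M packets received"); the function names, the 5% threshold and the 192.0.2.x addresses are all made up for illustration.

```python
import re

# Matches "3 packets transmitted, 1 received" (Linux style) and
# "3 packets transmitted, 1 packets received" (macOS/BSD style).
SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def loss_percent(ping_output):
    """Return percent packet loss from ping's summary line, or None if not found."""
    match = SUMMARY.search(ping_output)
    if match is None:
        return None
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        return None
    return 100.0 * (sent - received) / sent


def lossy_members(results, threshold=5.0):
    """Return {ip: loss} for members above threshold; unparseable output is flagged too."""
    flagged = {}
    for ip, output in results.items():
        loss = loss_percent(output)
        if loss is None or loss > threshold:
            flagged[ip] = loss
    return flagged


# Hypothetical captured output for two stack members (documentation addresses).
sample = {
    "192.0.2.2": "3 packets transmitted, 3 received, 0% packet loss, time 2003ms",
    "192.0.2.3": "3 packets transmitted, 1 received, 66% packet loss, time 2010ms",
}
print(lossy_members(sample))
```

Running the same capture from each VLAN and comparing which member gets flagged would show whether the loss follows the switch or the source segment.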

Meraki Employee

Re: MS-350 stack member drops pings

So to reiterate, this is only happening when pinging the switch stack IPs? The 66% packet loss is not affecting production traffic traversing the switch stack?

Meraki Employee

Re: MS-350 stack member drops pings

Can you PM me the support case number?
Comes here often

Re: MS-350 stack member drops pings


DCooper wrote:

So to reiterate, this is only happening when pinging the switch stack IPs? The 66% packet loss is not affecting production traffic traversing the switch stack?


Yup - the layer 3 interfaces all respond fine. It's the physical IP address of the switch in the core stack. We do experience network hesitations throughout the day that I traced to LACP issues between a Cisco 3850 stack and the Meraki stack - still need to try and figure that one out; I resolved it temporarily by splitting the port channel off and using a trunked non-port-channel uplink.  I've been working with someone from your place. We're kinda slammed over here so I haven't been updating much - I can send the support case.. there's a very long thread to it!