MS-350 stack member drops pings

Here to help

MS-350 stack member drops pings

I replaced our Cisco core / access switches with Meraki MS-350s (core) and MS-225s (access). The MS-225s are a 5 stack and a 6 stack of 48-port switches. The MS-350s are a 6 stack using 10GbE fiber uplinks to the access stacks and to NetApp/vCenter.

The copper and fiber port channels to the NetApp had major issues: the ports would frequently go designated > disabled and then disabled > designated, which caused major network outages. I have since removed all port channels to the NetApp/vCenter, which resolved that particular issue.

I have also noticed that one switch in the core stack drops 66% of pings. It is just the one switch; all the others are fine. The core stack is on a /28 subnet, with all switches using the same DNS and default route. I replaced the switch that was dropping pings, and now a different switch in the stack drops pings while the original one responds fine.

I also see in the logs that the core stack ports are doing the designated > disabled, disabled > designated dance all day and night, even on switch ports that are disabled. I also see a disabled port being flagged as backup > designated, designated > backup. All these issues started when I replaced the Cisco core; it was fine before that. Has anyone experienced similar issues?

54 REPLIES
Meraki Employee

Re: MS-350 stack member drops pings

What version of code are you running?

Here to help

Re: MS-350 stack member drops pings

9.32

Kind of a big deal

Re: MS-350 stack member drops pings

There are several bugs relating to stacks of more than 4 or 5 switches.  Is there any chance you could try dropping back to smaller stacks of 4 switches?

Here to help

Re: MS-350 stack member drops pings

I could probably work that out - looking at the installation guide it claims you can stack up to 8 switches - is that not really true at this point? My access stacks have 5 and 6 physical switches each....

Kind of a big deal

Re: MS-350 stack member drops pings

Correct, that is not really true at this point.

 

So far I have been limiting stacks to being 4 high and had zero issues.  5 might be ok ... not sure.

 

 

Either that or you could try the "beta" 10.x software.  I have not tried this.

Here to help

Re: MS-350 stack member drops pings

I've had meraki support look at this several times...they never mentioned the 4 stack issue. Good times. Thanks for the information, it certainly would explain the craziness we are experiencing.

Getting noticed

Re: MS-350 stack member drops pings

This is critical functionality. They should tell us this kind of thing before we put the switches in production, or raise an alert when someone tries to put 5 or 6 switches in a stack. This is not nice from Meraki.

Kind of a big deal

Re: MS-350 stack member drops pings

It was related to stacking bugs. It may be fixed in the really new code but I would prefer not to rely on that.

But here is a question for you ; what is to be gained by having a massive stack of say 8 switches versus 2 stacks of 4 switches?
Getting noticed

Re: MS-350 stack member drops pings

I agree on the splitting, 2 x 4 is better than 8, but if 8 is supported on paper they should support it in deployment. Otherwise, without enough information, this could end up affecting service.

Here to help

Re: MS-350 stack member drops pings

We have 6 because we have fibered-up NetApp and vCenter services that needed 20Gb port channels - so when I remove 2 switches I lose 8 10GbE ports from the stack. They tout these as enterprise switches - if it truly is a buggy stack connectivity issue they should say so. In the general information there is small print about supporting a 160Gb stack - so that would be 4 switches.

Here to help

Re: MS-350 stack member drops pings

I have always stacked up 6 or more switches for large IDFs or smaller distribution centers - not with Meraki though, always Cisco. In my current case I needed a higher density of both fiber and copper port channels for high-bandwidth services - we had 6 Cisco switches stacked up using port channels with no issues for several years. Obviously, had Meraki simply stated a 4 switch maximum I would have gone a different route, but they didn't, so I did not anticipate any issue with a small stack of 6 switches. I still can't get Meraki to admit there's an issue with using more than 6 switches, so now we're moving everything back to the old Cisco gear, as we're all getting tired of network crashes for no apparent reason and with no evidence in the event log to support what happened.

Here to help

Re: MS-350 stack member drops pings

STP considerations also - the port channel 10GbE interfaces would be on different physical stacks, introducing unnecessary complications into what should be a straightforward design - several reasons I'm not really gonna go into. The main thing is I have a stack well within the specs that is behaving like a 1st round prototype, and now I get to spend yet another Saturday moving our services back to the equipment I've spent a considerable amount of time and effort supposedly upgrading.

Here to help

Re: MS-350 stack member drops pings

I've had meraki support look at the setup several times - not a word on the 4 switch issue - this is a small uncomplicated setup - it should have been so easy to move these services to a new stack - I've never seen a stack behave this poorly - I don't want to move back to the cisco gear but meraki won't say there's an issue with using 6 switches in a stack and I'm unwilling to make a production network into a lab to test what they should have already resolved. I also have an access stack of 5 and a stack of 6 switches for the building clients on a couple of floors - why would I split a basic stack into 2 x 3 when it should work just fine? Keep in mind we replaced the exact same number of stacks that were performing great for many years - very annoying.

Here to help

Re: MS-350 stack member drops pings

Also, if I was going to use dual stacks I would use switches that support rpvst and have true load balancing and redundancy - we didn't go that route so that we could simplify.....apparently that was a bad decision!

 

Thanks for all the replies - meraki support still hasn't confirmed the 4 switch limit issue, regardless I have to stabilize the network so we'll be going back to the cisco gear for now.

Kind of a big deal

Re: MS-350 stack member drops pings

Did you work with a Cisco Meraki partner on the design for this network before it got deployed, or did you do the design yourself based on what you already had?

I think your issues relate to the design chosen (model of switches chosen, quantities and stack design).

 

If you were my customer, I would have recommended using the classic collapsed core/distribution layer, and a separate access layer. This design is documented here (although Meraki calls it aggregation rather than distribution, but same thing).

https://meraki.cisco.com/lib/pdf/meraki_campus_deployment_guide.pdf

 

For the collapsed core/distribution I would have used a stacked pair of MS425 switches (which only have 10Gbe ports).

https://meraki.cisco.com/products/switches/ms425-16

From what you describe, I would then plug all the storage and servers into this.  If you had a lot of servers/storage I would deploy another separate pair of MS425's for a dedicated server access layer, but it doesn't sound like you have enough to make this worthwhile.

 

Then for the access layer I would have used MS225's, and formed 10Gbe Etherchannels back to the core switches.

https://meraki.cisco.com/products/switches/ms225-48

I would have limited the stack sizes to 4 but preferred a smaller stack size of 3 where possible, with each stack having its own 10GbE Etherchannel uplink.  I would have used a limit of 4 because my prior experience tells me this is rock solid reliable, and because I would like to limit the oversubscription rate of the upstream links:

  • a stack of 3 x 48 ports with dual 10GbE uplinks is 144Gb into 20Gb, roughly a 7:1 oversubscription
  • a stack of 4 x 48 ports with dual 10GbE uplinks is 192Gb into 20Gb, roughly a 10:1 oversubscription
  • a stack of 6 x 48 ports with dual 10GbE uplinks is 288Gb into 20Gb, roughly a 14:1 oversubscription

You can see two stacks of 3 switches will deliver twice the performance out of the access layer as a single stack of 6 switches.  You can see how this design discourages you from stacking high, because you limit performance - and the cost is the same - so why would you?
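
For anyone who wants to redo the oversubscription arithmetic for their own stack sizes, here is a minimal Python sketch. It assumes 48 x 1Gb access ports per switch and dual 10Gb uplinks per stack, as in the list above; the function name and parameters are only illustrative and not part of any Meraki tooling.

```python
def oversubscription(switches: int,
                     ports_per_switch: int = 48,
                     port_gbps: float = 1.0,
                     uplinks: int = 2,
                     uplink_gbps: float = 10.0) -> float:
    """Return the access-to-uplink oversubscription ratio for one stack.

    Assumption: every access port runs at port_gbps and the stack has
    `uplinks` uplinks of uplink_gbps each in a single Etherchannel.
    """
    downstream_gbps = switches * ports_per_switch * port_gbps
    upstream_gbps = uplinks * uplink_gbps
    return downstream_gbps / upstream_gbps


def describe(switches: int) -> str:
    """Format the ratio for a stack of the given size."""
    return f"{switches} x 48 ports: {oversubscription(switches):.1f}:1"


for size in (3, 4, 6):
    print(describe(size))
```

The unrounded ratios come out at 7.2:1, 9.6:1 and 14.4:1, which is where the rounded 7:1, 10:1 and 14:1 figures come from.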

 

I would also have given you a guarantee that the deployment would be rock solid reliable with no performance issues.

Kind of a big deal

Re: MS-350 stack member drops pings

ps. You also don't need RPVST in this design - because every link forwards traffic (so you don't need VLAN load balancing), and the design is hierarchical (so the network looks the same to every VLAN).

Meraki Employee

Re: MS-350 stack member drops pings

MS 9.32 + MS350 do not have any defects related to stack size. Based on the symptoms, I suggest an upgrade to 10.9 which is our latest beta as it includes multiple stability enhancements. If there is an open support case, can you send me a direct message with the support case number?

Here to help

Re: MS-350 stack member drops pings

Meraki doesn't support RPVST - my comment was about why I would not have chosen Meraki for a dual core.

Here to help

Re: MS-350 stack member drops pings

We're not having bandwidth issues - I do have the 225 stacks port-channeled to the core at 20Gb - and we don't have all-fiber NetApp connectivity, so 425s are not an option.

Here to help

Re: MS-350 stack member drops pings

Sure, you can second guess the design all day - believe me, we're all sorry now that we went the Meraki route instead of Cisco. I'm not asking much from these switches - a small number of 20Gb port channels for redundancy, not bandwidth. We went over our goal at length with Meraki - very simple network design - if I wanted to complicate things I would have gone a different route. I see switch ports reporting disabled > designated on ports that have been physically disabled - I guarantee I could put in Cisco 37xx or 38xx with the same design and it would work fine. Maybe I've overlooked something; if I did, it hasn't been found yet. My next step will be to physically verify each port and where it's connected - I'm not confident this will yield any results but apparently it will have to be done.  I admit my background is large scale datacenter design, but it's not like I just threw this together because it's a small simple network.

Kind of a big deal

Re: MS-350 stack member drops pings

Meraki supports standards-based RSTP.  RPVST is a Cisco Enterprise proprietary protocol, so you won't see anything supporting that outside of the Cisco Enterprise lineup.

Here to help

Re: MS-350 stack member drops pings

exactly - that was my point.  Sorry - when I say go a different route I mean cisco.

Here to help

Re: MS-350 stack member drops pings

I did drop back to 4 switches in the 350-24 stack this morning - I'll comment when I see what happens next. Thanks

Kind of a big deal

Re: MS-350 stack member drops pings

That's a good plan.  Then you can quickly eliminate stack size as being the issue or not.

 

After that I would be tempted to try @Kapil's suggestion of using firmware 10.9.

Kind of a big deal

Re: MS-350 stack member drops pings

ps, with regard to your comment about the "160GB" stacking size - this is very much a marketing number used by Cisco Meraki and Cisco Enterprise.

 

In this case, each switch has a pair of 40 Gb/s full duplex ports.  So each stack port has an aggregate of 80Gb/s, so the pair of ports has an aggregate of 160Gb/s.
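
As a quick check of that arithmetic, a small Python sketch (the function name and defaults are just illustrative) shows how the headline figure is built from two 40 Gb/s ports counted in both directions:

```python
def stack_bandwidth_gbps(ports: int = 2,
                         port_rate_gbps: int = 40,
                         full_duplex: bool = True) -> int:
    """Aggregate stacking bandwidth as quoted in marketing material.

    Full duplex counts transmit and receive separately, so each
    40 Gb/s port contributes 80 Gb/s to the headline number.
    """
    per_port = port_rate_gbps * (2 if full_duplex else 1)
    return ports * per_port


print(stack_bandwidth_gbps())                   # both directions counted
print(stack_bandwidth_gbps(full_duplex=False))  # one direction only
```

Counted in one direction only, the same pair of ports gives 80 Gb/s, so the figure describes a single switch's stack ports and says nothing about how many switches a stack can hold.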

Kind of a big deal

Re: MS-350 stack member drops pings

I was thinking about your comment "would frequently go into designated > disabled and back to disabled > designated".

 

This almost sounds like the LACP channel had not formed properly, as the individual LACP members should not do this.  It also sounds like something is not right with the port connectivity.
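
One way to see which ports are bouncing is to count role reversals per port. The sketch below is only an illustration: it assumes you have already extracted the event log into (port, old role, new role) tuples in time order, and the port labels are made up. It does not read any real Meraki log format or API.

```python
from collections import Counter
from typing import Iterable, Tuple

Event = Tuple[str, str, str]  # (port, old_role, new_role), hypothetical layout


def count_role_flaps(events: Iterable[Event]) -> Counter:
    """Count how often a port's role change is reversed by its next change.

    Example: designated > disabled followed by disabled > designated on
    the same port counts as one flap for that port.
    """
    last_change = {}
    flaps = Counter()
    for port, old_role, new_role in events:
        previous = last_change.get(port)
        # A reversal is the previous transition run backwards.
        if previous == (new_role, old_role):
            flaps[port] += 1
        last_change[port] = (old_role, new_role)
    return flaps


sample = [
    ("sw1/12", "designated", "disabled"),
    ("sw1/12", "disabled", "designated"),
    ("sw2/3", "backup", "designated"),
    ("sw1/12", "designated", "disabled"),
]
print(count_role_flaps(sample).most_common())
```

Ports with a high count are the ones worth checking physically first, including any that are administratively disabled.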

 

Also, what kind of 10GbE connectivity are you using to the NetApp (and to vCenter)?  TwinAx?  10GBASE-SR?  Perhaps there is a more basic connectivity issue we have overlooked when considering only more complex potential faults.

Kind of a big deal

Re: MS-350 stack member drops pings

Sorry to pepper you with so many questions.  Back to the original title "MS-350 STACK MEMBER DROPS PINGS" - now that you have changed the stack to only having four members, does this issue still occur (high ping loss from a stack member)?

Here to help

Re: MS-350 stack member drops pings


PhilipDAth wrote:

Sorry to pepper you with so many questions.  Back to the original title "MS-350 STACK MEMBER DROPS PINGS" - now that you have changed the stack to only having four members, does this issue still occur (high ping loss from a stack member)?


No worries,

 

The access switches are 10GbE fiber to the 350-24s. I still have the dropped ping issue on a single switch in the stack, and I'm still seeing switch ports that are physically disabled reporting events of designated > disabled and disabled > designated. I am not seeing these events on the 4 stack, however, only on the access switches, which are 5 and 6 stacks. When the stack was 6 deep even the Meraki aggregates were having issues - I split all port channels between access and core, which did appear to resolve the issue. I agree it appears LACP is having issues - I can understand issues between Cisco and Meraki LACP if MST isn't enabled on the Cisco side, but there just isn't anything I can do about a Meraki LACP issue other than split apart and re-create the aggregate.  Meraki has informed me there is not a bug using up to 8 switches in a stack. I really don't want to split the access stacks up, but if the core stack doesn't have any issues today and this week I guess I'll have to consider doing that....

Kind of a big deal

Re: MS-350 stack member drops pings

LACP and MST are not related to each other.

 

Cisco Meraki and Cisco Enterprise switches should form LACP channels without issue.

 

Once LACP is configured (on either side) spanning tree then runs on the aggregation interface, and not the individual link members.

Here to help

Re: MS-350 stack member drops pings


PhilipDAth wrote:

LACP and MST are not related to each other.

 

Cisco Meraki and Cisco Enterprise switches should form LACP channels without issue.

 

Once LACP is configured (on either side) spanning tree then runs on the aggregation interface, and not the individual link members.


MST (multiple spanning tree) is pretty much what Meraki supports. So, they're related when you have a mixed Cisco and Meraki network. Otherwise I would have to have VLAN 1 across my entire network and not allow the Meraki core to be the STP root bridge for the network, which I want it to be. PVST+ will also force Meraki to standard STP mode, which I also don't want.  I understand LACP and MST are not related logically, but I have seen issues with a PVST stack port-channeled to a Meraki stack.

Here to help

Re: MS-350 stack member drops pings

Is there an admin that can remove this thread - it's not relevant and apparently nobody else is having the same issues I'm seeing. I'll repost a different question with a more granular topic.

Highlighted
Meraki Employee

Re: MS-350 stack member drops pings

@BadOscar I think this is a relevant topic. Has there been a support case opened where support has reproduced this with you? 

Here to help

Re: MS-350 stack member drops pings


DCooper wrote:

@BadOscar I think this is a relevant topic. Has there been a support case opened where support has reproduced this with you? 


There is - last I heard it was getting lab'd up several weeks ago.  We've been moving our critical services back to a Cisco stack - this weekend we are moving the rest of our NetApp / VMware stuff so that the network doesn't keep crashing.  Once that's done we'll see if things stabilize, as we weren't having these issues before moving to the Meraki core. I had the access stacks connected to the old Cisco core for months without any issues, so I don't think it's the stack size, and removing 2 switches from the Meraki core did nothing but move the ping drop issue to a new switch. My question did not get answered, so it seems that nobody else is experiencing these problems, and it is pointless to waste someone's time going through this thread trying to figure out the issue.

Meraki Employee

Re: MS-350 stack member drops pings

@BadOscar I am not convinced you're the only customer with these issues. Let me see how I can help on the backend; can you PM me the case number? Also, most likely what you're seeing when losing pings is ARP entries disappearing. When this happens, take a look at your ARP tables and see if the entry for the client you're trying to ping has disappeared. If it has, try to ping the device from another L3 segment, and if that is successful see if the ARP entry re-appears.

Here to help

Re: MS-350 stack member drops pings


DCooper wrote:

@BadOscar I am not convinced you're the only customer with these issues. Let me see how I can help on the backend; can you PM me the case number? Also, most likely what you're seeing when losing pings is ARP entries disappearing. When this happens, take a look at your ARP tables and see if the entry for the client you're trying to ping has disappeared. If it has, try to ping the device from another L3 segment, and if that is successful see if the ARP entry re-appears.


@DCooper - I did that and the ARP entry for my workstation is there - I'm not pinging a client, I'm pinging a stacked switch member. All the switches in the access stacks reply as expected - there are 2 switches in the core stack that either don't respond at all or have 66% packet loss....like clockwork. Other switches in the same stack and subnet respond fine - these results are consistent across the 3 VLANs I've tested from. I've been checking my routes and interfaces to make sure I didn't fat finger something, but it looks fine - and really, if it was that, none of the switches on the subnet would respond. Thanks
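
For anyone who wants to repeat this kind of per-member test in a consistent way, here is a rough Python sketch. It assumes a Linux or macOS style `ping` that accepts `-c` and prints a "% packet loss" summary line; the helper names are made up, and the addresses you pass in would be your own stack member management IPs.

```python
import re
import subprocess
from typing import Dict, Iterable, Optional

# Matches the summary line printed by Linux and macOS ping,
# e.g. "66% packet loss" or "66.7% packet loss".
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_loss(ping_output: str) -> Optional[float]:
    """Extract the packet loss percentage from ping's summary output."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def ping_loss(host: str, count: int = 30) -> Optional[float]:
    """Ping one host `count` times and return the loss percentage."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
    )
    return parse_loss(result.stdout)


def report(hosts: Iterable[str], count: int = 30) -> Dict[str, Optional[float]]:
    """Return {host: loss %} so stack members can be compared side by side."""
    return {host: ping_loss(host, count) for host in hosts}
```

Calling `report([...])` with each stack member's address from workstations in different VLANs gives comparable numbers for every member, which makes it easier to show whether the loss follows a particular switch or a position in the stack.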

Meraki Employee

Re: MS-350 stack member drops pings

So to re-iterate, this is only happening pinging to the switch stack ips? The 66% packet loss is not affecting production traffic traversing the switch stack?

Meraki Employee

Re: MS-350 stack member drops pings

Can you PM me the support case number?
Here to help

Re: MS-350 stack member drops pings


DCooper wrote:

So to re-iterate, this is only happening pinging to the switch stack ips? The 66% packet loss is not affecting production traffic traversing the switch stack?


Yup - the layer 3 interfaces all respond fine - it's the physical IP address of the switch in the core stack. We do experience network hesitations throughout the day that I traced to LACP issues between a Cisco 3850 stack and the Meraki stack - still need to try and figure that one out; I resolved it temporarily by splitting the port channel off and using a trunked non-port-channel uplink.  I've been working with someone from your place; we're kinda slammed over here so I haven't been updating much - I can send the support case..there's a very long thread to it!

Here to help

Re: MS-350 stack member drops pings

update - 

 

I was able to test using the (2) MS350 switches I removed from my core - they were stable with port channels to a Cisco 3750 stack right up until I configured them as stacked. As soon as the stack was configured in the cloud the port channels failed and would not come back until they were unstacked in the cloud - then the port channels came right up with no problem. The Meraki stacking procedure was followed when creating the MS350 stack - the event log threw the usual root > designated, designated > root events during this.....

Kind of a big deal

Re: MS-350 stack member drops pings

These port channels were from the individual stack members to the 3750?

Kind of a big deal

Re: MS-350 stack member drops pings

As a matter of interest, what model 3750 do you have and what software version are you running on it? I'll compare it against deployments I have.

Here to help

Re: MS-350 stack member drops pings

The 3750 stack had 2 ports from switch1 and 2 ports from switch2. On the MS350 side I used 1 switch with 4 ports, both as standalone and as stacked, so the MS350 side didn't change except for being stacked. The code should be 12.2.55...I can verify next week as I don't have these switches on the prod network right now

Kind of a big deal

Re: MS-350 stack member drops pings

I'm doing an installation next week to a stack of WS-C3750X-48P-S running 15.2(4)E2.  I'll let you know how I go.

 

ps. 12.2.55 is really old.  That version doesn't even support lacp fastrate.  It looks like that software came out in 2010!

  The 12.2(55) train seems to be up to a patch level of SE12.

 

Can you run the current maintenance pack of SE12 - or better still, could you try a newer version of software?  The current "gold star" version for 3750Xs (as recommended by Cisco) is 15.0.2-SE11.

Kind of a big deal

Re: MS-350 stack member drops pings

Yucky.  The release notes for 12.2(55) mention bugs and limitations to do with Etherchannel LACP and spanning tree. Search for "LACP" and  "Etherchannel".

https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst3750x_3560x/software/release/12-2_55_se/r...

 

I think you really need to get off that software version.

Here to help

Re: MS-350 stack member drops pings

the same issue with the 3850 stacks also.....these switches ran lacp between stacks and netapp for many years.... not saying there aren't issues but they have many years of successful connectivity with no issues....

Kind of a big deal

Re: MS-350 stack member drops pings

Ok.  I'll try and remember to come back next week with an update on how I got on.

Here to help

Re: MS-350 stack member drops pings

It's possible they left the code where it was for the netapp maybe? Not sure, I may have updated the code on the test stack, don't recall - I didn't setup any access to those switches from the lab network or i would check now! 

Here to help

Re: MS-350 stack member drops pings

Sounds good - I would think that if an LACP issue with the software was causing this, it would be a problem with the unstacked switch also?....I wish the 350s had the option of using the 40Gb ports as aggregates and not stack only....guess I could put 40Gb SFPs in the 350s but I don't see why I should have to!
