Cloud Monitoring for Catalyst dashboard join troubleshooting

SOLVED
RomanMD
Building a reputation


I see the need for a complete troubleshooting section where everyone can describe the problems they hit using this apparently nice feature, which I am not sure is ready yet.

 

So, let's start. 

Trying to add to the dashboard two stacks: 9500 (stackwise-virtual) and 9300 (backplane). 

Switches running 17.3.4 and 17.3.2a.

Using the MacOS standalone app. 

Both switches have internet connectivity and onboarding checklist verified. 

Both switches have dna-advantage licenses.

 

Intermittently, one switch fails with a complaint about the certificate. It is mostly the 9300, but not always.

RomanMD_1-1655380165492.png

 

Going further with the one that passed the pre-check, I reach the last step with the message below, but the switch doesn't join the dashboard.

 

RomanMD_0-1655379708577.png

On the upstream firewall I do see UDP/33xxx traffic towards compute.amazonaws.com.

 

Really interested if anyone tried the feature already and if it worked for them. 

 

1 ACCEPTED SOLUTION
RomanMD
Building a reputation

Mystery solved! 

The problem was related to TACACS authentication. The standalone app creates a local user, meraki-user, which is used to authenticate via the TLS tunnel, along with a bunch of aaa commands. But my switches were already configured to use TACACS+ for authentication, and it seems the app did not take every possible existing configuration into account.

Now it is fixed after I've spent some time with support.


21 REPLIES
Jeff-L
Meraki Employee

Could you open a support case and include the log file? We will need to look into this situation in a bit more detail based on the specifics of the situation to determine next steps.

RomanMD
Building a reputation

This is what I have done, but I don't have very good experience with support on new features, and now I have the same feeling.

From the very few posts here on the community, I can see that each attempt is a "special" case that has to be troubleshot individually, which suggests that the process is not even beta, but at the alpha stage.

It is clearly a feature that many of us are interested in, so it would be nice if we could see some fast fixes.

PhilipDAth
Kind of a big deal

I've probably remembered this wrong - but wasn't the minimum IOS-XE version 17.7 or something like that?

RomanMD
Building a reputation

docs are saying 17.3.x - 17.7.x
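As a quick sanity check, the version range quoted above can be compared against the releases mentioned in this thread. This is a minimal sketch that assumes the 17.3.x - 17.7.x range from the reply above is accurate (it has not been verified against the official docs); the function names are made up for illustration.

```python
import re

# Range taken from the reply above ("17.3.x - 17.7.x"); treat it as an assumption.
SUPPORTED_MIN = (17, 3)
SUPPORTED_MAX = (17, 7)


def release_train(version):
    """Return (major, minor) from a version string such as '17.3.2a'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: " + version)
    return int(match.group(1)), int(match.group(2))


def in_monitoring_range(version):
    """True if the release train falls inside the quoted range."""
    return SUPPORTED_MIN <= release_train(version) <= SUPPORTED_MAX


if __name__ == "__main__":
    for version in ("17.3.4", "17.3.2a", "17.6.3", "17.8.1"):
        print(version, in_monitoring_range(version))
```

Only the major and minor numbers are compared, so maintenance suffixes such as the "a" in 17.3.2a are ignored.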

PhilipDAth
Kind of a big deal

IOS-XE is pretty famous for its bugs.  I would try moving to a newer gold star release.

For management, 17.8.1 and onwards. But don't... just don't. I think I bricked my 9300 doing it today.

For monitor, from 17.3.x and onwards. And that works fine. Except there seems to be some trouble if you don't have any switches yet in your combined network, because it has to be set to "Unique Client Identifier - Beta", and that's an "MX" thing that you can only set if you also have switches already in a network (as far as I can see). Then everything works (for monitor).

> "Unique Client Identifier - Beta", and that's an "MX" thing that you can only set if you also have switches already in a network

 

Specifically, layer 3 switches that are doing routing.

Copy/pasting this from my response to Thomas in another thread: it shouldn't be necessary to manually enable Track By Unique Client Identifier for Cloud Monitoring, as it is enabled as part of the onboarding process.

Cloud Management is not yet available (still in EFT). Do not attempt to migrate your switches to full cloud management unless you are directly engaged with the Meraki PM team as part of the EFT group.

So I have just bricked my 9300 ... dammit 😕

Can I even boot it into ROMMON to load IOS back onto it?

If you create a ticket with Meraki support, the team can help you get this fixed.

Well, so far support does not seem to know what I'm talking about.

But I'm off to bed; perhaps I will have more luck tomorrow.

*sigh* When can we expect this to be available?

RomanMD
Building a reputation

Yes, I've seen your comments in the other thread and I went ahead and added an MS390 to the network to set the Unique Client Identifier, however, now I am stuck at the pre-check with "Device is not eligible for onboarding. Reason: Device SUDI was not found" on both switches, and I can't get out of it...

Strange, I have not had any problems with the "tool" and the pre-checks.

My only problem right now is that I have converted a Cat9300 to managed mode, thinking this was OK (since people from Meraki seem to write about it on LinkedIn).

But it seems that it was not OK. And now I have a Cat9300 that I cannot use for anything 😕

Support is not helping, so it's kind of uphill.

 

I would have liked to try monitor out a bit more. But I only have that one Cat9300 available in my lab (that is now a doorstop).

RomanMD
Building a reputation

Still not working to add the device for monitoring. 

Checking the logs, everything seems to be correct, and the application queries for a device import job ID, whose status does not look promising.

{
  "results": {
    "connectionState": "NOT_CONNECTED",
    "capabilitiesState": "PENDING",
    "configState": "UNKNOWN"
  }
}

 

Minutes later, I have queried the same endpoint and got this status, but the device still does not show in dashboard.

 

{'results': {'capabilitiesState': 'DIRECT_STARTED',
             'configState': 'UNKNOWN',
             'connectionState': 'CONNECTED'}}

 

  

Giving up for today.

RomanMD
Building a reputation

A new day has arrived, but there is no light at the end of the tunnel.

Today, the job from yesterday returns this beautiful information...

{'error': 'timed out',
 'results': {'capabilitiesState': 'DIRECT_STARTED',
             'configState': 'UNKNOWN',
             'connectionState': 'CONNECTED',
             'status': False}}

 

 

RomanMD
Building a reputation

Mystery solved! 

The problem was related to TACACS authentication. The standalone app creates a local user, meraki-user, which is used to authenticate via the TLS tunnel, along with a bunch of aaa commands. But my switches were already configured to use TACACS+ for authentication, and it seems the app did not take every possible existing configuration into account.

Now it is fixed after I've spent some time with support.

2nd update: one of my 9k's is working / shows up in the dashboard, the other doesn't. Same AAA config, no errors in AAA authentication / authorization. Unfortunately there is no way to troubleshoot. I see a non-stop set of commands getting (successfully) authorized in my switch debugs, but this switch never shows up in the Meraki dashboard.

 

1st update:

 

I added "meraki-user" to ISE and ensured that the user gets priv-15 / full command authorization, and it looks like onboarding into the Meraki cloud is now working. The root issue seems to be that the added Meraki Cloud AAA config enables local authentication for the meraki-user connection to the switch but doesn't enable local authorization for it. So meraki-user authenticates locally, but then the switch attempts to use the (existing, in my case) AAA TACACS authorization config for meraki-user, which fails unless you have this user in your AAA server (in my case, Cisco ISE).

 

---

 

 

 

hmm ok I'm probably having the same problem.  Both my 9300's (2 standalone 9k's running 17.6.3) show successfully onboarded in the app but never show up in the cloud.  I did have a  TACACS config running, and I see the new Meraki AAA config, but how to resolve now?

Hi jefanell 

How did you add the "meraki-user" to ISE, without specifying the password? 

Regards 
A.Foerby

I believe I am running into the same issue, but after stripping my AAA down to use local only I am still getting Login Authentication failed for Meraki User.

 

Jun 30 09:39:33 CDT: %SEC_LOGIN-4-LOGIN_FAILED: Login failed [user: meraki-user] [Source: xx.yyy.zzz.www] [localport: 2222] [Reason: Login Authentication Failed] at 09:39:33 CDT Thu Jun 30 2022
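When collecting these failures from several switches, it can help to pull the fields out of the syslog line. The sketch below is written against the exact message format pasted above and nothing else; the function name is made up, and other IOS-XE releases may format the line differently.

```python
import re

# Pattern built from the single log line quoted in this thread.
LOGIN_FAILED = re.compile(
    r"%SEC_LOGIN-4-LOGIN_FAILED: Login failed "
    r"\[user: (?P<user>[^\]]*)\] "
    r"\[Source: (?P<source>[^\]]*)\] "
    r"\[localport: (?P<port>\d+)\] "
    r"\[Reason: (?P<reason>[^\]]*)\]"
)


def parse_login_failed(line):
    """Return user, source, port and reason as a dict, or None if no match."""
    match = LOGIN_FAILED.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["port"] = int(fields["port"])
    return fields


if __name__ == "__main__":
    sample = (
        "Jun 30 09:39:33 CDT: %SEC_LOGIN-4-LOGIN_FAILED: Login failed "
        "[user: meraki-user] [Source: xx.yyy.zzz.www] [localport: 2222] "
        "[Reason: Login Authentication Failed] at 09:39:33 CDT Thu Jun 30 2022"
    )
    print(parse_login_failed(sample))
```

Filtering on the user field separates failures for meraki-user from ordinary failed logins.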

 

Do you happen to know the commands support gave you for resolving this?

RomanMD
Building a reputation

Well, support did not give me any commands, but just put me on the right path.

First, your switch should already have aaa new-model in the config, otherwise it will not work.

Then compare your existing AAA commands with the AAA commands that the application will configure on the switch.

The application will configure new VTY Lines and add the authentication group MERAKI:

line vty 32 33
 access-class MERAKI_VTY_IN in
 access-class MERAKI_VTY_OUT out
 authorization exec MERAKI
 login authentication MERAKI
 rotary 50
 transport input ssh

 

Then it will add some AAA commands to authenticate locally:

aaa authentication login MERAKI local
aaa authorization exec default local 
aaa authorization exec MERAKI local 

 

And if you also have aaa command authorization configured, as in the lines below, then this config does not allow meraki-user to authorize, and therefore you must have a special rule that allows "meraki-user" to get privilege level 15.

aaa authorization commands 0 default group tacacs+ local
aaa authorization commands 1 default group tacacs+ local
aaa authorization commands 15 default group tacacs+ local

 

In my case, it was an authorisation problem. In your case it seems the authentication is not working. Do you have the aaa new-model enabled already?
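The checks described in this post can be run against a saved copy of the running config. The following is a rough sketch, not a tool from Meraki or Cisco: it only looks for the exact lines quoted above, the function name is made up, and the conclusion it prints is the one reached in this thread rather than documented behaviour.

```python
import re


def check_aaa(running_config):
    """Return a list of findings for the AAA lines discussed in this thread."""
    lines = [line.strip() for line in running_config.splitlines()]
    findings = []
    if "aaa new-model" not in lines:
        findings.append("aaa new-model is missing")
    if "aaa authentication login MERAKI local" not in lines:
        findings.append("MERAKI login authentication list is missing")
    if "aaa authorization exec MERAKI local" not in lines:
        findings.append("MERAKI exec authorization list is missing")
    # Command authorization that tries a server group first is what caused
    # the authorization failure described above.
    remote = [
        line for line in lines
        if re.match(r"aaa authorization commands \d+ default group ", line)
    ]
    if remote:
        findings.append(
            "command authorization uses an AAA server group (%d lines); "
            "meraki-user needs a rule granting privilege 15" % len(remote)
        )
    return findings


if __name__ == "__main__":
    with open("running-config.txt") as handle:  # hypothetical file name
        for finding in check_aaa(handle.read()):
            print("-", finding)
```

An empty result only means none of the patterns from this thread were found; it does not prove the onboarding will succeed.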

 

Thanks RomanMD,

Note that some switches do not need it, but some L3 switches required the above commands before onboarding.
Thank you,

M.MAKARA