MX105 locked up after scheduled update to 17.10.2 completed
The title sums it up. It required a power cycle to restore service and seems fine now after monitoring for an hour or so. Nothing useful in the event logs. By "locked up" I mean, more specifically, that the WAN port apparently stopped passing traffic: the device went offline in the dashboard and the end customer reported "no internet".
I didn't get a chance to troubleshoot any further or check the lights, etc., because the end customer power cycled it themselves after verifying the ISP connection was fine.
I have a case open with support, but wanted to ask here whether anyone else has had this problem or knows of a bug that may be related.
We are running (nearly) all of our MXs on 17.10.2 and haven't seen the same issue as you. However, we have seen something similar, where one WAN connection randomly stops passing traffic and a power cycle of the box is required to get it back. We have only seen it twice, and on each occasion the box had two WAN connections (nearly all of ours do), so it was possible to do the reboot remotely. Both cases were on an MX84 (I think), and it only happened once on each box.
Thanks for sharing that. This MX has only one WAN uplink, and the only thing of note is that the ISP handoff of 1G copper was originally patched to a 2.5G port on the MX. A few weeks ago we moved it to a copper SFP for troubleshooting purposes, while trying to isolate the source of consistent blips of ~1% packet loss seen in the dashboard. It was on the copper SFP when this happened today.
Based on your experience, I think it likely was "just" the WAN port that stopped functioning somehow, and I suspect the console port and local status page would have worked if I had had the chance to try them.
Support just told me this via email, "..after taking a look at the logs from the Meraki side it appears that the device failed to boot properly for the new firmware. At current it looks like the device is still on the old firmware of 16.16 on the backend. What we may need to do is try to schedule or force another upgrade and make sure it makes it all the way through the process."
I have never come across that before, but I agree it is not good, and hopefully Cisco acknowledges it as a bug. Not knowing any better, I could easily have left it alone, assumed it was on 17, and then had that cause some issue later.
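For anyone who wants to catch this kind of silent failed upgrade, here is a rough sketch of the check I have in mind: compare the firmware each device reports against the version you expected it to land on. It assumes you already have a list of device records (for example, exported from the dashboard or pulled from the Dashboard API) and that each record carries a `firmware` string such as `wired-17-10-2`; that field name and format, the function names, and the placeholder serials are my assumptions for illustration, not something support confirmed.

```python
import re


def normalize_version(text):
    """Pull the numeric parts out of a version string.

    'wired-17-10-2' and '17.10.2' both become (17, 10, 2), so the two
    notations can be compared directly.
    """
    return tuple(int(n) for n in re.findall(r"\d+", text))


def find_firmware_mismatches(devices, expected):
    """Return the device records whose reported firmware is not `expected`.

    `devices` is a list of dicts with an assumed 'firmware' key.
    A record with no 'firmware' value is also reported as a mismatch.
    """
    want = normalize_version(expected)
    return [
        d for d in devices
        if normalize_version(d.get("firmware", "")) != want
    ]


if __name__ == "__main__":
    # Hypothetical records; the serials are placeholders.
    devices = [
        {"serial": "AAAA-AAAA-AAAA", "firmware": "wired-17-10-2"},
        {"serial": "BBBB-BBBB-BBBB", "firmware": "wired-16-16"},
    ]
    for d in find_firmware_mismatches(devices, "17.10.2"):
        print(d["serial"], "reports", d["firmware"])
```

Running something like this after a scheduled upgrade window would flag a box that still reports 16.16 instead of letting it pass as upgraded.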