I'm experiencing this same issue. We just deployed 36 MS390s and have had 20%-40% ping loss on the management IPs. The larger the stack, the worse the ping loss appears. We are also experiencing issues with stacks losing access to the dashboard as described above.
Has anyone seen any improvement after moving all L3 off the MS390s? I have a stack of 4 acting as a distribution/CORE. However, I have Palo Alto NGFWs that I could move all L3 functionality to. That way all stacks would be L2 only, and the "CORE" would become a distribution switch only.
We have been considering this move anyway. If doing so could help mitigate some of these issues, it will be easier to push through change management.