VPN stops passing traffic between Meraki Security Appliances and Cisco ASAv devices

Gord719
Here to help

We have a situation where we manage site-to-site VPNs between Meraki devices and Cisco ASA devices. We can establish a site-to-site VPN fine, but after an undetermined, seemingly random amount of time the tunnel stops passing traffic, and we have to force a rekey on the ASA side or force the VPN down and back up on the Meraki portal side by turning the VPN settings off and back on.

We have been back and forth with support for both ends, set the recommended phase 1 and phase 2 timeouts, disabled DPD, and changed other miscellaneous settings, but the issue remains. We always try to be on the latest firmware on both ends.

I am out of ideas.

The strange thing is that the tunnel shows the green "up" icon in the portal, and the ASA side still shows "active", but no traffic passes until you reset or rekey to force the tunnel to rebuild.

Looking for recommendations, ideas or feedback.

70 REPLIES
anthonypapaleo
Conversationalist

Is the ASAv running 8.3 code or above? Meraki has an issue building tunnels to ASAs below this code level.

Yes, 9.8.2 at this time.

gwermter
Conversationalist

Hmm, we've seen similar issues with ASAv 9.5.x versions that were resolved by upgrades to 9.6.3 and later.

 

What we find is that duplicate IPSEC SAs are being created when they shouldn't be.  The bug can be confirmed on the ASA by running "show crypto ipsec sa inactive" and looking for an inactive tunnel.  Performing "clear crypto ipsec sa inactive" on the ASA is a workaround.  My understanding is that 9.8.x versions were unaffected.

 

Interesting. We do have a couple of older ASAs running 9.5.2 that had locked-up VPNs over the past few days. I checked "sh crypto ipsec sa inactive" and it came back with 0, so I must have a different issue here. Good feedback though. I hadn't heard that one before.

CIreland
Conversationalist

Hi, we had the same issue here. It turned out to be a problem with the timeouts and NAT-T.
We ended up with phase 1 at 28800 and phase 2 at 14400, and Meraki support disabled NAT-T for that endpoint (a configuration override that only support can do). It has been stable for us since.

AlZ
Conversationalist

Hi, has this problem come back?  We are having this same issue.  I'm hoping that this is the fix.

 

Thanks

Al

Rob2
Here to help

Thanks everyone for the help.  I should also add that I opened a case with Meraki support and they disabled NAT traversal on the tunnel by changing some backend settings on the MX that we do not have access to.  That seems to have helped significantly - we have not had the tunnel go down in over a month at this point after making all the changes in this thread and having Meraki disable NAT-T.  We also adjusted the timeouts to 86400 for phase 1 and 28800 for phase 2.

 

I am still not convinced that the issue is resolved, but there is no question that things are much improved over where they were with the default settings.

 

 

Hi,

 

We have exactly the same issue with MX-64s and an MX-100 connecting to a third-party Juniper firewall. The only causes that we and Meraki support can think of are the ESP window or the fact that Meraki has anti-replay protection enabled. We had an issue with some ASAs connecting to the same Juniper, where we had to disable anti-replay because the Juniper sends packets out of sequence.

Support cannot pin down the issue because the tunnel is up but packets just drop. Our phase 1 lifetime is 28800 and phase 2 is 3600. I am about to return these devices because of this issue, which I do not want to do.

 

Any help would be appreciated 

 

Thanks

 

Keith

We have the same issues and have had a case open for months. We finally had a breakthrough. With firmware 15.7 Meraki changed the anti-replay window value from 4 to 32. Juniper has a default value of 64. We have requested that this be made configurable, either by the end user or by support staff. After applying the beta code, all has been smooth. We are still working out a few dead peer detection issues on lesser-used subnets.
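To illustrate why a small anti-replay window matters when a peer delivers packets out of order, here is a minimal Python sketch of a sliding-window replay check in the style described by RFC 4303. The `ReplayWindow` class, the `dropped` helper and the arrival sequence are made up for illustration; this is not Meraki's or Juniper's implementation, and the window sizes are simply the values mentioned in this thread.

```python
class ReplayWindow:
    """Simplified ESP anti-replay check (sliding window, no extended sequence numbers)."""

    def __init__(self, size):
        self.size = size      # how many sequence numbers behind the highest are still accepted
        self.highest = 0      # highest sequence number accepted so far
        self.seen = set()     # accepted sequence numbers still inside the window

    def accept(self, seq):
        if seq > self.highest:
            # New highest: slide the window forward and forget what fell out of it.
            self.highest = seq
            self.seen.add(seq)
            self.seen = {s for s in self.seen if s > self.highest - self.size}
            return True
        if seq <= self.highest - self.size:
            return False      # too old: left of the window
        if seq in self.seen:
            return False      # duplicate (replay)
        self.seen.add(seq)
        return True


def dropped(size, arrivals):
    """Return the sequence numbers a receiver with this window size would drop."""
    window = ReplayWindow(size)
    return [seq for seq in arrivals if not window.accept(seq)]


# Hypothetical arrival order: packet 3 is delayed and shows up after packet 10.
arrivals = [1, 2, 4, 5, 6, 7, 8, 9, 10, 3]

print(dropped(4, arrivals))    # → [3]
print(dropped(32, arrivals))   # → []
```

In this toy run the receiver with a window of 4 discards the late packet, while a window of 32 accepts it. Whether that alone explains a tunnel that stops passing traffic entirely is not established here.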

Which devices? Most of my devices show 13.33, which the portal considers current. The beta firmware I see only goes up to 14.xx; there are no 15.x options at all.

We had to have support push it, as we did not have the option available either.

OHTorx, are you referring to v14.13 changelog entry?

"Non-Meraki and client VPN traffic may be dropped when packets arrive out-of-order due to an overly restrictive anti-replay window size"

Which appears to be fixed in 14.26 changelog:

  • Fixed an issue where non-Meraki and client VPN traffic would be dropped when packets arrived out-of-order due to an overly restrictive anti-replay window size

 

If so, have you experienced any other issues on this beta firmware?

We tried 14.26 and it did not work. Only 15.7 seemed to fix it, as that is when they changed the replay value to 32. We have not seen any other issues since applying it.
NordOps
Getting noticed

Thanks for this info. I noticed you referenced your case below, and I gave that to support since they haven't been of much help. What I've noticed with our third-party VPN (site-to-site to a vShield VMware firewall):
- Internet on both sides up
- Third party firewall is in a private cloud with multiple carriers
- Meraki Auto-VPN's don't drop
- I can kick the tunnel off by booting the Meraki
- It recovers on its own after about 8 minutes
- Event logs always have "Non-Meraki / Client VPN negotiation msg: phase1 negotiation failed due to time up" but nothing really useful.

Thanks for referencing your case. I hope that helps and they dig something up.

Hi everyone,


I've been having some major issues with a Meraki MX80's VPN to one site previously running a Cisco 89x series and now a Ubiquiti EdgeRouter ER8-Pro. 

 

MX80 is on firmware 13.28. IPSEC has 3DES/SHA1 with lifetime of 86400 for both Phase 1 and 2.

 

What I've found is that if a change is made in the site-to-site VPN settings - such as adding/removing a subnet on any of the peers - the Meraki closes ALL tunnels and recreates them. When this happens, certain types of traffic stop passing through the tunnel to this site. For all intents and purposes the tunnel is up, however not everything works. 

 

At the Cisco/Ubiquiti end, this manifests as failed authentication attempts to domain controllers, file shares stop working etc. The only way to fix it is to restart IPSEC on the Cisco/Ubiquiti end. I can recreate this like clockwork by simply making a change to one of the peers on the Meraki console. Within a few seconds, the tunnels drop and recreate fine but with only some of my traffic passing through. 

 

 

Tonight I've had a breakthrough. By adjusting the MSS down to a conservative 1300 on all interfaces, the problem has magically gone away. As soon as I made the change, traffic started flowing freely. I didn't need to restart IPSEC, it literally just came good. I then made 10+ changes to the Meraki peer console to try and force it to break, and each time the tunnel would drop, recreate and resume normal operation. 


Obviously it's too early for me to say whether this has completely resolved it, but I thought it worth sharing as I've tried almost everything else and hopefully it points someone in the right direction.  

 

EdgeOS Commands : 

set firewall options mss-clamp interface-type all
set firewall options mss-clamp mss 1300
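As a rough check on why a lower MSS could matter, the following Python sketch adds up the per-packet overhead of ESP in tunnel mode. It assumes IPv4 without options, 3DES-CBC with HMAC-SHA1-96 (as in the settings mentioned above) and a 1500-byte path MTU; the function name and structure are only illustrative, and the thread does not confirm that fragmentation was the actual cause.

```python
def esp_packet_size(mss, nat_t=False, block=8, iv=8, icv=12):
    """Outer IPv4 packet size for one full TCP segment sent through ESP tunnel mode.

    Defaults model 3DES-CBC (8-byte block and IV) with HMAC-SHA1-96 (12-byte ICV).
    """
    inner = mss + 20 + 20                      # TCP payload + TCP header + inner IPv4 header
    to_encrypt = inner + 2                     # ESP trailer: pad-length and next-header bytes
    padded = -(-to_encrypt // block) * block   # round up to the cipher block size
    outer = 20 + 8 + iv + padded + icv         # outer IPv4 + ESP header (SPI, sequence) + IV + data + ICV
    if nat_t:
        outer += 8                             # UDP header added by NAT-T encapsulation
    return outer


for mss in (1460, 1300):
    print(mss, esp_packet_size(mss), esp_packet_size(mss, nat_t=True))
```

Under these assumptions a full 1460-byte segment becomes a 1552-byte packet (1560 with NAT-T), which no longer fits a 1500-byte MTU, while a 1300-byte segment becomes 1392 bytes (1400 with NAT-T) and leaves headroom.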

akan33
Building a reputation

I have been struggling with an issue between a Meraki MX and an ASA since last October 🙂 Cisco and Meraki are both engaged, and although we keep trying things in live troubleshooting sessions, the root cause has not been found.

The latest finding was that the Meraki was randomly changing its WAN IP between the firewall's physical interface address and the VIP, which also broke tunnels at random. We were on 13.28 and they recommended upgrading to 14.30 (beta). We just found that the issue persists, so we are running out of ideas and it is very frustrating.

Good luck!

@akan33 Have you talked with support about beta 15.7 and the change it made to the anti-replay value, from 4 to 32? I do not know ASA; is it possible to change its value down to 4 to match the Meraki's pre-15.7 value?

Hey guys,

 

 

Just jumping in to say that, assuming the issue is related to the anti-replay value as @OHTorx is advising, you should be able to change the anti-replay window size on the ASA side:

https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/sec_conn_dplane/configuration/15-mt/sec-ipsec-data...

Hopefully this might be a less disruptive test than a firmware upgrade.

 

Keep in mind that the 15.X release is currently on unreleased Beta and we are using it for customers who face particular issues that are resolved in that specific release, which is why you are unable to schedule upgrades to it manually.  

 

Hope this helps.

 

Giacomo

Please keep in mind that what I post here is my personal knowledge and opinion. Don't take anything I say for the Holy Grail, but try and see!
Appreciate who helps and be respectful of every opinion and every solution offered.
Share the love, especially the Meraki one!

How exactly would increasing the default value of 64 to, say, 1024 on the ASA side have any effect? It seems the Meraki has the lower value and so cannot 'queue' or pass traffic once it sees the first 'bad' sequence number. Shouldn't the value need to be changed on the Meraki end only?

 

Please advise. 

akan33
Building a reputation

Almost 1 year now with this ongoing issue, having to manually reset the tunnel every now and then. After many conversations with Cisco and Meraki engineers, we upgraded the ASA to 9.4.4 because the behavior was supposedly a Cisco bug, and nothing changed. I am sorry, but this is the worst technical experience I have had in more than 12 years.

Hey guys,

 

We had another couple of cases regarding this issue. It seems that there are different solutions depending on the circumstances. In the last case I'm aware of, Cisco TAC was involved and confirmed a bug on the ASA with ID CSCso70269.

 

I would suggest to cast an eye on it and try the same commands to see if you are hitting that bug too.

 

Hope this helps!

 

Giacomo


We have been having the same issue with Meraki VPN to SonicWALL or Cisco ASA. The tunnel shows active but cannot communicate to the remote network/s. If we have multiple networks, maybe 1 out of 6 will be accessible. Have to manually renegotiate tunnel as a temporary fix.

 

Three weeks ago, Meraki support tech mentioned this can happen if both sides of the tunnel are enabled for NAT-Traversal. Don't know why it's not an issue with any other manufacturers if both sides are enabled but I have learned that any knowledge I have about Networking and security best practices might not be applicable to Meraki Firewalls. 

 

Anyways, the Meraki tech disabled the NAT-T for the specific VPN tunnel on the back-end, we're going on three weeks without having to manually renegotiate the tunnel. 

akan33
Building a reputation

Regarding bug CSCso70269: we were running 9.1 and are now on 9.4.4, since the Cisco engineer told us we could be hitting other bugs too, and that bug is in version 8.x. But on 9.4.4 we have had another issue as well.

If NAT-T is an issue, someone should have raised it with me during the year I have been struggling with this ticket. I don't know how many engineers we have had already (if I say 10 different engineers, I am not lying).

All, I have the exact same issue and worked with Meraki to resolve it:

I have an MX68 on one end and an ASA5525 on the other. The MX is running MX14.39 code and the ASA is running 9.8(3)-18. I have the phase 1 and phase 2 lifetimes set exactly the same on both sides, 86400 and 28800 respectively. After reading this thread, I went ahead and called Meraki and had them disable NAT-T on the Meraki. They also mentioned that they have dead peer detection enabled, and suggested I enable it on my ASA. So I went ahead and added "isakmp keepalive threshold 10 retry 5" under the tunnel-group ipsec-attributes. Meraki stated that these settings match the values used on the MX by default. I just made the change, so let's see how long it lasts. Until now my tunnel would drop almost every day. It's Friday now, so I should know for sure by Tuesday. I will reply back once Tuesday comes.

akan33
Building a reputation

I also configured exactly the same threshold and retry values. We already gave up on this, and our tunnel just keeps failing after more than a year and a half. We engaged Cisco ASA engineers (a positive experience, as usual) and Meraki (who really need to improve), and the cause was never found. All the best for your case.

UPDATE:

Since I had Meraki support make the last change, I haven't had a single issue. We're going on 2+ weeks with no dropped tunnels, no half tunnels, and no issues with SAs.

Great!  Thanks for replying and letting everyone know.

ITPointeMan
Here to help

We're having the exact same issue using a SonicWall NSA3500 on the hub side. We tried just about everything you did to get some stability, to no avail. I can only guess now that it's an issue with Meraki 😞

 

Please let us all know if you have had a fix for this issue.

Rob2
Here to help

This same issue has been killing us for almost 2 years.  I have the exact same symptoms you describe on multiple ASA-MX VPN tunnels.  ASA-ASA tunnels, ASA-SonicWall tunnels and MX-MX tunnels are all fine.  Did you ever find a fix?  

gwermter
Conversationalist

Did you ever find a fix?

Sort of.  As I suggested above, we found that the ASA bug was supposed to be resolved in 9.6.3, but in practice we've still had occasional issues through 9.8.1 devices.  We think this is identified in the ASA bug tracker as "Stale VPN Context entries cause ASA to stop encrypting traffic despite fix for CSCup37416 - CSCvb29688."

 

We've successfully mitigated the issue by using the following tunnel settings on both sides:

Phase 1:

  • Enc: 3DES
  • Auth: SHA1
  • DH Group: 2
  • Lifetime: 86400 seconds

Phase 2:

  • Enc: 3DES
  • Auth: SHA1
  • PFS: Off
  • Lifetime: 86400 seconds

It is important that these be the ONLY accepted/offered tunnel parameters.

It is also important that the ASA have NAT Exempt enabled for the tunnel.

Nope.  Same here. Since 2014. 

 

Recent ASA code 9.8.2 seems to have helped, as has running 12.26 on the MX side, but the tunnels still go down once in a while.

It has gone from as often as multiple times a day to a couple of times a week now, which is much, much better than it has been.

akan33
Building a reputation

Has anyone found a fix for this scenario? I still have a random issue between an MX600 and an ASA running 9.1(7)4. The tunnel always remains up but traffic stops going through. It is very annoying and has been going on for 2 months now.

Hi,

 

I still have not had any tunnel outages after performing the steps noted in my post above.  Still not totally convinced everything is fixed, but things are certainly WAY more stable than they were before.  I saw exactly what you saw - tunnel appears to be up on both sides but traffic stops passing.  Only way to fix was to clear the tunnel from the ASA side and allow it to rebuild.  This has not happened at all since disabling NAT-T on the Meraki (must be done by support) and adjusting the timeouts on both sides.  Also make sure to disable the data-based tunnel lifetime on the ASA, although this was never my problem.

akan33
Building a reputation

I see. I have a case open with Meraki, but I think they have already given up 😄 I don't have the data lifetime set. It is actually working without issues with another ASA running the same config and version, and I don't know why it fails so much with this one.

I have read above that changing to 3DES also helped someone. I will give it a try, although I don't see how the encryption algorithm would be related; at this point I find it quite frustrating. I am escalating in parallel with Meraki again (no answer from them since Dec 19th 🙂) to see if NAT-T has something to do with it. I am curious how that could actually help, since I thought NAT-T is needed when there is a PAT device in the path.

 

thank you. 

akan33
Building a reputation

Just for the record, changing to 3DES hasn't changed anything; the tunnel keeps failing. I have restored AES256 and changed the lifetime to 28800 seconds (the Meraki default) instead of the 86400 seconds I had before. I don't know if rekeying earlier would help somehow.

akan33
Building a reputation

Changing to 28800 seconds made a difference. I don't know if it is solved, but it looks more stable. Meraki Support doesn't even respond anymore.

akan33
Building a reputation

It seemed more stable, but it has started to happen again. Meraki doesn't provide any serious feedback on this one. They asked me whether NAT traversal is activated on the ASA. It is. I think they are already running out of ideas. I have another ASA in the same building that has been working steadily for a long time.

Yeah, it's totally ridiculous.  I just had one stop passing traffic this weekend.  When I looked at the ASA side (since you can't see s*** on the Meraki) there were two tunnels up and active - one with the ASA as the initiator and one with it as the responder.  Had to "cl isakmp sa" and everything started working again (but who knows for how long).  Meraki support is just terrible.  Every time I reach out to them I get a tech that can't really help at all.  Cisco TAC they are not.  They certainly act like they know what they are doing, but nothing ever really gets fixed.  They just don't have the knowledge and experience to support the product properly when something unusual goes wrong.

 

It is disappointing that this is even an issue.  I've done SonicWall-ASA tunnels, Watchguard-ASA tunnels, Fortinet-ASA tunnels - all work perfectly.  Meraki is owned by Cisco and they can't create a stable tunnel with the most industry-standard firewall imaginable.  Ridiculous.  So frustrated.  Maybe someone else can stay on support about this and give them a hard time.  I just don't have the time or heart any more.

akan33
Building a reputation

My ticket has been open since December. I have contacted them multiple times with no success at all. Now someone else has taken the ticket and they keep asking for the same basic information again: what is your config, take a capture, etc. I am sorry to say it, but this is not a next-generation firewall: unstable tunnels, VRRP HA, no outgoing NAT for IPs other than the WAN interface, terrible support, etc. My level of frustration with this product is getting really high, and I am very disappointed too.

Jimmy401
Conversationalist

Agreed, Akan33. We are facing the same issue, and every time we call in they have a new fix which works for a couple of days, and then the same issue returns. I don't think the MX is an enterprise-level device.

akan33
Building a reputation

So it seems there are multiple customers complaining about this, they should take this situation more seriously from my point of view as it is not isolated.

Same problem here. Mainly VPN to Sonicwalls but also Azure and Fortigate VPNs. Only solution is to disable/enable VPN temporarily or move to AutoVPN.

ITofTN
Here to help

I am having the exact same issue between a Meraki MX80 HA pair and a WatchGuard firewall. I have marked this case HIGH PRIORITY CRITICAL: when I lose this tunnel, the entire organization is down.

Basically, HA and all failover work perfectly, and then, either at the end of the phase 2 key lifetime or at random, the VPN just stops. Phase 1 appears to be up, and we have verified all settings on both sides and followed the Meraki docs for WatchGuard. Either side can rekey to bring the tunnel back up, but left alone it just hangs.

I am using an HA pair with virtual IPs for the best recovery, with two ISPs all cabled the same and a direct heartbeat cable between the units, per Meraki best practices.

They had me move to 14.20 for an initial HA pair problem where STP was not being passed on a security monitoring device. We got that resolved, and it was not related to the 14.20 firmware, so I went back to the stable release, 13.27.

On stable release 13.27, the MX would randomly lose its virtual IP and the tunnel would try to establish on the non-virtual IP, which of course wouldn't work. They beta-pushed me up to 14.27, and now I'm back to my original problem.

We use standard negotiation with a phase 1 lifetime of 28800 and a phase 2 lifetime of 14400, and everything matches to a tee. The WatchGuard keep-alive is off because it is not supported with non-WatchGuard peers; dead peer detection is on.

They have captured packets and don't see anything wrong in the tunnel setup or settings. They can't explain why it just stops. My application provider has over 100 tunnels connected from all different manufacturers without problems, and this is the only one they are having trouble with.

Meraki is so good at so many things, but falls short on some of the most basic ones, like this, or the lack of logging when a layer 7 firewall rule blocks a country.

I had the same setup with a SonicWall and never had any trouble with HA or tunnels, but now I have trouble. They are gathering packets and so on, and I'm trying to get to engineering, but that doesn't reduce the heat I'm getting.

Has anyone got a resolution? I'm tempted to go back to the SonicWall for this tunnel. I still have it running my Verizon Wireless Private Network tunnel, because Meraki doesn't support address translation on a tunnel or truly support BGP, which would let me get rid of the translation.

Anyone......Car54 Anyone??????? HELP

 

akan33
Building a reputation

Marking the case as high priority won't make any difference, in my experience.

A few months ago I had my firewall running on 12.x and they asked me to move it to 13.28, with the same result.

The behavior is as you describe. I have escalated this issue to a Cisco ASA engineer and will keep you posted, but I would recommend you do the same, as Meraki is not helping at all on this issue. It is very frustrating: they keep passing the ticket among engineers and I have to explain the same story every time, without any progress.

Regards.

Thank you. I refused to get off the phone, and they didn't pressure me to; I'm over 2 hours in right now. So far they have parsed packet captures and verified that my settings and the remote vendor's are correct. We eliminated any setup errors on both our parts and attached screen captures.

They verified that dead peer detection is fine and correct. I'm supposedly heading to higher-level engineering.

What we are down to is this.

Describe it this way: Site A is me (Meraki), Site B is the WatchGuard.

When this VPN-down event occurs, Site B tries to send packets to Site A (seen in the packet capture).

The phase 1 tunnel is up; as a matter of fact I get a green light on the Meraki, but the Meraki phase 2 is actually down. The green light only reflects phase 1. If you reboot the primary, or turn the VPN off and on again on the VPN page, phase 1 comes down and everything immediately restarts, and I'm good to go again. Both sides confirmed that dead peer detection is working properly. It seems related to the phase 2 key lifetime, but not always; it is random too.

We have gathered logs, screenshots, and everything else, because I don't want this escalated and then easily dismissed. Based on this thread and my gut, this issue is more than a simple misconfiguration or setting problem.

 

More to come. Thank you for the replies. It's good to know I'm not alone.

akan33
Building a reputation

Yeah, phase 1 remains up, but no SPIs are built on the remote end. Only resetting IPsec or bouncing the tunnel works.

I am trying to collect some debugging from the ASA to see if Cisco can help here.

@akan33 you are describing the same exact issue I have been troubleshooting for the last 4 months.  View my notes above and how we resolved.  Meraki code 15.7 changed the anti-Replay value from 4 to 32.  That fixed my Meraki to Juniper VPN troubles.  Either upgrade or on the other side have them change this window to = 4 to match your current code's value.

Update: after being here until 11:30pm last night, going through three call centers, and refusing to get off the phone, they captured all the information they wanted except an actual down event. My colleagues on the other side can see that Meraki support disabled NAT-T on the Meraki side, which is an option we cannot see, and (FINGERS CROSSED) since last night I have not had one hiccup. We did temporarily remove the secondary endpoint on the WatchGuard side just for testing, but plan on putting it back if everything goes well today. I was also very patient and gave the techs time to analyze the captures, because we all know what it is like to work in tech support.

The WatchGuard guys asked if I wanted anything else changed on their side. I told them not to change anything, so on the WatchGuard side we still have dead peer detection at 5 tries, 20 seconds; no keep-alive, because that is WatchGuard-to-WatchGuard only; and NAT-T on, which is the default on most firewalls now. Apparently NAT-T on the Meraki might be what causes the problem.

I will keep everyone posted. For as good as they are in so many areas, this core product needs more work.

 

I'll 

@ITofTN you are going through the same steps I did for 4 months.  Have your Meraki guy look up my Case 02390711 and talk to the engineer on it.  From your description the Anti-Replay is the issue.  We did the NAT-T thing also with no success.

Meraki uses "lifetime-kb-unlimited" and there is no way to change this. We had an issue with MX VPNs to Cisco ASA, and this is what Meraki support recommended. I believe this is also why Azure tunnels won't stay connected. I believe you need an ASA running 9.1(2) or higher to use this command.

 

On Cisco ASA you have to specify this in crypto-map:

 

crypto map <map-name> <seq-num> set security-association lifetime kilobytes unlimited
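For a sense of how often a data-based lifetime would force a rekey if it were left enabled, here is a rough Python calculation. It assumes an ASA default of 4,608,000 kilobytes (check your own running config), treats a kilobyte as 1,024 bytes, and uses made-up steady throughput figures, so the results are only an order-of-magnitude guide.

```python
def seconds_until_data_rekey(lifetime_kb, throughput_mbps):
    """Time for a tunnel to carry `lifetime_kb` kilobytes at a steady throughput."""
    total_bits = lifetime_kb * 1024 * 8              # assumption: 1 kilobyte = 1024 bytes
    return total_bits / (throughput_mbps * 1_000_000)


DEFAULT_LIFETIME_KB = 4_608_000                      # assumed ASA default; verify on your device

for mbps in (10, 100):                               # hypothetical steady loads
    minutes = seconds_until_data_rekey(DEFAULT_LIFETIME_KB, mbps) / 60
    print(f"{mbps} Mbps: about {minutes:.0f} minutes")
```

At these assumed loads the data limit would be reached in roughly an hour at 10 Mbps and in a few minutes at 100 Mbps, far sooner than a time-based lifetime of several hours.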

 

T-800

Hi T-800,

This issue is NOT related to the issue with ASA and data-usage lifetime. This is a separate issue with VPNs ceasing to pass traffic on multiple 3rd-party firewall brands which have no data-limit expiration.

Update: I ran all day today. Meraki Support did not turn off NAT-T; it is still on. No drops, then we had a blip at 5pm, and now we have been down from 6pm to 9pm with no explanation.

 

akan33
Building a reputation

Maybe it is related to the anti-replay window size, as per the comments above. If that's the fix, what would be shocking to me is that I have had my ticket open for months and no engineer has been able to provide any information, and that the 'fix' comes so late. In any case, the damage is done.

Update: we have verified for sure that NAT-T is removed on both sides.

 

Thanks,
Scott

 

akan33
Building a reputation

from the logs,  I can see this when failing from the ASA:

 

where x.x.x.x is the Meraki remote public IP.

 

 [IKEv1]Group = x.x.x.x, IP = x.x.x.x, QM FSM error (P2 struct &0x00007ffe60d39e40, mess id 0xb107883c)!

 [IKEv1]Group = x.x.x.x, IP = x.x.x.x., Removing peer from correlator table failed, no match!

 [IKEv1]Group = x.x.x.x, IP = x.x.x.x, Session is being torn down. Reason: Phase 2 Mismatch
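If you collect many of these messages, a small script can tally teardown reasons per peer. The Python sketch below is based only on the three sample lines quoted above; the regular expression and the `teardown_reasons` helper are illustrative assumptions, and other ASA messages or software versions may be formatted differently.

```python
import re
from collections import Counter

# Pattern derived from the sample lines in this post, not from an ASA log specification.
TEARDOWN = re.compile(
    r"IP = (?P<peer>[^,]+), Session is being torn down\. Reason: (?P<reason>.+)$"
)


def teardown_reasons(lines):
    """Count (peer, reason) pairs from 'Session is being torn down' messages."""
    counts = Counter()
    for line in lines:
        match = TEARDOWN.search(line.strip())
        if match:
            counts[(match["peer"].rstrip("."), match["reason"])] += 1
    return counts


sample = [
    " [IKEv1]Group = x.x.x.x, IP = x.x.x.x, QM FSM error (P2 struct &0x00007ffe60d39e40, mess id 0xb107883c)!",
    " [IKEv1]Group = x.x.x.x, IP = x.x.x.x., Removing peer from correlator table failed, no match!",
    " [IKEv1]Group = x.x.x.x, IP = x.x.x.x, Session is being torn down. Reason: Phase 2 Mismatch",
]

print(teardown_reasons(sample))
```

Run against the sample, only the third line matches, giving one "Phase 2 Mismatch" entry for the peer.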

 

Also, from the debugging, it looks like there's a crypto ACL mismatch, but the ACL that shows the log is actually properly configured in both sides, mirrored. Again, when clearing the tunnel everything starts working fine again. 

 

Cisco is pointing to Meraki, but there is no answer from them.

 

 

So far, with NAT-T turned off on both sides, we have been up 7 days with no events.

akan33
Building a reputation

are you NATting in the firewall? I have everything behind NAT, so I wouldn't understand the point of disabling NAT-T as I need to encapsulate in UDP to work with PAT 😕

Yes, we are NATted completely behind the firewall. My understanding is that the NAT-T setting only affects this site-to-site VPN, whose public side is all real IPs. It's not a global setting, so someone inside my network trying to connect to a VPN still can.

So far we have been up since last Wednesday with no events. The WatchGuard side just added the secondary endpoint for ISP2 back in, and they had to turn on dead peer detection, so now the clock is reset.

Skipdog
Conversationalist

I just wanted to chime in here with a "me too"

 

Meraki end: MX84, version 14.40

Cisco end: ASA 5585, version 9.8.4(10)

 

For two weeks I've been having to re-type the PSK on both the Meraki and Cisco ASA sides to get the tunnel to come back up. My settings were:

 

IKE Policy: AES 256, SHA1, 86400 lifetime

IPSEC Proposal: AES 256, SHA1, 86400 lifetime

 

Problem:  Tunnel drops right at 18 hours.

 

I had Meraki turn off NAT-T -- this did not fix the issue.

 

I then made the following changes:

 

IKE: 3DES, SHA1, lifetime 3600

IPSEC: 3DES, SHA1, lifetime 3600

NAT-T turned off still

 

The tunnel has been up for 20+ hours with no drop. I'm assuming it's the lifetime values and not the IKE/IPsec proposals. At any rate, after struggling with this for weeks, I'm happy it seems to be working better now.

 

Skip

 

 

Any updates on this? 

 

Did this resolve the issue permanently?

My issues were indeed solved by setting lifetime values to 3600.  Meraki MX to ASA 5585.

 

Skip

Did you have to disable NAT-T on the Meraki as well?

 

I am having the same problem with MX84 and ASA5525. Had the drops for approximately a week now, they happen at random... I see a drop between 5-8 hours daily. We have other Meraki MX products connecting to the same ASA without any issues. I don't know why this particular tunnel does not like the same settings.

 

I told my support tech to disable the NAT-T on the MX84, as we have a no-NAT rule on the ASA. If this does not resolve the problem, I will move to decrease the lifetime of Phase 1 and 2.

 

Then there is a firmware upgrade on the MX. Apparently I am on very old firmware, 13.36, and I was advised to upgrade to 14.40, I believe, or whatever the latest stable version is now.

 

I'll keep you guys posted.

akan33
Building a reputation

I tried all possible options a long time ago, and I had a Cisco ASA specialist and Meraki "working" on a case for 6 months. We made some small improvements thanks to the Cisco engineer, who was the only one there with enough knowledge. We finally gave up and removed the MX to go back to ASA-to-ASA. Good luck.

I did not disable NAT-T.

I just removed the unused algorithms. AES128, MD5, etc.

The only things left are AES256 & SHA1.

Timers are 28800.

No PFS.

 

Also, the firmware on the MX is 14.40.  Not sure what the ASA is though.

 

So far it has been up and stable since my last post.

 

I hope this helps someone.

 



Did anyone ever solve this issue permanently? We're having the exact same issue between a Palo Alto cloud firewall and Meraki Z3s at multiple sites. At the rekey step the tunnel stays online, but network traffic doesn't pass through it. We have an IKEv2 tunnel, by the way.

In the log on the Meraki side, you can see that two steps happen repeatedly on a working tunnel:

 

inbound CHILD_SA

outbound CHILD_SA

 

At the time the error occurs, the outbound step is missing.

 

We have a NAT scenario on all sites where the Z3s are installed. Public static address with common router. Z3 is connected to this router.

 

Config

 

On Palo side

 

IPSec Crypto profile

IPSec Protocol: ESP
DH group: 2
Lifetime: 1h
Encryption: aes-256-gcm/cbc
Authentication: sha256

IKE Crypto profile

DH Group: group2
Encryption: aes-256-cbc
Authentication: sha256
Key Lifetime: 8h
IKEv2 Authentication Multiple: 5

 

On Meraki side

 

Phase1

Encryption: AES 256
Authentication: SHA256
Pseudo-random Function: Defaults to Authentication
Diffie-Hellman group: 2
Lifetime (sec): 28800

Phase2

Encryption: AES 256
Authentication: SHA256
PFS group: 2
Lifetime (sec): 3600

 

Palo Alto IKE GW Options

Passive mode: Enabled
NAT-T: Enabled

Advanced Option

Strict Cookie Validation: turned off
Liveness Check Interval (sec): 5
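
As a quick sanity check on the values above, here is a rough sketch that converts the lifetimes to seconds and compares timers, DH/PFS groups, and hashes between the two sides. The dictionary layout and helper names are illustrative assumptions, not a vendor API; the values are transcribed from the settings listed in this post. Encryption is deliberately left out because the two dashboards name ciphers differently.

```python
def to_seconds(lifetime: str) -> int:
    """Convert '8h', '60m' or a plain seconds string such as '28800' to seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    lifetime = lifetime.strip().lower()
    if lifetime[-1] in units:
        return int(lifetime[:-1]) * units[lifetime[-1]]
    return int(lifetime)


def normalise(settings: dict) -> dict:
    """Bring one phase's settings into a comparable form."""
    return {
        "auth": settings["auth"].lower().replace(" ", ""),
        "dh_group": int(settings["dh_group"]),
        "lifetime_s": to_seconds(settings["lifetime"]),
    }


def compare(a: dict, b: dict) -> list:
    """Return (phase, key, a_value, b_value) for every setting that differs."""
    diffs = []
    for phase in a:
        na, nb = normalise(a[phase]), normalise(b[phase])
        for key in na:
            if na[key] != nb[key]:
                diffs.append((phase, key, na[key], nb[key]))
    return diffs


# Values as posted above (encryption intentionally not compared).
palo = {
    "phase1": {"auth": "sha 256", "dh_group": 2, "lifetime": "8h"},
    "phase2": {"auth": "sha256", "dh_group": 2, "lifetime": "1h"},
}
meraki = {
    "phase1": {"auth": "SHA256", "dh_group": 2, "lifetime": "28800"},
    "phase2": {"auth": "SHA256", "dh_group": 2, "lifetime": "3600"},
}
print(compare(palo, meraki))  # []
```

With the values as posted the list comes back empty, so timers, groups, and hashes line up. That leaves the Phase 2 encryption entry (aes-256-gcm/cbc on the Palo side, AES 256 on the Meraki side) as the one item to verify by hand.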

Hey @blind3d ,

 

Since you mentioned you are running an IKEv2 tunnel, I wonder if you may need to take a note of this:
https://documentation.meraki.com/MX/Site-to-site_VPN/Site-to-Site_VPN_Settings#NOTE_For_IKEv2

 

We've observed that certain vendors are not fully compliant with the IKEv2 RFC, specifically the section that concerns traffic selectors.

 

It is definitely something worth querying with Palo Alto as well, to ensure their firewall is respecting the traffic selectors received.

 

Hope this helps!

 

Giac

Please keep in mind that what I post here is my personal knowledge and opinion. Don't take anything I say for the Holy Grail, but try and see!
Appreciate who helps and be respectful of every opinion and every solution offered.
Share the love, especially the Meraki one!
blind3d
Comes here often

@GiacomoS Thanks for sharing this.

 

The Palo Alto has one virtual router where all static routes match the desired subnet, e.g. Tunnel1 with 10.20.30.40/24 routed.

On the Meraki side, there is one global setting for the VPN tunnel:

 

Name "doesnt matter"

IKE Version  "IKEv2"

Policies "see post above"

Public IP "Palo's IP"

Local ID "address reservation from the provider's router"; same for every site; e.g. 192.168.1.1

Remote ID "empty"

Private Subnets 0.0.0.0/0 - as we want a full-tunnel

PSK "Key"

Availability "All networks"

 

On the Palo side

Each IKE GW identifies with

 

Peer address "VPN site's public IP"

Local ID "Palo's IP"

Peer ID "192.168.1.1"

PSK "Key"

Each IPSec GW has a proxy ID named "whatever" with Local ID 0.0.0.0/0 and Peer Address, e.g. 10.20.30.40/24 (LAN behind Z3), Protocol any

 

If any of these settings didn't match, I would assume the tunnel wouldn't be established at all, but what we see are "just" random drops of the network traffic, not of the tunnel itself.
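
For the traffic selector question, a small sketch using Python's standard `ipaddress` module to check whether one selector sits inside another. The function name is made up for illustration, and the addresses are the example values from this post. Note that 10.20.30.40/24 has host bits set, so it is normalised to its network address first.

```python
import ipaddress


def covers(configured: str, proposed: str) -> bool:
    """True if the proposed selector lies entirely within the configured one."""
    conf_net = ipaddress.ip_network(configured, strict=False)
    prop_net = ipaddress.ip_network(proposed, strict=False)
    return prop_net.subnet_of(conf_net)


# strict=False masks the host bits instead of raising ValueError.
lan = ipaddress.ip_network("10.20.30.40/24", strict=False)
print(lan)  # 10.20.30.0/24

# The full-tunnel selector contains the LAN behind the Z3, but not the reverse.
print(covers("0.0.0.0/0", "10.20.30.40/24"))  # True
print(covers("10.20.30.0/24", "0.0.0.0/0"))   # False
```

This only checks containment on paper; whether each firewall actually accepts or narrows the selectors it receives during a rekey is something the logs on both ends would have to show.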

Hey @blind3d ,
No worries 🙂 

 

So, I'm not sure if it's a naming convention problem, but there were a couple of things that nagged me on the config you sent me. 

 

On the Meraki side, you have a remote ID set to "empty", yet on the Palo Alto's side, your local ID is set to the Palo Alto's IP.

I would normally recommend using the ID fields when one of the two sides is behind NAT; on the side where you have NAT you'd want to put the private IP on your uplink interface as your local ID, and you'll need to match it on the remote side with the Remote ID field. 
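
To make the ID pairing concrete, here is a rough sketch that cross-checks the two sides. The dictionary keys and the function are hypothetical, and 203.0.113.10 is a documentation-range placeholder standing in for the Palo Alto's public IP. An empty Remote ID may simply fall back to a default on the device, so the output is a list of items to review, not confirmed errors.

```python
def check_ike_ids(nat_side: dict, peer_side: dict) -> list:
    """Each side's local ID should equal the ID the other side expects from it."""
    to_review = []
    if nat_side["local_id"] != peer_side["peer_id"]:
        to_review.append(
            f"NAT side local ID {nat_side['local_id']!r} != peer ID {peer_side['peer_id']!r}"
        )
    if nat_side["remote_id"] != peer_side["local_id"]:
        to_review.append(
            f"NAT side remote ID {nat_side['remote_id']!r} != peer local ID {peer_side['local_id']!r}"
        )
    return to_review


# Values from the posted config; the Palo IP is a placeholder.
meraki = {"local_id": "192.168.1.1", "remote_id": ""}
palo = {"local_id": "203.0.113.10", "peer_id": "192.168.1.1"}

for item in check_ike_ids(meraki, palo):
    print(item)
```

With these values only the second check is reported: the Meraki Local ID matches the Palo Peer ID, while the empty Remote ID does not mirror the Palo Local ID.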

 

The other thing that I'm a bit confused about is the "IPSec GW" configuration on the Palo Alto side. Is that essentially a subnet pair? Do you happen to have a screenshot of that section (even a mock one)?

 

@CCL_CO , in the majority of instances it seems to be a mismatch in either phase 2 configuration or the establishment of the SAs (possibly because of the KB I mentioned above), so there's no one-size fits all solution yet. Could you please share some more details around your configuration (please replace the IP addresses with fake ones!)? 

 

Many thanks!

 

Giac


Hi @GiacomoS, were you able to solve the VPN problem with the Palo Alto?

I have the same problem, from an MX68W to a Palo Alto PA-3260. The VPN establishes on both ends and passes traffic for 15 minutes; after that the tunnel has to be restarted from the Palo Alto, and then it works again. Perhaps you have found a solution and could share it with me. Thanks in advance.

Hi @blind3d, were you able to solve the VPN problem with the Palo Alto?

I have the same problem, from an MX68W to a Palo Alto PA-3260. The VPN establishes on both ends and passes traffic for 15 minutes; after that the tunnel has to be restarted from the Palo Alto, and then it works again. Perhaps you have found a solution and could share it with me. Thanks in advance.
