Traffic not getting initiated from IKEv1 and IKEv2 for non-meraki tunnel for few selectors

Solved
Tishman
Here to help

Hi Experts

 

I created a site-to-site tunnel to a non-Meraki device (FTD) using IKEv1. The tunnel comes up, but for a few traffic selectors traffic is not initiated from the Meraki side; it works when initiated from the FTD.

 

MX version 18.211.2

 

Does anyone have a fix? The same thing is happening with IKEv2 when using an FQDN.

 

1 Accepted Solution
RaphaelL
Kind of a big deal

It seems somewhat random when one of the traffic selectors will fail.

But we can bring the specific failing selector up with this line in packet-tracer from the ASA side:

packet-tracer input inside tcp [INSIDE SOURCE IP] [DESTINATION IP] 443 detail

 

 

Exact same issue that we are experiencing. Nothing useful from Support so far...

32 Replies
RaphaelL
Kind of a big deal

Hi 

 

Pretty sure I have the exact same issue that you have.

 

Rebooting the MX or bouncing the tunnel seems to work for a couple of hours, then it stops working again.

 

Knowguy
Getting noticed

I have been fighting this exact same issue between an MX (both 18.107.10 and 18.211.2) and ASA (running latest suggested code, 9.18 interim).

 

Support has suggested trying the 18.211.3 patch to see if it fixes the issue, while also saying they don't believe it will.

 

I have changed just about everything I can on the ASA side, including:

  • Disabling keepalives and NAT-T
  • Adding 'route-lookup' to the end of the NAT exemption for the tunnel traffic
  • Adding and subsequently removing the IKEv1 command 'reverse-route'
  • Reordering the ACL on the ASA to match the order of private subnets on the Meraki

It seems somewhat random when one of the traffic selectors will fail.

But we can bring the specific failing selector up with this line in packet-tracer from the ASA side:

packet-tracer input inside tcp [INSIDE SOURCE IP] [DESTINATION IP] 443 detail

I have had two calls between TAC and Meraki Support and no solutions have been found.

Meraki Support's last action was to disable multi-core support on the backend for the MX67 in my case, since it has been known to cause issues on the lower-end devices.

I thought this was going to fix the issue. It was stable for nearly 5 days and then began to happen again... I think it was simply the reboot, which was necessary for the backend change, that fixed it.

I am actively trying to get this case escalated on both support sides but having no luck. I went straight to the Cisco rep too... nothing as of yet, but I will let everyone know if I fix this.

I feel like it may have something to do with DPD / keepalives. At least one Meraki support engineer said that on their end these would be 'interval 10 retry 3', and I believe the default on the ASA may be 'interval 10 retry 2'... But there is no telling what it is at this point.

Knowguy
Getting noticed

Looks like support actually answered a similar post from Tishman here about upgrading to the patch I mentioned, 18.211.3, but again there is nothing in the release notes to indicate a fix for what we are seeing:

 

https://community.meraki.com/t5/Cloud-Security-SD-WAN-vMX/traffic-not-getting-initiated-from-IKEv1-f...

RaphaelL
Kind of a big deal

18.211.3 hasn't fixed anything related to NMVPN for us. 

 

On a side note, I'm really tired of the premade excuse: upgrade to the latest code and pray to God that your bugs are no longer present.

Knowguy
Getting noticed

Thank you for confirming that and sparing me another false-hope fix! Hopefully we get some visibility from this discussion, so please keep checking in if you learn anything new.

 

I'm not excited, because I manage a few very similar environments that have so far avoided this.

That doesn't help with troubleshooting by comparison, though: I have already tested and reviewed all of the configuration differences against the other setups, to no avail.

Tishman
Here to help

Agreed, but that is just a workaround that we have to repeat each time. Meraki should fix their issue.

Knowguy
Getting noticed

So again, first thing this morning, one of our traffic selectors to a single host IP would not come up when initiated from the MX side.

I found this post about Zscaler to MX, where they discuss recommended settings for site-to-site VPN on the Meraki side:

Follow these recommendations:

  • Security & SD-WAN -> Configure: Site-to-site VPN -> Non Meraki VPN settings:
  • Preshared secret must be greater than 14 characters
  • Authentication cannot be MD5
  • Diffie-Hellman Group must be 14
  • Phase 2 encryption cannot be NULL
  • PFS can be configured to be either off or 14
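
For anyone who wants to sanity-check a peer config against the recommendations quoted above, here is a minimal Python sketch. The class and field names are my own invention for illustration (they are not Meraki dashboard or API names), and the checks simply restate the five bullets; nothing here talks to a real device.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NonMerakiPeerSettings:
    """Hypothetical container for the settings named in the recommendations."""
    preshared_secret: str
    authentication: str          # e.g. "SHA1", "SHA256", "MD5"
    dh_group: int                # Diffie-Hellman group number
    phase2_encryption: str       # e.g. "AES256", "NULL"
    pfs_group: Optional[int]     # None means PFS is off


def check_recommendations(s: NonMerakiPeerSettings) -> List[str]:
    """Return the list of recommendations that the settings violate."""
    problems = []
    if len(s.preshared_secret) <= 14:
        problems.append("preshared secret must be greater than 14 characters")
    if s.authentication.upper() == "MD5":
        problems.append("authentication cannot be MD5")
    if s.dh_group != 14:
        problems.append("Diffie-Hellman group must be 14")
    if s.phase2_encryption.upper() == "NULL":
        problems.append("phase 2 encryption cannot be NULL")
    if s.pfs_group not in (None, 14):
        problems.append("PFS must be off or group 14")
    return problems


# Example values are made up for illustration.
example = NonMerakiPeerSettings("short-secret", "SHA256", 14, "AES256", None)
for problem in check_recommendations(example):
    print(problem)
```

An empty result only means the settings match the quoted list; as the replies in this thread show, that alone did not stop selectors from failing.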

 

The only thing we may be doing wrong is the pre-shared key: we thought it needed to be under 14 characters, probably from another forum post somewhere.

 

I am not sure why these are the recommendations or where they come from, other than others' experiences.

I have scanned through the forum and have now seen numerous posts about issues similar to the ones we are discussing here:

https://community.meraki.com/t5/Security-SD-WAN/NON-MERAKI-Site-To-Site-VPN-network-translation-v18-...


https://community.meraki.com/t5/Security-SD-WAN/Non-Meraki-VPN-IKEv2-issues/td-p/245734/


https://community.meraki.com/t5/Security-SD-WAN/MX-to-Cisco-FTD-Site-to-Site-Using-IKEv2/td-p/245108


https://community.meraki.com/t5/Security-SD-WAN/Cannot-establish-VPN-to-non-Meraki-peer-Firepower/td...


https://community.meraki.com/t5/Cloud-Security-SD-WAN-vMX/traffic-not-getting-initiated-from-IKEv1-f...


https://community.meraki.com/t5/Security-SD-WAN/non-Meraki-VPN-peer-is-not-establishing-with-zScaler...

RaphaelL
Kind of a big deal

We are already following these recommendations, and still this morning one traffic selector is not working.

Knowguy
Getting noticed

So as I continue to watch the tunnel, I see the "VPN Registry: Partially connected" warning. I found another forum post here detailing what this means:

https://community.meraki.com/t5/Security-SD-WAN/VPN-Registry-Partially-connected-What-does-this-mean...

 

It appears that you can find which ports your registry is using with the following information, which also mentions Meraki Support changing the registry values to fix issues:

 

From JosRus | Meraki Employee

 

I would like to add some additional information to this:

When support initiates a change to your registry contact points, a migration period will occur, within which changes to your registry contact points cannot be performed. Any additional changes to these will require an additional waiting period while the migration finishes.

An additional port of 9351 has been added, which you will see is also listed under Help>Firewall Info>VPN registry. Upon issuing a registry IP change from our side, you will see the addresses on this page update automatically, so be sure to check this page after any registry IP change is made from the Meraki Support side, and update your upstream firewall/device rules with the new information accordingly.

 

 

RaphaelL
Kind of a big deal

Good point, but I think this only applies to AutoVPN. It shouldn't affect NMVPN.

Knowguy
Getting noticed

I thought I would also throw out this other mention of a bug that is not documented anywhere externally. Support is claiming there was a known issue with IKEv1 dropping random SAs:

Case 12086900
Tunnels should not have taken that long to reform. Could be running into a known issue with IKEv1 connections on 18.107.10+ and 18.209+ firmware where we can't reform specific child sa's.

Has anyone tested this theory by possibly downgrading to any of the following?

  • 18.107.8
  • 18.170.9
  • 18.208
  • 18.208.01

 

All of these were considered "Stable" versions at some point.
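
To make the version ranges in that case note easier to compare against a fleet, here is a rough Python sketch. It rests on my own reading of "18.107.10+ and 18.209+": within the 18.107 train, patch 10 or later; otherwise 18.209 or later within major version 18. Support never spelled the ranges out, so treat that interpretation, and the function names, as assumptions.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '18.211.2' into (18, 211, 2)."""
    return tuple(int(part) for part in text.split("."))


def possibly_affected(version: str) -> Optional[bool]:
    """Apply the assumed reading of '18.107.10+ and 18.209+'.

    Returns None for anything outside MX 18.x, because the case note
    says nothing about other major versions.
    """
    parts = parse_version(version)
    major, minor = parts[0], parts[1]
    patch = parts[2] if len(parts) > 2 else 0
    if major != 18:
        return None
    if minor == 107:
        return patch >= 10
    return minor >= 209


for candidate in ["18.107.8", "18.107.10", "18.208", "18.208.01", "18.211.2"]:
    print(candidate, possibly_affected(candidate))
```

Under that reading, the downgrade candidates listed above fall outside the ranges, while 18.107.10 and 18.211.2 fall inside them.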

RaphaelL
Kind of a big deal

Is it specific to IKEv1? I could ask Support to pin us to 18.208.

Knowguy
Getting noticed

That's what support said in my case, but they never actually confirmed or denied it. Feel free to reference my ticket number. They were unsure whether the lower-end models (an MX67 in my case) were stable on 18.208 or 18.107.

Knowguy
Getting noticed

So first thing this morning, all of the traffic selectors had survived the night and are still up and incrementing encaps/decaps.
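
If you want to automate the "are encaps/decaps still incrementing" check, a small Python sketch like the one below may help. It assumes you have already collected the per-selector counters yourself into two snapshots; the dictionary layout, function name, and counter values are all made up for illustration, and it does not parse any device output.

```python
from typing import Dict, List, Tuple

# Maps a selector label to its (encaps, decaps) packet counters.
Snapshot = Dict[str, Tuple[int, int]]


def stalled_selectors(before: Snapshot, after: Snapshot) -> List[str]:
    """Return selectors whose encaps or decaps did not increase.

    A selector missing from the later snapshot is reported as stalled.
    Counters that reset (for example after a rekey) will also be
    reported, so a hit is a prompt to look closer, not proof of failure.
    """
    stalled = []
    for selector, (enc_before, dec_before) in before.items():
        if selector not in after:
            stalled.append(selector)
            continue
        enc_after, dec_after = after[selector]
        if enc_after <= enc_before or dec_after <= dec_before:
            stalled.append(selector)
    return stalled


# Illustrative counter values only.
earlier = {"172.31.10.0/23 === 10.100.100.0/24": (100, 90),
           "172.31.10.0/23 === 10.9.102.0/24": (50, 40)}
later = {"172.31.10.0/23 === 10.100.100.0/24": (250, 210),
         "172.31.10.0/23 === 10.9.102.0/24": (50, 40)}
print(stalled_selectors(earlier, later))
```

Run against snapshots taken some minutes apart, while traffic is known to be flowing, this flags the selectors worth checking by hand.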

Knowguy
Getting noticed

Any way we can mark this as not solved? We confirmed how to fix this manually, but we need it to be properly addressed by Meraki.

RaphaelL
Kind of a big deal

What is the manual fix?

RaphaelL
Kind of a big deal

Just had a troubleshooting session with Support; they mentioned other customers with the same issue. They suggested going to MX 19.1.3, which would contain an undocumented fix for that issue. I will comment on whether it works or not. Stay tuned.

Knowguy
Getting noticed

You are very brave for doing this, but thank you. I will be watching. As for the manual fix above, I am just talking about being able to bring up the non-working traffic selectors using packet-tracer from the ASA side in my case.

RaphaelL
Kind of a big deal

So far so good ! 

Knowguy
Getting noticed

Give it 3 or 4 days. When you have to reboot an MX, it seems to stabilize it for some time. When support initially rebooted the MX pair we were troubleshooting, in order to disable some backend multi-core support, it was stable for about 4 days and then started to repeat the same behavior again.

Knowguy
Getting noticed

I wonder whether yesterday's outage, and the fixes applied after the reboot for MXs running 18.211.x, fixed all of the IPsec issues, and whether everything would now work if I rolled back to 18.211.2.

https://community.meraki.com/t5/Security-SD-WAN/Issues-with-AutoVPN-incident/td-p/248249/jump-to/fir...

We are hesitantly about to purchase another MX to fix the issue, but we really don't want to do this if we don't have to.

Knowguy
Getting noticed

After 3 days of our own MX being on 19.1.3, one of the traffic selectors went down again and could not be re-initiated from the MX side. 😡

RaphaelL
Kind of a big deal

Sad news to hear. I'm still testing. Up to now everything is fine, but it may just take more time for a selector to go down. If that's the case, I will call Support again and investigate more.

Knowguy
Getting noticed

If yours does stay up, I need to ask what you are using for Phase 1 and Phase 2 settings on both sides.

Knowguy
Getting noticed

Very sad news indeed: the same issues have cropped up again. I swear it looks like the IKEv1 tunnel just can't handle more than 2 or 3 traffic selectors. In this particular instance there are a ton. Keep in mind these are all randomly generated addresses, but they will help with the example:

MX Private Networks:
172.31.10.0/23
172.31.20.0/29


ASA Private Networks
10.100.100.0/24
10.9.102.0/24
123.88.139.8/32
123.88.139.19/32
123.88.139.25/32
123.88.139.92/32
148.66.118.201/32
148.66.116.78/32
148.66.118.202/32
148.66.116.79/32
148.66.99.74/32
10.5.131.36/32
10.5.131.37/32
172.22.26.0/24

The only ASA private networks we really care about, and that are up most of the time, are 10.100.100.0/24, 10.9.102.0/24, and 123.88.139.8/32. This effectively creates 6 (2 x 3) traffic selectors between the two devices.

TS - 172.31.10.0/23 === 10.100.100.0/24
TS - 172.31.10.0/23 === 10.9.102.0/24

TS - 172.31.20.0/29 === 10.100.100.0/24
TS - 172.31.20.0/29 === 10.9.102.0/24

TS - 172.31.10.0/23 === 123.88.139.8/32

TS - 172.31.20.0/29 === 123.88.139.8/32

It looks like once it goes over 3, the next time a selector is re-keyed... it's a no-go.
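
The selector count in the post above is just the cross product of the two subnet lists. Here is a small Python sketch of that arithmetic, assuming one selector pair per local/remote subnet combination as described; the function name is mine, and the addresses are the example ones from the post.

```python
from ipaddress import ip_network
from itertools import product

mx_networks = ["172.31.10.0/23", "172.31.20.0/29"]
asa_networks = ["10.100.100.0/24", "10.9.102.0/24", "123.88.139.8/32"]


def traffic_selectors(local, remote):
    """Pair every local subnet with every remote subnet."""
    return [(ip_network(l), ip_network(r)) for l, r in product(local, remote)]


selectors = traffic_selectors(mx_networks, asa_networks)
for local_net, remote_net in selectors:
    print(f"TS - {local_net} === {remote_net}")
print(f"{len(selectors)} selectors in total")
```

With all 14 ASA networks from the full list configured, the same calculation gives 2 x 14 = 28 possible selectors, which is a long way past the 2 or 3 that seem to survive a re-key.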

Tishman
Here to help

Is there any way to decrypt the pcap files captured for VPN traffic on the Meraki? That would give a clearer picture of this issue.

CJHarms
Here to help

Any news regarding this? We seem to be having the same issue with an IPsec IKEv2 tunnel to a FortiGate peer on MX 18.211.3. We have a case open (12167713) but no solution yet.

 

We need to remove and re-add the Network Tag for the specific VPN Peer to get the Tunnel working again as a Workaround for now.

Knowguy
Getting noticed

No news. I literally just had to bounce the Meraki side by making a change to the site-to-site side of the tunnel, which then brought all traffic selectors back up.

I can do a similar thing to the FortiGate on our ASA side and bring the tunnel up as well.

How many remote networks do you have on each side of your tunnel? I feel like we had some success in reducing the number of networks configured on each side.

CJHarms
Here to help

For us this is only related to one (of three total) IPsec IKEv2 Peers with a Single Network but this seems to have started with the latest MX 18.211.3 Firmware.

 

We tried running a Ping via our Monitoring System to the Remote Network to keep the IPSec Tunnel up - but no luck so far. After 1 or 2 Days max the Phase 2 part of the Tunnel always crashes and no Traffic is being forwarded.

Knowguy
Getting noticed

So it doesn't seem to matter. I know there are definitely issues with IKEv2; I would try IKEv1 if possible, but even that is broken, it seems.

We have thrown everything at this problem. We have given up and are about to add an MX to the other side to hopefully resolve this.

CJHarms
Here to help

Yeah, we might try to downgrade to IKEv1 and hope that this fixes it. Unfortunately the other side is a partner and not under our control, or we would already have deployed an MX.

 

Let's see if Support comes up with a fix or workaround.
