Hi and Good day to All,
Just wondering if anyone has encountered the same site-to-site VPN issue that I am currently having. I have two MXs (in different geographic locations), and everything worked as usual. But since Thursday, I can't connect to the local network at the other end, and vice versa. Looking at the VPN status, everything seems to be working OK (i.e. VPN Registry Connected, WAN appliance has publicly accessible IPs, Encrypted using IPsec and AES) with no errors of any sort. I can also ping each side's public IP address from the other MX. Even Meraki Support has confirmed that I have an established connection and that the issue could be:
a.) ISP, or
b.) something blocking the subnet that is supposed to be advertised.
The situation is quite peculiar because at times it functioned properly, but then unexpectedly failed (for instance, last night the VPN connected, and this morning it disconnected again).
Notable Information:
I've noticed that some of the local VLANs on both sides have the same VLAN ID and IP block. However, those VLANs are not advertised to the site-to-site VPN. Could this subnet overlap be affecting our connection stability? Also, I am using IKEv1; would it make a difference if I switched to IKEv2?
Your feedback and suggestions are highly appreciated. Thank you All.
Hi,
So first of all, it sounds like you are using a Non-Meraki VPN instead of Meraki Auto VPN. Can you confirm that?
Having identical VLAN IDs/IP subnets on both ends is not a problem as long as you don't route them into the VPN tunnel.
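To illustrate the point about overlapping subnets, here is a minimal Python sketch that compares only the subnets each side advertises into the tunnel. The inventories are hypothetical: 192.168.1.0/24 and 192.168.0.0/24 are the traffic selectors that appear in the event logs later in this thread, while 10.0.10.0/24 is an assumed stand-in for the duplicated, non-advertised VLAN. The function name `advertised_overlaps` is made up for this example.

```python
import ipaddress
from itertools import product


def advertised_overlaps(site_a, site_b):
    """Return pairs of VPN-advertised subnets that overlap between two sites.

    Each site is a list of (cidr, in_vpn) tuples. Only entries with
    in_vpn=True are compared, because subnets kept out of the tunnel
    cannot conflict inside it.
    """
    nets_a = [ipaddress.ip_network(cidr) for cidr, in_vpn in site_a if in_vpn]
    nets_b = [ipaddress.ip_network(cidr) for cidr, in_vpn in site_b if in_vpn]
    return [(str(a), str(b)) for a, b in product(nets_a, nets_b) if a.overlaps(b)]


# Hypothetical inventories: 10.0.10.0/24 exists on both sides but is not advertised.
site_a = [("192.168.1.0/24", True), ("10.0.10.0/24", False)]
site_b = [("192.168.0.0/24", True), ("10.0.10.0/24", False)]

print(advertised_overlaps(site_a, site_b))  # prints []
```

An empty result means none of the advertised subnets collide; if the duplicated VLAN were ever enabled for VPN on both sides, it would show up in the returned list.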
Can you provide any event log messages?
Have you done a Packet Capture of your WAN Interfaces for the IKE and/or IPSec Negotiation?
You could also do a packet capture for the same connection, filtering for fragmentation, since this could cause issues for your IPsec connection as well.
IKEv2 is the more efficient, secure and reliable protocol and you should prefer it over IKEv1.
Regards,
Fabian
Hi Mr. Fabian,
Thank you for taking the time to respond to my inquiry. I’ve come to realize that you may be correct regarding our use of a non-Meraki VPN. Upon examining the packet capture, I noticed that the destination device is a Juniper machine (if I'm not mistaken). But both sites employ MX devices (specifically, an MX100 and an MX85). There may be an uplink device connected to the MX at the other end of the tunnel.
This is the MX's event log from my side. Forgive me, as I can't disclose the public IPs (blue=my public IP, black=other end's public IP), nor am I an expert in this field. As you can see, there's an established connection.
Below is the screenshot from the MX's packet capture (pcap file). I observed a missing sequence number in the ESP (Encapsulating Security Payload) packets. I’m uncertain whether this could be the cause of the issue.
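For anyone who wants to check this beyond eyeballing a screenshot, here is a minimal sketch of how skipped ESP sequence numbers could be found. It relies only on the ESP header layout (a 4-byte SPI followed by a 4-byte sequence number); the function names and the sample packets are assumptions made up for illustration, with the SPI borrowed from the event logs later in this thread. Note that a gap in a capture can also mean the capture itself missed packets, so it is a hint rather than proof of loss on the path.

```python
import struct
from collections import defaultdict


def esp_header(payload: bytes):
    """Return (spi, sequence_number) from the first 8 bytes of an ESP packet."""
    if len(payload) < 8:
        raise ValueError("ESP header needs at least 8 bytes")
    return struct.unpack("!II", payload[:8])


def sequence_gaps(payloads):
    """Map each SPI to a list of (last_seen, next_seen) pairs where numbers were skipped."""
    last = {}
    gaps = defaultdict(list)
    for payload in payloads:
        spi, seq = esp_header(payload)
        if spi in last and seq > last[spi] + 1:
            gaps[spi].append((last[spi], seq))
        # Track the highest sequence number seen so far for this SPI.
        last[spi] = max(seq, last.get(spi, 0))
    return dict(gaps)


# Synthetic packets for one SPI with sequence numbers 1, 2 and 5 (3 and 4 missing).
packets = [struct.pack("!II", 0xCD15BC5C, seq) + b"ciphertext" for seq in (1, 2, 5)]
for spi, skipped in sequence_gaps(packets).items():
    print(f"SPI {spi:08x}: {skipped}")  # prints SPI cd15bc5c: [(2, 5)]
```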
We will consider using IKEv2 once we resolve this issue. Thank you, Fabian.
Regards,
Jam
What does the other end of the VPN report? Do they see Phase 1 and Phase 2 up?
Hi @PhilipDAth and thank you for your reply.
Yes, the other MX's event log shows more or less the same result, although on my MX100 the inbound traffic shows 0 bytes while on the other side it shows 16,260 bytes.
Best regards,
Jam
It looks like your IKE SAs are quickly terminated from your second connection on.
One of the sides (firewalls) could be in a strange state where it doesn't want to keep connections open anymore. It also takes a lot of time to establish a new IKE SA after closing the previous one. It should normally take just a second, not over a minute.
It might be an idea to reboot the devices on both ends during a maintenance window.
Also, please check the lifetime values on both ends (you could try half a day or a full day for the IKE SA) and make sure all your networks match exactly at both ends.
Hi @GIdenJoe ,
I appreciate you taking the time to comment on my inquiry. Yes, it is pretty odd. Even Meraki support is having a hard time troubleshooting the issue, as everything in the VPN status is green/OK. The Non-Meraki event logs are clean, but they don't show the negotiation phases.
MX100
msg: <remote-peer-2|1> CHILD_SA net-2{2} established with SPIs cd15bc5c(inbound) cfcd2475(outbound) and TS 192.168.1.0/24 === 192.168.0.0/24
msg: <remote-peer-2|1> IKE_SA remote-peer-2[1] established between My IP [1xx.2xx.xx.1xx]...Remote Peer's IP[1xx.2xx.xx.1xx]
MX85
msg: <remote-peer-2|3> CHILD_SA net-2{6} established with SPIs cfcd2475(inbound) cd15bc5c(outbound) and TS 192.168.0.0/24 === 192.168.1.0/24
msg: <remote-peer-2|3> IKE_SA remote-peer-2[3] established between Remote Peer's IP[1xx.2xx.xx.1xx]...My IP [1xx.2xx.xx.1xx]
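As a quick sanity check on the four log lines above, the sketch below parses the two CHILD_SA entries and verifies that the SPIs and traffic selectors on one side mirror those on the other, which is what a correctly negotiated tunnel should show. The regular expression is an assumption fitted to the log format quoted in this post, and the function names are hypothetical.

```python
import re

# Pattern fitted to the CHILD_SA lines quoted above (an assumption, not an official format spec).
CHILD_SA = re.compile(
    r"CHILD_SA \S+ established with SPIs "
    r"(?P<spi_in>[0-9a-f]+)\(inbound\) (?P<spi_out>[0-9a-f]+)\(outbound\) "
    r"and TS (?P<local>\S+) === (?P<remote>\S+)"
)


def parse_child_sa(line):
    """Return a dict with spi_in, spi_out, local and remote, or None if the line does not match."""
    match = CHILD_SA.search(line)
    return match.groupdict() if match else None


def is_mirrored(a, b):
    """True if one side's inbound SPI and local subnet are the other side's outbound SPI and remote subnet."""
    return (a["spi_in"] == b["spi_out"] and a["spi_out"] == b["spi_in"]
            and a["local"] == b["remote"] and a["remote"] == b["local"])


mx100 = ("msg: <remote-peer-2|1> CHILD_SA net-2{2} established with SPIs "
         "cd15bc5c(inbound) cfcd2475(outbound) and TS 192.168.1.0/24 === 192.168.0.0/24")
mx85 = ("msg: <remote-peer-2|3> CHILD_SA net-2{6} established with SPIs "
        "cfcd2475(inbound) cd15bc5c(outbound) and TS 192.168.0.0/24 === 192.168.1.0/24")

print(is_mirrored(parse_child_sa(mx100), parse_child_sa(mx85)))  # prints True
```

For these two entries the check passes, which is consistent with both sides agreeing on the same Phase 2 SA and subnets. It says nothing about whether traffic actually flows afterwards.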
In addition, I have changed from IKEv1 to IKEv2. Again, thank you very much.
regards,
Jam
No problem.
The logs can indeed only give you insight into the timing of the establishment and teardown of each tunnel. When obvious errors are present, you will also get log entries for them.
The only thing you can do is run extensive packet captures on the WAN side of your MX while filtering on UDP port 500 or UDP port 4500. You can then see which device does what and when.
However, after the key negotiation every exchange is encrypted, so you are left guessing based on which device initiated or killed something.
I hope they would enhance the logging even for established tunnels so you can actually see what the other side sent.
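To make the port 500 / port 4500 distinction a bit more concrete, here is a minimal sketch of how captured UDP packets could be sorted into IKE, NAT-T IKE, UDP-encapsulated ESP and NAT keepalives. It assumes the standard conventions for UDP 4500 (four zero bytes as the non-ESP marker in front of IKE messages, a single 0xFF byte as a keepalive); the function name and the sample payloads are hypothetical.

```python
def classify_ipsec_udp(src_port: int, dst_port: int, payload: bytes) -> str:
    """Roughly classify a UDP packet seen on the WAN side of a VPN peer."""
    ports = {src_port, dst_port}
    if 4500 in ports:
        if payload == b"\xff":
            return "nat-keepalive"        # single 0xFF byte
        if payload[:4] == b"\x00\x00\x00\x00":
            return "ike-nat-t"            # non-ESP marker, IKE message follows
        return "esp-in-udp"               # first 4 bytes are a non-zero SPI
    if 500 in ports:
        return "ike"                      # plain IKE negotiation
    return "other"


print(classify_ipsec_udp(500, 500, b"\x01" * 28))                    # prints ike
print(classify_ipsec_udp(4500, 4500, b"\x00\x00\x00\x00" + b"\x01"))  # prints ike-nat-t
print(classify_ipsec_udp(4500, 4500, b"\xcd\x15\xbc\x5c" + b"\x00"))  # prints esp-in-udp
print(classify_ipsec_udp(4500, 4500, b"\xff"))                        # prints nat-keepalive
```

Ordering the classified packets by timestamp and source address shows which peer started or tore down an exchange, even though the contents stay encrypted.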
Hello Everyone,
This morning brought a pleasant surprise: the VPN link is now up and running smoothly. I can successfully access the remote peer’s subnet. The odd part? Well, I remain clueless about the root cause of the initial issue. Perhaps it’s related to SIP, but I can’t say for certain.
Thank you all for your valuable inputs and suggestions. If anything changes, I’ll be sure to provide an update. Have a fantastic day!
Best regards,
Jam
Hello,
Just giving you an update: since Tuesday morning, the link has gone down once again. I’ve now reached out to our ISP for assistance, as this issue doesn’t appear to be related to configuration but may be due to peculiar routing on our ISP's side.
I hope they can find what may have caused this issue. By the way, I read in some articles on the Meraki Help pages that this can happen.
I will get back to you if I have any new updates. Thank you.
regards,
Jam