MTU issues after upgrading to MX18+

RaphaelL
Kind of a big deal


Hi,

It seems that Meraki has changed the ESP overhead from 64 bytes to 68 bytes on MX 18 and later firmware.

This may affect IP fragmentation, as shown below.

This capture was taken on MX 15.44.

The client sends at most 1408 bytes of UDP payload + 20 bytes of IP header = 1428 bytes. Add the 64 bytes of encryption overhead and you get a 1492-byte packet, which fits the MTU of almost any normal WAN link (DSL, fiber, and others).

RaphaelL_0-1692119308291.png

( top is 'Internet' capture , bottom is AutoVPN capture ) 

 

With MX18 and later, it is still 1408 bytes of UDP payload + 20 bytes of IP header = 1428 bytes, but adding the new 68 bytes of overhead gives 1496!

DSL links might not like that number! We have encountered some ISPs that, instead of fragmenting those packets, were simply dropping them.
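
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 1492-byte limit assumes a PPPoE-style link (1500 minus the 8-byte PPPoE header), and the constant and function names are made up for this example.

```python
# Sizes quoted in the post. PPPOE_MTU is an assumption:
# 1500 bytes minus the 8-byte PPPoE header.
INNER_PACKET = 1428
PPPOE_MTU = 1492


def outer_packet_size(inner_packet: int, esp_overhead: int) -> int:
    """Size of the encrypted packet as it leaves the WAN interface."""
    return inner_packet + esp_overhead


for firmware, overhead in (("MX15", 64), ("MX18+", 68)):
    size = outer_packet_size(INNER_PACKET, overhead)
    verdict = "fits within" if size <= PPPOE_MTU else "exceeds"
    print(f"{firmware}: {size} bytes {verdict} a {PPPOE_MTU}-byte MTU")
```

With a 64-byte overhead the packet lands exactly on 1492 bytes; with 68 bytes it is 4 bytes over, so it has to be fragmented or it gets dropped.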

 

I haven't seen any documentation or changelog entry about those 4 new ESP bytes, but the 68-byte figure is now included in the MTU troubleshooting guide:

  1. If your packet is traversing over Auto VPN, you will need to account for the 68 byte overhead when determining MTU size.

https://documentation.meraki.com/General_Administration/Tools_and_Troubleshooting/Troubleshooting_MT...

 

So, heads up for MTU issues!

Cheers,

24 Replies
alemabrahao
Kind of a big deal

These latest versions are increasingly unstable. Several problems have been reported, and it seems that Meraki is doing little to resolve them. I'm still on version 16.16.9 because whenever I update, my network becomes unstable and I have to roll back.

I am not a Cisco Meraki employee. My suggestions are based on documentation of Meraki best practices and day-to-day experience.

ww
Kind of a big deal

It was already 68 bytes, so maybe something else was changed. The MX should take PPPoE into account; at least it did before.

RaphaelL
Kind of a big deal

PPPoE is 8 bytes.

1506 - 14 (L2 header) = 1492

1492 - 1428 = 64. The same test run on MX18 gives 68 bytes.

A 4-byte increase in ESP overhead looks like an upgrade from SHA1 to SHA256. I'm asking support to confirm how and why the ESP overhead has changed.
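
For what it's worth, the SHA1-to-SHA256 guess is consistent with the standard truncated HMAC lengths used for ESP integrity: HMAC-SHA-1-96 (RFC 2404) carries a 12-byte ICV and HMAC-SHA-256-128 (RFC 4868) carries a 16-byte ICV. Whether that is what Meraki actually changed is not confirmed anywhere in this thread; the sketch below only shows that the difference between the two is 4 bytes. The dictionary and function names are illustrative.

```python
# Integrity Check Value (ICV) lengths in bytes for two common ESP
# authentication algorithms, taken from the truncation lengths in the RFCs.
ICV_BYTES = {
    "HMAC-SHA-1-96": 96 // 8,      # RFC 2404
    "HMAC-SHA-256-128": 128 // 8,  # RFC 4868
}


def icv_delta(old: str, new: str) -> int:
    """Extra bytes per ESP packet when moving from `old` to `new`."""
    return ICV_BYTES[new] - ICV_BYTES[old]


print(icv_delta("HMAC-SHA-1-96", "HMAC-SHA-256-128"))  # 4
```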

ww
Kind of a big deal

I asked over a year ago, and they told me 68 bytes. I don't remember if it was on 15 or 16 firmware back then. Did you also test on 16.x?

RaphaelL
Kind of a big deal

Trying to pin that firmware on my test setup! I also asked support whether it has been this way since MX18 or only since our upgrade (a big jump from 15 -> 18).

RaphaelL
Kind of a big deal

From support : 

Yes this is expected for MX18+

I do not currently have the exact details for why the overhead has increased, but we have made improvements to the VPN design that have resulted in this overhead. There should be no expectation that this will ever return to the previous overhead size in firmware versions greater than MX18

 

Still chasing in which firmware version this was changed.

Bucket
Getting noticed

It is 4 bytes lower on MX16.

RaphaelL
Kind of a big deal

We changed the MTU of an MX hub (running 18.107) in a lab from 1500 to 1468. The advertised MSS is now 1360 (1468 - ESP overhead - IP - TCP = 1360).

A packet capture shows an ESP payload of 1400 bytes and a total WAN packet length of 1474. 1474 - L2 header = 1464, which again gives an ESP overhead of 64 bytes instead of 68 bytes. That's curious.

 

RaphaelL_0-1692709471039.png

RaphaelL_1-1692709506496.png

 

There is something funky going on, or I'm missing something obvious written in an RFC older than me.

HarryCalayan
Comes here often

We tried to reach out to Meraki support, but they stop responding when the case is related to MTU.

RaphaelL
Kind of a big deal

What are you experiencing ?

HarryCalayan
Comes here often

Our biometric devices at different sites cannot sync to our main office server, which they reach over AutoVPN. Once Meraki support identified that the biometric devices send packets above the MTU (I think around 1500 bytes), they stopped replying on the case and left the ticket open.

Thanks for your reply.

RaphaelL
Kind of a big deal

Is it over UDP or TCP ?

 

The MX should have done IP fragmentation. 

HarryCalayan
Comes here often

It's over TCP.

They just informed us that the DF bit in the header is set to 1. Unfortunately, we have no idea where to change any settings, since these are just biometric devices.

Some of our sites work fine: I can ping the other VPN subnet with 1500 bytes, and those biometric devices sync with the server. But some sites are dropping the packets from the biometric devices.

RaphaelL
Kind of a big deal

Hmm... take a packet capture; you will see that the TCP three-way handshake contains the MSS (maximum segment size for TCP). It will be the AutoVPN MTU - 40.

1500 - 68 - 40 = 1392 (if your WAN MTU is 1500 all across your AutoVPN domain); it might be lower.

That means the client is aware that it cannot send more than 1392 bytes to the destination over the VPN.
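
As a rough sketch of that calculation, assuming IPv4 and TCP headers without options (20 bytes each) and the 68-byte AutoVPN overhead discussed in this thread; the function name is illustrative:

```python
IP_HEADER = 20   # IPv4 header without options
TCP_HEADER = 20  # TCP header without options


def autovpn_mss(wan_mtu: int, esp_overhead: int = 68) -> int:
    """MSS a client should see advertised for traffic crossing the tunnel."""
    return wan_mtu - esp_overhead - IP_HEADER - TCP_HEADER


print(autovpn_mss(1500))  # 1392
print(autovpn_mss(1468))  # 1360
```

The second call reproduces the 1360-byte MSS seen in the lab test earlier in the thread, where the hub MTU was lowered to 1468.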

HarryCalayan
Comes here often

Yes, the client device is aware that it cannot send more than 1392 bytes, but based on the packet capture the device still keeps trying to send around 1500 bytes multiple times. Then we get a timeout error. Currently we have no idea what to do 😄 😄 😄

RaphaelL
Kind of a big deal

If the client is not honoring the MSS it received, then it is not a Meraki issue. I would contact the vendor with the appropriate pcaps.

However, I have never seen an IP stack behave like that, so I have some doubts.

HarryCalayan
Comes here often

When I tried to ping the spoke MX with 1500 bytes, it gave the packet fragmentation message, which is expected. But when I tried sizes from around 1399 up to around 1450 bytes, it just timed out.

Edit: 1398 works, but 1400 times out.

HarryCalayan_0-1693494456765.png

HarryCalayan_0-1693495183293.png

 

 

Edit: I have 2 different spoke MXs here that behave differently when I ping with 1500 bytes.

HarryCalayan_1-1693495485989.png
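
A size sweep like this can be automated with a binary search. The sketch below is hypothetical and not something posted in the thread: `ping_df` assumes the Linux iputils `ping` flags (`-M do` to set DF, `-s` for the payload size), and it assumes every size up to some threshold passes and every size above it fails. For IPv4 without options, the IP packet size is the ping payload plus 28 bytes (20 IP + 8 ICMP).

```python
import subprocess
from typing import Callable


def ping_df(host: str, payload: int) -> bool:
    """Send one ping with DF set. Assumes Linux iputils ping flags."""
    cmd = ["ping", "-c", "1", "-W", "2", "-M", "do", "-s", str(payload), host]
    return subprocess.run(cmd, capture_output=True).returncode == 0


def largest_passing_payload(probe: Callable[[int], bool], lo: int, hi: int) -> int:
    """Binary search for the largest size in [lo, hi] for which probe() succeeds.

    Returns lo - 1 if even the smallest size fails.
    """
    best = lo - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if probe(mid):
            best = mid      # mid passed, try larger sizes
            lo = mid + 1
        else:
            hi = mid - 1    # mid failed, try smaller sizes
    return best


# Example (not run here; 192.0.2.1 is a placeholder address):
# largest_passing_payload(lambda n: ping_df("192.0.2.1", n), 1300, 1472)
```

Running the same sweep against each spoke would show whether the two sites really differ in the largest size that gets through.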

 

HarryCalayan
Comes here often

I don't think that MX is informing the client device that the packet needs to be fragmented.

HarryCalayan_0-1693577138995.png

HarryCalayan_1-1693577147357.png

 

RaphaelL
Kind of a big deal

Because there is no fragmentation happening.  1392 + 68 + 40 = 1500.

HarryCalayan
Comes here often

Does that mean the MX just drops the packet? Thanks

RaphaelL
Kind of a big deal

Why would it? The payload size received is smaller than or equal to its WAN / AutoVPN MTU.

HarryCalayan
Comes here often

According to Meraki support, the packet is not received on the other side of the VPN. So we have no idea.

RaphaelL
Kind of a big deal

Take a packet capture on the spoke and the hub; you will see whether the packet is received or not.

Does your WAN support a 1500-byte MTU? On both sides?

HarryCalayan
Comes here often

Yes, the WANs of both the hub and the spoke support 1500 bytes.
