Troubleshooting MTU size over IPSEC VPN

I recently deployed a couple of wireless access points to two sites that connect to our main office over IPSEC VPN. After a recent firmware update to the wireless controller both access points got stuck in a provisioning loop and appeared to have difficulty communicating with the controller. Both AP’s repeatedly disconnected due to a “heartbeats lost” error.

Connectivity between the main office and the remote sites appeared fine. Both access points were reachable via ping and ssh. I set up a packet debug on both sites’ firewalls and saw traffic going back and forth between the access points and the controller, and both access points appeared on the controller status window, alternating between “Provisioning” and “Disconnected”.

Needless to say I was slightly baffled.

I opened a ticket with the wireless vendor and (very quickly) received an answer. The MTU for CAPWAP traffic between the access points and the controller is hard set by the controller to 1500*. With these sites connected via IPSEC, that was going to cause some fragmentation due to the overhead that IPSEC was going to add onto the traffic going between sites.

I needed to lower the MTU size on the controller, but to what value? IPSEC doesn’t seem to have a ‘fixed’ header size due to the different encryption options that can be used. So how do I find out exactly how much our particular IPSEC configuration is adding?

ping -f

The -f flag from a Windows command prompt prevents an ICMP packet from being fragmented. This, combined with the -l flag allows you to set the size of the ICMP packet being sent.

So, assuming a standard ethernet MTU of 1500, and accounting for an 8-byte ICMP header, and 20-byte IP header, I should be able to send an ICMP packet sized to 1472 bytes, but 1473 should be too large:

C:\Users\netcanuck>ping 172.16.32.1 -f -l 1472

Pinging 172.16.32.1 with 1472 bytes of data:
Reply from 172.16.32.1: bytes=1472 time=3ms TTL=251
Reply from 172.16.32.1: bytes=1472 time=4ms TTL=251
Reply from 172.16.32.1: bytes=1472 time=4ms TTL=251
Reply from 172.16.32.1: bytes=1472 time=3ms TTL=251

C:\Users\netcanuck>ping 172.16.32.1 -f -l 1473

Pinging 172.16.32.1 with 1473 bytes of data:
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.

Excellent! So now to test across our IPSEC tunnel:

C:\Users\netcanuck>ping 172.16.68.1 -f -l 1472

Pinging 172.16.68.1 with 1472 bytes of data:
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.

Now this makes sense. The MTU size does not account for the IPSEC overhead.

After some testing with different packet sizes I hit on the magic number: 1384 bytes. At 1385 the packets were again rejected as being too large. So some quick math:

ICMP payload: 1384 bytes

ICMP header: 8 bytes

IP header: 20 bytes

Subtotal: 1412 bytes

This leaves 88 bytes as the IPSEC header. I should be able to set the MTU size on the controller to 1412 and the access points should resume functioning normally.

I did in fact set the MTU to 1400 – I like nice, round numbers – and sure enough both access points resumed proper communication with the controller.

What I Learned Today

Sometimes the simple tools are easy to overlook. Using a standard Windows command prompt and ping using the -f  flag is a quick and easy way to diagnose MTU and fragmentation issues across a VPN tunnel.

* It appears from the support documentation for this particular wireless vendor that the MTU size should be 1450 by default which should take into account at least some overhead and explains why these access points were working fine until now. The firmware update seems to have changed this to 1500.

12 thoughts on “Troubleshooting MTU size over IPSEC VPN

  1. Do you mind if I quote a couple of your articles as long as I provide credit and sources back to your blog? My blog is in the very same area of interest as yours and my users would certainly benefit from a lot of the information you provide here. Please let me know if this okay with you. Thanks a lot!

  2. Hi,

    Could you explain, how didi you count this IPSec overhead bytes.
    Say if i am sending 46Byte clear text msg from simulator then what would be the exact msg size with ipsec overhead.

    Thanks,
    Santosh

  3. Hello, I have simliar issue where AP is behind the firewall and WLC is at the hub side. We have ipsec tunnel between router and firewall.AP is not getting registered to WLC . I did change MTU at the WLC to 1400. My WLC is Ruckus only. can you please advise me any other thing here? its been 4 days since i am working on this..

  4. Hello, I have similar issue where AP is behind the firewall and WLC is at the hub side. We have ipsec tunnel between router and firewall.AP is not getting registered to WLC . I did change MTU at the WLC to 1400. My WLC is Ruckus only. can you please advise me any other thing here? its been 4 days since i am working on this..Thnkx for inputs!

  5. On a unix machine (a Mac running 10.9.5 in my case) you can use -D for don’t fragment and -s for the size of the packet.

    Ben

  6. I think we have the same problem, but with a netgear WC 7600 controller.
    Has anybody find a sollution to set packetsize on the netgear?

    regards,

    Laurens

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s