OpenVPN/Troubleshooting

From Secure Computing Wiki
Revision as of 13:38, 26 October 2010 by Ecrist

Troubleshooting notes for various issues. This page should be expanded into some sort of hierarchy once there is more than one topic here.

MTU Issues

MTU issues can manifest themselves in all kinds of annoying ways that are difficult to diagnose and look like other problems. That being said, identifying an MTU problem isn't too difficult if you look for it specifically.

Common Symptoms

  • Connections hang "mid-stream".
  • Some traffic works fine (e.g. SSH sessions), but other traffic fails (e.g. file transfers).
  • Inconsistent behavior within the same protocol, such as some web pages loading just fine while most fail, or an SSH connection that authenticates fine and then hangs when you do something like catting a file.

What causes the problem?

Without getting too deep into network theory, an MTU (Maximum Transmission Unit) defines the largest packet (Layer 3 PDU) that may be sent over a given network segment. This maximum may be based on a hardware limitation (10/100 Ethernet networks almost always have a 1500-byte MTU), or on a limit set arbitrarily in software. As a packet crosses an internetwork, it may cross segments that have differing MTUs. This is normally not a problem, because there are mechanisms to handle these cases.

In the olden days, it was common for routers to break up packets into manageably sized fragments before sending them over a network segment with a too-small MTU. However, this is costly in terms of router resources: breaking up packets at one end and reassembling them at the other is a lot of work. More often today, packets are sent with the Don't Fragment (DF) flag set. When a router receives such a packet and cannot transmit it because of its size, it drops the packet and sends an ICMP message back to the sending host, telling it that it must re-send using an acceptable packet size (indicated in the ICMP message).

Normally this process works well. It runs into problems when the ICMP message doesn't make it back to the sender. This can be caused by any number of issues, including misconfigured firewalls that block or ignore ICMP "Fragmentation needed" messages, VLAN misconfigurations, MTU configuration issues on intermediary routers, etc. The end result is always the same: the sender doesn't know that it needs to send smaller packets, so it happily keeps retransmitting big packets until the connection dies.
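
As a rough illustration of the behaviour described above, the following Python sketch models a path as a list of per-segment MTUs. The segment values and the function names are made up for illustration, and real routers are considerably more involved.

```python
def path_mtu(segment_mtus):
    """The Path MTU is the smallest MTU of any segment along the path."""
    return min(segment_mtus)


def send(packet_len, segment_mtus, dont_fragment=True):
    """Walk a packet across each segment and report what happens.

    Returns ("delivered", None), ("fragmented", mtu) or ("frag_needed", mtu),
    where mtu is the MTU of the first segment the packet did not fit into.
    """
    for mtu in segment_mtus:
        if packet_len > mtu:
            if dont_fragment:
                # The router drops the packet and reports the usable size
                # in an ICMP "Fragmentation needed" message.
                return ("frag_needed", mtu)
            # Without the DF flag the router splits the packet itself.
            return ("fragmented", mtu)
    return ("delivered", None)


# Hypothetical path: a LAN, a link with a reduced MTU, then another LAN.
segments = [1500, 1400, 1500]
print(path_mtu(segments))      # 1400
print(send(1300, segments))    # ('delivered', None)
print(send(1500, segments))    # ('frag_needed', 1400)
```

The MTU problem discussed on this page is the case where the "frag_needed" answer is generated but never reaches the sender.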

How do I identify an MTU issue?

The best indicator of an MTU problem is that small packets flow fine, but large packets never reach their destination. This makes a packet dump utility such as tcpdump or Wireshark your best weapon for hunting down the problem. You can use it to look for packets that leave the source host and never arrive at the destination host. It is very useful if you can run a packet dump at each endpoint. Sometimes MTU problems (misconfiguration issues in particular) only manifest themselves in one direction, so a packet of length x may pass in one direction, but not in the other. In fact, the more points you control (and can dump packets at) along the path, the better.

Look at the traffic dump while generating some relevant traffic. If there is an MTU problem, you will likely see the initial TCP handshake (SYN-SYN/ACK-ACK) succeed, followed by a bunch of retransmitted packets when big packets start getting transmitted. Good candidates for generating traffic include Web pages with images, FTP transfers, and even interactive SSH sessions (should log in fine, then croak when you cat a large file).
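
If you have captures from both endpoints, the comparison can be sketched in Python as below. This is only a heuristic under stated assumptions: each capture has already been reduced to a list of (key, length) pairs, where key is whatever identifies the same packet in both dumps (a TCP sequence number, for example). The function names and the sample data are hypothetical.

```python
def missing_at_receiver(sent, received):
    """Return the (key, length) pairs that left the sender but never arrived."""
    arrived = {key for key, _ in received}
    return [(key, length) for key, length in sent if key not in arrived]


def looks_like_mtu_problem(sent, received):
    """True if every lost packet is larger than every delivered packet."""
    arrived = {key for key, _ in received}
    delivered = [length for key, length in sent if key in arrived]
    lost = [length for key, length in sent if key not in arrived]
    if not delivered or not lost:
        return False
    return min(lost) > max(delivered)


# Made-up example: the small packets arrive, the 1500-byte packets do not.
sent = [(1, 60), (2, 52), (3, 1500), (4, 1500), (5, 400)]
received = [(1, 60), (2, 52), (5, 400)]
print(missing_at_receiver(sent, received))     # [(3, 1500), (4, 1500)]
print(looks_like_mtu_problem(sent, received))  # True
```

Random packet loss would not show such a clean size cutoff, which is why the size comparison is a useful first check.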

I think there's an MTU issue, what now?

There are a couple of steps that will help you solve your problem.

  1. Find the maximum packet size (called the Path MTU) that will go from point A to point B and back.
  2. Find out where the traffic flow breaks down with larger packets (track down the culprit).

How do I find the Path MTU?

The ping utility is another great tool for tracking down MTU problems because you can specify an arbitrary packet size. Ping your destination host with increasingly large packets until you either get a "Fragmentation needed" message or no response at all. Then decrease the packet size in smaller increments until you once again get a response. For example, I generate pings with payload sizes of 100, 400, 1000, 1200, and 1300 bytes. On Linux this is done like this:
tom@edge:~-> ping -c 1 -s 100 -M do foobar-1
tom@edge:~-> ping -c 1 -s 400 -M do foobar-1
tom@edge:~-> ping -c 1 -s 1000 -M do foobar-1
tom@edge:~-> ping -c 1 -s 1200 -M do foobar-1
tom@edge:~-> ping -c 1 -s 1300 -M do foobar-1
On BSD/OS X this would look like:
ping -c 1 -s 400 -D foobar-1
ping -c 1 -s 1000 -D foobar-1
ping -c 1 -s 1200 -D foobar-1
ping -c 1 -s 1300 -D foobar-1
And on Windows (where -n sets the count, -l the payload size, and -f the Don't Fragment flag):
ping -n 1 -l 400 -f foobar-1
ping -n 1 -l 1000 -f foobar-1
ping -n 1 -l 1200 -f foobar-1
ping -n 1 -l 1300 -f foobar-1
Looking at this with tcpdump, I see the following packet flow. Notice that the displayed IP packet length is 28 bytes larger than the specified payload size. This is due to the overhead of the IP header (20 bytes) and the ICMP header (8 bytes).
edge:~# tcpdump -i eth0 -tvn host foobar-1 or icmp
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 128)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6863, seq 1, length 108
IP (tos 0x0, ttl 53, id 15136, offset 0, flags [DF], proto ICMP (1), length 128)
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6863, seq 1, length 108
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 428)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6864, seq 1, length 408
IP (tos 0x0, ttl 53, id 15150, offset 0, flags [DF], proto ICMP (1), length 428)
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6864, seq 1, length 408
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1028)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6865, seq 1, length 1008
IP (tos 0x0, ttl 53, id 15236, offset 0, flags [DF], proto ICMP (1), length 1028)
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6865, seq 1, length 1008
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1228)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6866, seq 1, length 1208
IP (tos 0x0, ttl 53, id 15258, offset 0, flags [DF], proto ICMP (1), length 1228)
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6866, seq 1, length 1208
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1328)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6867, seq 1, length 1308
IP (tos 0x0, ttl 53, id 15272, offset 0, flags [DF], proto ICMP (1), length 1328)
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6867, seq 1, length 1308
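
The 28-byte offset can be checked against the dump above with a few lines of Python. The constants assume an IPv4 header without options, and the function names are purely illustrative.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_length(payload):
    """Length tcpdump prints on the ICMP line (ICMP header + payload)."""
    return payload + ICMP_HEADER


def ip_length(payload):
    """Length tcpdump prints on the IP line (IP header + ICMP header + payload)."""
    return payload + ICMP_HEADER + IP_HEADER


for payload in (100, 400, 1000, 1200, 1300):
    print(payload, icmp_length(payload), ip_length(payload))
# 100 108 128
# 400 408 428
# 1000 1008 1028
# 1200 1208 1228
# 1300 1308 1328
```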

The next test is where things get interesting. We attempt to send a 1400-byte payload and receive no response. This indicates that we have attempted to send a packet that is too large for a hop along the path. Next, we decrease the payload size until we again get a response: 1380 bytes still goes unanswered, but 1360 bytes gets a reply.

IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1428)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6883, seq 1, length 1408
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1408)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6885, seq 1, length 1388
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1388)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6886, seq 1, length 1368
IP (tos 0x0, ttl 53, id 23790, offset 0, flags [DF], proto ICMP (1), length 1388)
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6886, seq 1, length 1368
Continued trial and error shows us that payload size 1370 works, 1374 and 1373 are too large, but 1372 works.
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1398)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6896, seq 1, length 1378
IP (tos 0x0, ttl 53, id 27893, offset 0, flags [DF], proto ICMP (1), length 1398)
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6896, seq 1, length 1378
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1402)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6899, seq 1, length 1382
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1401)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6910, seq 1, length 1381
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 1400)
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6912, seq 1, length 1380
IP (tos 0x0, ttl 53, id 33714, offset 0, flags [DF], proto ICMP (1), length 1400)
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6912, seq 1, length 1380
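
Matching requests to replies by eye gets tedious with longer dumps. The following Python sketch does the same matching for tcpdump text in the format shown above. The regular expression is an assumption based on this particular output format and may need adjusting for other tcpdump versions or options.

```python
import re

ICMP_HEADER = 8  # bytes, subtracted to recover the ping payload size

# Matches the second line of each packet in the dump format shown above.
ECHO = re.compile(r"ICMP echo (request|reply), id (\d+), seq (\d+), length (\d+)")


def summarize(dump_text):
    """Return (answered, unanswered) lists of echo request payload sizes."""
    requests = {}
    replied = set()
    for kind, icmp_id, seq, length in ECHO.findall(dump_text):
        key = (int(icmp_id), int(seq))
        if kind == "request":
            requests[key] = int(length) - ICMP_HEADER
        else:
            replied.add(key)
    answered = sorted(p for k, p in requests.items() if k in replied)
    unanswered = sorted(p for k, p in requests.items() if k not in replied)
    return answered, unanswered


dump = """
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6896, seq 1, length 1378
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6896, seq 1, length 1378
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6899, seq 1, length 1382
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6910, seq 1, length 1381
    10.32.32.30 > 192.168.55.66: ICMP echo request, id 6912, seq 1, length 1380
    192.168.55.66 > 10.32.32.30: ICMP echo reply, id 6912, seq 1, length 1380
"""
print(summarize(dump))  # ([1370, 1372], [1373, 1374])
```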

So 1400 bytes (20B IP header + 8B ICMP header + 1372B payload) appears to be the largest packet that can be sent end-to-end.
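
The trial and error above is essentially a search, and it can be automated. Below is a minimal Python sketch using a binary search. It assumes that if a given payload size gets through, every smaller size does too. The linux_probe helper simply wraps the Linux ping flags used earlier on this page; both function names are hypothetical, and the default upper bound of 1472 is just 1500 minus the 28 bytes of headers.

```python
import subprocess


def linux_probe(host, payload):
    """Send one echo request with DF set, using the Linux ping flags shown earlier."""
    result = subprocess.run(
        ["ping", "-c", "1", "-s", str(payload), "-M", "do", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_working_payload(probe, low=0, high=1472):
    """Binary search for the largest payload for which probe(payload) is True.

    Returns None if even the smallest payload fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid        # mid got through, search above it
        else:
            high = mid - 1   # mid was too large, search below it
    return low


# Offline demonstration with a stand-in probe that mimics the path above.
print(largest_working_payload(lambda size: size <= 1372))  # 1372
```

Against a real host this would be called as largest_working_payload(lambda size: linux_probe("foobar-1", size)), and adding 28 bytes to the result gives the Path MTU.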