Hi Tom,

Ran this past engineering:

> VLAN is layer-two information. A VLAN packet has a different type in the
Ethernet header, which is read by the card driver. So a VLAN-aware driver will
allow a packet whose physical size is 1518 bytes (1500 bytes of payload +
2*6 bytes of Ethernet addresses + 2 bytes of type/len + 4 bytes of VLAN extra
info) instead of the normal 1514.

Yep.  Drivers typically accept even larger frames than this, since they also
allow for frames carrying bridge headers.
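
To make the byte counting above concrete, here is a rough Python sketch.  The
constant and function names are mine, purely for illustration, and the sizes
follow the quoted text in leaving out the 4-byte FCS and the preamble.

```python
# Frame sizes as counted in the quoted text (no FCS, no preamble).
ETH_ADDRS = 2 * 6      # destination + source MAC addresses
ETH_TYPE = 2           # type/length field
VLAN_TAG = 4           # 802.1Q tag ("VLAN extra info")
PAYLOAD_MTU = 1500     # standard Ethernet payload


def frame_size(payload: int, vlan: bool = False) -> int:
    """Bytes in one frame, excluding FCS and preamble."""
    size = ETH_ADDRS + ETH_TYPE + payload
    if vlan:
        size += VLAN_TAG
    return size


untagged = frame_size(PAYLOAD_MTU)
tagged = frame_size(PAYLOAD_MTU, vlan=True)
print(untagged, tagged)  # 1514 1518
```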

> On the other hand, IPSec (more precisely ESP and AH) is an IP protocol, i.e.
the Ethernet driver knows nothing about it. An IPSec packet can be transported
in an Ethernet packet, a VLAN packet or over a PPP connection. It is IP. Plus,
the overhead of IPSec is a lot more than 4 bytes, more like 40 bytes or so, but
I don't remember the exact value.

It's about 32-36 bytes for an ESP packet, depending on the packet and without
compression.
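
As a rough illustration only, taking the 32-36 byte estimate above at face
value (it is an estimate, not a measurement, and the real overhead depends on
the cipher and mode), the room left inside a tunnel over a 1500-byte link works
out as follows.  The names are hypothetical.

```python
LINK_MTU = 1500
ESP_OVERHEAD_RANGE = (32, 36)  # bytes; the estimate from this thread, not measured


def inner_mtu(link_mtu: int, overhead: int) -> int:
    """Largest inner packet that still fits in one outer packet."""
    return link_mtu - overhead


best = inner_mtu(LINK_MTU, ESP_OVERHEAD_RANGE[0])
worst = inner_mtu(LINK_MTU, ESP_OVERHEAD_RANGE[1])
print(worst, best)  # 1464 1468
```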

> So my recollection is as follows:
>
> - the unpatched drivers on our Linux box were dumb and would simply drop
packets that were too big.

That used to be the case, but has been resolved by the community.  Of course,
commercial vendors that run Linux solved this long ago, too.  A Linux system
will display and use the proper MTU.

> But this has no bearing on IPSec. This is a different ball game. And that's
why I was asking the question: what is it for? To create tunnels for you, which
need to have a 1500 MTU? Or to create tunnels for the customers, in which case
it is a non-issue: they'll have to deal with the lower MTU of the IPSec tunnel
and most of the time it just works (thanks to path MTU discovery).

As an expansion on that point, the path MTU is just as important: if you have a
bottleneck somewhere in the middle of the network that only accepts a smaller
packet, you'll encounter problems.  Path MTU discovery can help, but it depends
on ICMP "fragmentation needed" messages getting back to the sender, so it is
unreliable and not always available.

> To clarify. The MTU is only the size of the payload. It doesn't take into
account the Ethernet header. Of course, the IP header, TCP/UDP header, etc. are
considered payload for ethernet and indeed counted in the ethernet payload.

Not quite.  The MTU is the size of the packet less the link-layer header, as
you mentioned above: it counts the entire IP packet, with the IP and TCP/UDP
headers attached.  The MSS is the payload-only value: the largest TCP segment
payload, i.e. the MTU less the IP and TCP headers.  Each side advertises its
MSS during the SYN and SYN/ACK phases of TCP.
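
To put numbers on the MTU/MSS distinction, a small sketch assuming IPv4 and
TCP headers of 20 bytes each with no options (an assumption; options shrink
the MSS further).  The function name is made up for this example.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu: int) -> int:
    """TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


print(mss_for_mtu(1500))  # 1460
print(mss_for_mtu(1496))  # 1456
```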

> There are two MTUs to consider: the MTU of the underlying Ethernet interface
and the MTU of the VLAN interface itself. The second MTU is the "effective" MTU,
the one seen by applications and networks using this interface. The first MTU
is the one of the hardware interface.

I think that calling the second value an MTU is a misnomer.  The IPSec interface
has an MTU that is an actual MTU (not an "effective" one), and it will be lower
than either the VLAN or Ethernet interface upon which that VPN rides.

> The trick used by StarOS is to reduce the "effective" MTU.

I think the term you are looking for is MSS.  Either way, the result is the
same: you get less payload so that the packet (headers+payload) fits within a
"normal" MTU.

> That frees 4 bytes of payload for the header to expand into, without the
underlying interface having to be aware of it. If it were possible, leaving the
effective MTU at the same value and increasing the underlying interface MTU by
4 bytes would have the same effect.

Exactly, though just to be clearer, you're talking about dropping the MSS, which
would lower the MTU as well (all other things being equal).
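
The equivalence of the two approaches can be checked with a quick sketch.  It
assumes a 14-byte Ethernet header and a 4-byte VLAN tag, and models a
VLAN-unaware interface as one that accepts frames up to header + MTU; the
function names are hypothetical.

```python
ETH_HEADER = 14  # 2*6 address bytes + 2 type/len bytes
VLAN_TAG = 4


def tagged_frame(vlan_mtu: int) -> int:
    """Size of a full-sized tagged frame on a VLAN with this MTU."""
    return ETH_HEADER + VLAN_TAG + vlan_mtu


def max_frame(underlying_mtu: int) -> int:
    """Largest frame a VLAN-unaware interface with this MTU accepts."""
    return ETH_HEADER + underlying_mtu


# Option A: lower the VLAN ("effective") MTU by 4.
fits_a = tagged_frame(1496) <= max_frame(1500)
# Option B: raise the underlying interface MTU by 4.
fits_b = tagged_frame(1500) <= max_frame(1504)
# Neither: both left at 1500, the tagged frame is 4 bytes too long.
fits_neither = tagged_frame(1500) <= max_frame(1500)
print(fits_a, fits_b, fits_neither)  # True True False
```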

> The proper VLAN-aware drivers show a 1500 MTU for both the underlying
interface and the VLAN interface, but they treat VLAN packets with caution, so
as not to truncate or drop them because of their longer size.

If that's true, then it isn't a "proper VLAN aware driver."  The MTU should be
set correctly and not just show 1500 and use something else.

> >  I know the gigabit ports would, but not the Mikrotik 100mbps ports?

Actually, not all GigE ports will have jumbo frames enabled.  It's not a safe
assumption that your packets won't get fragmented on a GigE port.

> > So I'm not even sure how to test :-)

> You have to prevent or detect fragmentation to know what's going on.  With
ping, the option '-M do' will set the DF flag (don't fragment).

> The test is to see that without fragmentation, you can ping with '-s 1468' and
not with '-s 1472'. This would indicate a VLAN MTU issue.
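
The two sizes in that test fall out of the header arithmetic: ping's '-s' is
the ICMP payload, so the packet on the wire is 28 bytes larger (20 bytes of
IPv4 header without options plus 8 bytes of ICMP header).  A sketch, with
hypothetical helper names and a host address borrowed from the capture in this
thread:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, echo request/reply header


def ping_payload(mtu: int) -> int:
    """Largest '-s' value that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(host: str, mtu: int) -> str:
    """Linux ping invocation with DF set, sized to exactly fill the MTU."""
    return f"ping -M do -s {ping_payload(mtu)} {host}"


print(ping_payload(1500))                  # 1472
print(ping_payload(1496))                  # 1468
print(ping_command("10.0.162.3", 1496))    # ping -M do -s 1468 10.0.162.3
```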

> Sniffing with tcpdump, where appropriate, is also very informative. In
particular look at the flags: [DF] means that the don't-fragment flag is set,
[+] means that the more-fragments flag is set (i.e. the message is fragmented
and more fragments follow). Examples:

> # sudo tcpdump -i eth4 -l -n -v icmp
> tcpdump: listening on eth4, link-type EN10MB (Ethernet), capture size 68 bytes
> 19:05:27.714176 IP (tos 0x0, ttl  64, id 12940, offset 0, flags [DF],
> length: 1500) 10.0.162.1 > 10.0.162.3: icmp 1480: echo request seq 0
> 19:05:27.761057 IP (tos 0x0, ttl  32, id 56852, offset 0, flags [DF],
> length: 1500) 10.0.162.3 > 10.0.162.1: icmp 1480: echo reply seq 0
> 19:05:43.667823 IP (tos 0x0, ttl  64, id 62485, offset 0, flags [+],
> length: 1500) 10.0.162.1 > 10.0.162.3: icmp 1480: echo request seq 0
> 19:05:43.667834 IP (tos 0x0, ttl  64, id 62485, offset 1480, flags [none],
> length: 21) 10.0.162.1 > 10.0.162.3: icmp
> 19:05:44.665582 IP (tos 0x0, ttl  64, id 52822, offset 0, flags [+],
> length: 1500) 10.0.162.1 > 10.0.162.3: icmp 1480: echo request seq 256
> 19:05:44.665592 IP (tos 0x0, ttl  64, id 52822, offset 1480, flags [none],
> length: 21) 10.0.162.1 > 10.0.162.3: icmp
> 19:09:11.485566 IP (tos 0x0, ttl  64, id 25938, offset 0, flags [+],
> length: 1500) 10.0.162.1 > 10.0.162.4: icmp 1480: echo request seq 768
> 19:09:11.485576 IP (tos 0x0, ttl  64, id 25938, offset 1480, flags [none],
> length: 21) 10.0.162.1 > 10.0.162.4: icmp
> 19:09:11.492506 IP (tos 0x0, ttl  64, id 18866, offset 0, flags [+],
> length: 1500) 10.0.162.4 > 10.0.162.1: icmp 1480: echo reply seq 768
> 19:09:11.492811 IP (tos 0x0, ttl  64, id 18866, offset 1480, flags [none],
> length: 21) 10.0.162.4 > 10.0.162.1: icmp

This is the best suggestion for finding the problem, if you know (or suspect)
which node is causing trouble.  You can also use other tools like tcpspray to
make sure that the problem isn't ICMP-specific (or different for ICMP than for
TCP).
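
If you end up with a long capture, the flag reading described above can be
scripted.  This is only a sketch: it assumes each record has been joined onto
one line and still carries the "offset N, flags [...]" text seen in the capture
above, and the function and label names are my own.

```python
import re

# Matches the "offset N, flags [X]" part of a verbose tcpdump IP record.
RECORD = re.compile(r"offset (\d+), flags \[([^\]]*)\]")


def classify(record: str) -> str:
    """Label one joined tcpdump record by its fragmentation state."""
    m = RECORD.search(record)
    if m is None:
        return "unparsed"
    offset, flags = int(m.group(1)), m.group(2)
    if "+" in flags or offset > 0:
        return "fragment"            # more fragments follow, or a later fragment
    if "DF" in flags:
        return "unfragmented, DF set"
    return "unfragmented"


# Three records copied from the capture above, joined onto one line each.
samples = [
    "19:05:27.714176 IP (tos 0x0, ttl  64, id 12940, offset 0, flags [DF], "
    "length: 1500) 10.0.162.1 > 10.0.162.3: icmp 1480: echo request seq 0",
    "19:05:43.667823 IP (tos 0x0, ttl  64, id 62485, offset 0, flags [+], "
    "length: 1500) 10.0.162.1 > 10.0.162.3: icmp 1480: echo request seq 0",
    "19:05:43.667834 IP (tos 0x0, ttl  64, id 62485, offset 1480, flags [none], "
    "length: 21) 10.0.162.1 > 10.0.162.3: icmp",
]
for line in samples:
    print(classify(line))
```

On those three records it reports one unfragmented packet with DF set followed
by two fragments, which matches reading the flags by eye.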

Jeff



Tom DeReggi
RapidDSL & Wireless, Inc
IntAirNet- Fixed Wireless Broadband


--
WISPA Wireless List: wireless@wispa.org

Subscribe/Unsubscribe:
http://lists.wispa.org/mailman/listinfo/wireless

Archives: http://lists.wispa.org/pipermail/wireless/