Thread: [6bed4-devel] Tunnel MTU fixed at 1280

Hey,

In section 3.5 it is stated tunnel MTU is set to 1280 as a MUST. What's the reasoning behind this requirement?

Joost

Hi Joost,

> In section 3.5 it is stated tunnel MTU is set to 1280 as a MUST. What's the reasoning behind this requirement?

This is the minimum MTU that's guaranteed for IPv6 traffic.
In an anycast system, there's no way of knowing what server
will handle the traffic, which could lead to trouble.  So
instead of risking that, I've specified the minimum MTU as
the reliable value.

You're asking about the MUST -- are you thinking it should
be a SHOULD?  IPv6 gets pretty mad when breaking up occurs,
AFAIK.  It basically flags down the communication channel
over ICMPv6.

-Rick

On Nov 26, 2011, at 1:49 PM, Rick van Rein wrote:

> Hi Joost,
> 
>> In section 3.5 it is stated tunnel MTU is set to 1280 as a MUST. What's the reasoning behind this requirement?
> 
> This is the minimum MTU that's guaranteed for IPv6 traffic.
> In an anycast system, there's no way of knowing what server
> will handle the traffic, which could lead to trouble.  So
> instead of risking that, I've specified the minimum MTU as
> the reliable value.
> 
> You're asking about the MUST -- are you thinking it should
> be a SHOULD?  IPv6 gets pretty mad when breaking up occurs,
> AFAIK.  It basically flags down the communication channel
> over ICMPv6.

I don't see a benefit in deviating from the IPv6 protocol here and would say a minimum MTU of 1280 is required. If the MTU is too large, let the v6 layer handle the problem.

Joost

On 26/11/11 13:02, Joost Lek wrote:
> On Nov 26, 2011, at 1:49 PM, Rick van Rein wrote:
>
> I don't see a benefit in deviating from the IPv6 protocol here and would say a minimum MTU of 1280 is required. If the MTU is too large, let the v6 layer handle the problem.

That could lead to a relay reassembling IPv4 packets before forwarding 
them, the IPv6 MTU should be limited to IPv4 MTU-overhead. The client 
MUST perform IPv4 MTU discovery reliably, this is difficult given some 
NAT routers don't issue ICMP Fragmentation Needed errors, but instead 
adjust the TCP MSS.

The route from the relay to client is more difficult, IPv4 MTU discovery 
by the relay is impossible, and there is a risk that packets from 
different relays will have identical identification fields and be mixed up 
on reassembly.  Perhaps the client could respond to IPv4 fragments with 
IPv6 Too Big.

On Nov 30, 2011, at 12:13 AM, Timothy Baldwin wrote:
> That could lead to a relay reassembling IPv4 packets before forwarding 
> them, the IPv6 MTU should be limited to IPv4 MTU-overhead. The client 
> MUST perform IPv4 MTU discovery reliably, this is difficult given some 
> NAT routers don't issue ICMP Fragmentation Needed errors, but instead 
> adjust the TCP MSS.
> 
> The route from the relay to client is more difficult, IPv4 MTU discovery 
> by the relay is impossible, and there is a risk that packets from 
> different relays will have identical identification fields and be mixed up 
> on reassembly.  Perhaps the client could respond to IPv4 fragments with 
> IPv6 Too Big.

I agree that the ideal situation would be to determine the IPv4 MTU and then set the IPv6 MTU to that, minus the IPv4+UDP overhead. But as you already stated, determining the IPv4 MTU is difficult at best, and will regularly prove to be impossible. With that option unavailable, I agree with Rick's view that 1280 should be a very reasonable default.

On the subject of fragmented v4 packets: won't that be transparent for the tunnel client unless the DF bit is set? I can see that there will be a performance penalty to pay, but I think (reliable) connectivity should be the first priority.

Joost

On 30/11/11 09:22, Joost Lek wrote:
> On Nov 30, 2011, at 12:13 AM, Timothy Baldwin wrote:
>> That could lead to a relay reassembling IPv4 packets before forwarding
>> them, the IPv6 MTU should be limited to IPv4 MTU-overhead. The client
>> MUST perform IPv4 MTU discovery reliably, this is difficult given some
>> NAT routers don't issue ICMP Fragmentation Needed errors, but instead
>> adjust the TCP MSS.
>>
>> The route from the relay to client is more difficult, IPv4 MTU discovery
>> by the relay is impossible, and there is a risk that packets from
>> different relays will have identical identification fields and be mixed up
>> on reassembly.  Perhaps the client could respond to IPv4 fragments with
>> IPv6 Too Big.

That won't work reliably, as middleboxes often reassemble packets.
> I agree that the ideal situation would be to determine the IPv4 MTU and then set the IPv6 MTU to that, minus the IPv4+UDP overhead. But as you already stated, determining the IPv4 MTU is difficult at best, and will regularly prove to be impossible. With that option unavailable, I agree with Rick's view that 1280 should be a very reasonable default.
>
> On the subject of fragmented v4 packets: won't that be transparent for the tunnel client unless the DF bit is set? I can see that there will be a performance penalty to pay, but I think (reliable) connectivity should be the first priority.

Unless the relay implements RFC 4821 (and is therefore not stateless), I 
agree the best option is for it to set DF=0. The point I was making is that 
IPv4 fragmentation should be avoided by setting a small IPv6 MTU. 
Traditional path MTU discovery does not work for anycast sources: the 
ICMP Fragmentation Needed packets will go to the wrong relay.

IPv4 fragment reassembly is unreliable for anycast sources. Consider the 
case of two relays each sending a packet to the same 6bed4 client: the 
IPv4 addresses of these packets will be identical, and there is a 
1 in 65536 chance that the Identification fields will be identical too; 
if this occurs the IPv4 stack may incorrectly reassemble the packets. 
This subject has been discussed extensively in relation to 6RD.
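
To put a number on the collision risk described above, here is a minimal sketch. It assumes the 16-bit IPv4 Identification values chosen by the two relays are independent and uniformly random, which real stacks do not guarantee; the function name and the packet counts are purely illustrative.

```python
ID_SPACE = 2 ** 16  # the IPv4 Identification field is 16 bits wide


def collision_probability(other_packets: int) -> float:
    """Chance that one fragmented packet shares its Identification value
    with at least one of `other_packets` packets that another relay sends
    from the same anycast address to the same client.

    Assumption: IDs are uniform and independent across relays.
    """
    return 1.0 - (1.0 - 1.0 / ID_SPACE) ** other_packets


# One competing packet: the "1 in 65536" case from the message above.
print(collision_probability(1))
# Many competing packets inside the reassembly window raise the odds.
print(collision_probability(1000))
```

The second call only illustrates that the risk grows with the number of packets that are awaiting reassembly at the same time; the figure 1000 is not taken from the thread.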

It would be reasonable for the client to honor any MTU option in the 
router advertisement, so that operators of local service profile relays 
may increase the MTU.

Hello Timothy,

Welcome :)

> > I don't see a benefit in deviating from the IPv6 protocol here and would say a minimum MTU of 1280 is required. If the MTU is too large, let the v6 layer handle the problem.
> 
> That could lead to a relay reassembling IPv4 packets before forwarding 
> them, the IPv6 MTU should be limited to IPv4 MTU-overhead.

Packet fragmentation at the IPv4 layer is always possible.  IPv4
was dimensioned for packets up to 576 bytes (after reassembly) as
per RFC 791 and may quietly drop larger ones.  This deviates from
current practice, so it is a bit of a theoretical problem.

What this means however, is that an MTU in excess of 576 is never
safe to assume; IPv6 on the other hand, requires a minimum MTU of
1280.  Combining these spells disaster for tunnels -- in theory.

In practice, this should not be much of a problem.  Virtually all of the
current Internet is run over MTU 1500 networks, so fragmentation
at the IPv4 level would be highly exceptional.

Fragmentation can occur within IPv6, even after path discovery.
This is the result of routing dynamicity.  It is a risk that every
IPv6 layer should take into account.  But it is also true that
6bed4 increases that dynamicity.

> The client 
> MUST perform IPv4 MTU discovery reliably, this is difficult given some 
> NAT routers don't issue ICMP Fragmentation Needed errors, but instead 
> adjust the TCP MSS.

You are saying that the 6bed4 tunnel would have to perform IPv4 MTU
discovery if it were to support higher MTU's for IPv6, right?  Are
you suggesting that 6bed4 needs to do more than merely pass the
MTU discovery attempts from a using IPv6 layer?

> The route from the relay to client is more difficult, IPv4 MTU discovery 
> by the relay is impossible, and there is a risk that packets from 
> different relays will have identical identification fields and be mixed up 
> on reassembly.  Perhaps the client could respond to IPv4 fragments with 
> IPv6 Too Big.

I'm afraid that IPv4-level fragmentation, which can arise as soon as a
path element has an MTU below 1500, is a fact of life that cannot be
avoided.  I don't think this problem can be solved in the tunnel.  It
is one of the ways of saying "native IPv6 is better".

Cheers,
 -Rick

On 30/11/11 10:03, Rick van Rein wrote:
> Hello Timothy,
>
> Welcome :)
>
>>> I don't see a benefit in deviating from the IPv6 protocol here and would say a minimum MTU of 1280 is required. If the MTU is too large, let the v6 layer handle the problem.
>> That could lead to a relay reassembling IPv4 packets before forwarding
>> them, the IPv6 MTU should be limited to IPv4 MTU-overhead.
> Packet fragmentation at the IPv4 layer is always possible.  IPv4
> was dimensioned for packets up to 576 bytes (after reassembly) as
> per RFC 791 and may quietly drop larger ones.  This deviates from
> current practice, so it is a bit of a theoretical problem.
>
> What this means however, is that an MTU in excess of 576 is never
> safe to assume; IPv6 on the other hand, requires a minimum MTU of
> 1280.  Combining these spells disaster for tunnels -- in theory.
>
> In practice, this should not be much of a problem.  Virtually all of the
> current Internet is run over MTU 1500 networks, so fragmentation
> at the IPv4 level would be highly exceptional.
>
> Fragmentation can occur within IPv6, even after path discovery.
> This is the result of routing dynamicity.  It is a risk that every
> IPv6 layer should take into account.  But it is also true that
> 6bed4 increases that dynamicity.
>
>> The client
>> MUST perform IPv4 MTU discovery reliably, this is difficult given some
>> NAT routers don't issue ICMP Fragmentation Needed errors, but instead
>> adjust the TCP MSS.
> You are saying that the 6bed4 tunnel would have to perform IPv4 MTU
> discovery if it were to support higher MTU's for IPv6, right?  Are
> you suggesting that 6bed4 needs to do more than merely pass the
> MTU discovery attempts from a using IPv6 layer?

If IPv4 MTU discovery does not happen, it will not pass the information 
on to IPv6. It is impossible to translate IPv4 Fragmentation Needed to 
IPv6 reliably. The only alternative to IPv4 MTU discovery is to set DF=0 
and let the routers fragment as required.

Now consider that it is common to find hosts connected to an Ethernet LAN 
with a 1500 byte MTU, but after the local (NAT) router the connection is 
PPP over Ethernet, or some other link with a smaller MTU. If the 6bed4 
clients choose an MTU of 1472 to just fit the Ethernet MTU, this will 
result in lots of IPv4 fragments which the relay will need to 
reassemble. This is far from the envisioned stateless relay service, and 
will require far more costly hardware.

Therefore either a small MTU should be used, or IPv4 path MTU detection. 
This MTU could be larger than the IPv6 minimum.
> I'm afraid that IPv4-level fragmentation, which can arise as soon as a
> path element has an MTU below 1500, is a fact of life that cannot be
> avoided.  I don't think this problem can be solved in the tunnel.  It
> is one of the ways of saying "native IPv6 is better".
It can be avoided by ensuring the IPv6 MTU is small enough, as per the 
draft; for 6bed4 it is only unavoidable if the IPv4 path MTU is less 
than 1308.  That is most unlikely.
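
The figures 1472 and 1308 used in this message follow from the IPv4+UDP encapsulation overhead. A small sketch of that arithmetic, assuming a 20-byte IPv4 header without options and an 8-byte UDP header; the function names are illustrative, and the 1492-byte value is the usual PPPoE MTU, used here only as an example:

```python
IPV4_HEADER = 20     # bytes, IPv4 header without options (assumption)
UDP_HEADER = 8       # bytes
IPV6_MIN_MTU = 1280  # minimum link MTU that IPv6 requires

TUNNEL_OVERHEAD = IPV4_HEADER + UDP_HEADER  # 28 bytes per tunnelled packet


def tunnel_ipv6_mtu(ipv4_path_mtu: int) -> int:
    """Largest IPv6 packet that fits in one unfragmented IPv4 packet."""
    return ipv4_path_mtu - TUNNEL_OVERHEAD


def needs_ipv4_fragmentation(ipv6_packet_len: int, ipv4_path_mtu: int) -> bool:
    """True if the encapsulated packet exceeds the IPv4 path MTU."""
    return ipv6_packet_len + TUNNEL_OVERHEAD > ipv4_path_mtu


print(tunnel_ipv6_mtu(1500))                 # 1472: just fits Ethernet
print(IPV6_MIN_MTU + TUNNEL_OVERHEAD)        # 1308: IPv4 path MTU needed for 1280
print(needs_ipv4_fragmentation(1472, 1492))  # True: PPPoE-sized path
print(needs_ipv4_fragmentation(1280, 1492))  # False
```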

Hello Tim,

Thanks for getting into this!  Fragmentation is something that
I don't excel at, so this is really useful input.  I'm trying
to summarise the situation, and conclude in A to E below what
seems to be the best approach.  Do you agree?

As I understand from IPv4:
 1. IPv4 permits silently dropping packets >576 bytes during fragment reassembly
 2. IPv4 path MTU discovery uses the DF-bit to avoid fragmentation
 3. If the MTU of a leg along the path is exceeded, an ICMPv4 error is sent back
	(type 3, code 4)
 4. Routing is dynamic, so in principle this can occur at any time

Furthermore, for IPv6:
 5. IPv6 packets can only be fragmented and reassembled in end-points
 6. An IPv6 packet that won't fit an MTU results in ICMPv6 (type 2, code 0)
 7. Routing is dynamic, so in principle this can occur at any time

From this, I'm tempted to conclude:

A. Due to 1. tunnel packets could silently but consistently disappear
_unless_ DF is set.  Since that is unacceptable, the IPv4 level MUST
set the DF bit.

B. With DF set, an ICMPv4 message informs the sender if an MTU along the path
is being exceeded.  This ICMPv4 message (type 3, code 4) could be translated
to the corresponding ICMPv6 message (type 2, code 0) to inform the
IPv6 layer about fragmentation requirements.  (I hope there will be enough
data available in the ICMPv4 message to make that possible, by the way.)

C. This means that the IPv4 layer never does any packet reassembly at all,
which is rather desirable for the packets to flow through quickly and
statelessly.  In addition, 6bed4 can process fragmentation issues in
a stateless manner, namely by direct translation of ICMP messages.

D. From this, it would follow that an MTU in excess of 1280 is possible,
as long as the IPv6 client is aware of the risks involved, of needing to
fragment the packet.  In other words, 1280 is not a "MUST" for the MTU of
6bed4, but rather a "SHOULD" with an explanation/warning against higher
MTU values.

E. A remaining problem is that the minimum MTU for IPv6 is 1280 and IPv4
does not guarantee that, let alone in a tunnel, so packets of 1280 or
even less might lead to ICMPv6 type 2, code 0 messages.  That is a
nuisance, except that it won't happen in practice as the Internet has
support for near-1500 packet sizes.

F. An alternative would be to set DF only on packets with an IPv6 MTU
in excess of 1280, and to rely on packet fragmentation to help a few
more IPv6 packets through IPv4 than would be possible with DF set.
I'd rather not: it adds complexity; it is ugly; it is not a complete
solution; an IPv6 application may well be able to scale down its
packets to an MTU below 1280 and achieve a much greater chance of
success.

Do you agree with A to E?  How do you feel about F?

Thanks,
 -Rick

On 05/12/11 23:26, Rick van Rein wrote:
> Hello Tim,
>
> Thanks for getting into this!  Fragmentation is something that
> I don't excel at, so this is really useful input.  I'm trying
> to summarise the situation, and conclude in A to E below what
> seems to be the best approach.  Do you agree?
>
> As I understand from IPv4:
>   1. IPv4 permits silently dropping packets >576 bytes during fragment reassembly
Yes, but one will not be using an IPv6 tunnel on such nodes, all major 
operating systems support larger packets.
>   2. IPv4 path MTU discovery uses the DF-bit to avoid fragmentation
Yes
>   3. If the MTU of a leg along the path is exceeded, an ICMPv4 error is sent back
> 	(type 3, code 4)
It should happen, the standard says "MUST", but it often does not. How 
often?
>   4. Routing is dynamic, so in principle this can occur at any time
Yes, however it is common for the small MTU to occur over statically 
routed links to single-homed customers.
> Furthermore, for IPv6:
>   5. IPv6 packets can only be fragmented and reassembled in end-points
Yes
>   6. An IPv6 packet that won't fit an MTU results in ICMPv6 (type 2, code 0)
Yes
>   7. Routing is dynamic, so in principle this can occur at any time
>
>  From this, I'm tempted to conclude:
>
> A. Due to 1. tunnel packets could silently but consistently disappear
> _unless_ DF is set.  Since that is unacceptable, the IPv4 level MUST
> set the DF bit.
No, unfortunately the opposite is the case.
> B. With DF set, an ICMPv4 message informs the sender if an MTU along the path
> is being exceeded.  This ICMPv4 message (type 3, code 4) could be translated
> to the corresponding ICMPv6 message (type 2, code 0) to inform the
> IPv6 layer about fragmentation requirements.  (I hope there will be enough
> data available in the ICMPv4 message to make that possible, by the way.)
It's usually not translatable. RFC 792 requires only that 64 bits of the IP 
payload be included, and that is just the UDP header. Whilst RFC 1812 
states that as much as possible SHOULD be included, the reality is that 
just 8 bytes is common.
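
The point about the quoted payload can be made concrete with a short sketch. It assumes a 20-byte IPv4 header without options, an 8-byte UDP header and a 40-byte IPv6 header, and it assumes that a translator needs at least the full inner IPv6 header (for the source address) to address an ICMPv6 Too Big message; the function names are illustrative.

```python
IPV4_HEADER = 20  # bytes, no IPv4 options (assumption)
UDP_HEADER = 8    # bytes
IPV6_HEADER = 40  # bytes, fixed IPv6 header


def quoted_ipv6_bytes(quoted_len: int) -> int:
    """Bytes of the inner IPv6 packet present in the quoted part of an
    ICMPv4 error, where `quoted_len` counts the original IPv4 header
    plus whatever payload bytes the router copied."""
    return max(0, quoted_len - IPV4_HEADER - UDP_HEADER)


def can_translate(quoted_len: int) -> bool:
    """True if the whole inner IPv6 header was quoted, so the original
    IPv6 source is known (assumed minimum for building ICMPv6 Too Big)."""
    return quoted_ipv6_bytes(quoted_len) >= IPV6_HEADER


# IPv4 header + 64 bits of payload: only the UDP header is quoted.
print(can_translate(IPV4_HEADER + 8))                         # False
# A router that quotes generously enough to cover the IPv6 header.
print(can_translate(IPV4_HEADER + UDP_HEADER + IPV6_HEADER))  # True
```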

> C. This means that the IPv4 layer never does any packet reassembly at all,
No.
> which is rather desirable for the packets to flow through quickly and
> statelessly.
It's also desirable as IPv4 fragment reassembly is unreliable with 
multiple anycast sources, the beginning of packet A could be attached to 
the end of packet B.
> In addition, 6bed4 can process fragmentation issues in
> a stateless manner, namely by direct translation of ICMP messages.
No, see above.
> D. From this, it would follow that an MTU in excess of 1280 is possible,
> as long as the IPv6 client is aware of the risks involved, of needing to
> fragment the packet.
Backwards: setting a small IPv6 MTU increases the risk of IPv6 
fragmentation being needed, but reduces IPv4 fragmentation. Reducing the 
MTU at endpoints avoids the risk of misconfigured firewalls dropping 
ICMPv6 Too Big, as TCP will never attempt to send a packet larger than 
the MTU at either end; reducing the MTU at a router increases the risk 
of problems. A typical TCP implementation never sends fragmented packets.

Did you mean the risk of having to retransmit?
> In other words, 1280 is not a "MUST" for the MTU of
> 6bed4, but rather a "SHOULD" with an explanation/warning against higher
> MTU values.
>
> E. A remaining problem is that the minimum MTU for IPv6 is 1280 and IPv4
> does not guarantee that, let alone in a tunnel, so packets of 1280 or
> even less might lead to ICMPv6 type 2, code 0 messages.  That is a
> nuisance, except that it won't happen in practice as the Internet has
> support for near-1500 packet sizes.
In that case we just let IPv4 fragmentation happen.
> F. An alternative would be to set DF only on packets with an IPv6 MTU
> in excess of 1280, and to rely on packet fragmentation to help a few
> more IPv6 packets through IPv4 than would be possible with DF set.
> I'd rather not: it adds complexity; it is ugly; it is not a complete
> solution; an IPv6 application may well be able to scale down its
> packets to an MTU below 1280 and achieve a much greater chance of
> success.
An ICMPv6 Too Big packet will not reliably result in packets smaller than 
1280 bytes, and it certainly will not cause IPv6 layer fragmentation to 
smaller packets; it only results in a fragment header being prepended so 
that the packet can be fragmented after an IPv6 to IPv4 translator.

RFC 1981 states:

    A node MUST NOT reduce its estimate of the Path MTU below the IPv6
    minimum link MTU.

RFC 2460 states:

    IPv6 requires that every link in the internet have an MTU of 1280
    octets or greater.  On any link that cannot convey a 1280-octet
    packet in one piece, link-specific fragmentation and reassembly must
    be provided at a layer below IPv6.

I suggest that the client MUST either honor the MTU specified in the 
router advertisement, which SHOULD be low enough to avoid excessive IPv4 
fragmentation, or use IPv4 path MTU discovery, in which case it SHOULD 
take precautions against IPv4 PMTU black holes. The relay MAY respond to 
IPv4 fragmentation by sending ICMPv6 Too Big or a router advertisement 
with a lower MTU.

Relays SHOULD use an IPv6 MTU of 1280 for sending encapsulated packets 
unless they perform IPv4 path MTU discovery (which will often/usually 
be impossible). Otherwise, using larger packets will increase the risk 
of packet fragmentation and mis-reassembly due to identification field 
collision.

I do not think more complicated schemes are worthwhile to decrease the 
overhead from 7.91% to 7.13% (on Ethernet), compared to 5.32% for native 
IPv6 and 4.02% for IPv4.

