Path MTU Discovery for IP version 6
draft-ietf-6man-rfc1981bis-00
The information below is for an old version of the document.
Document type: This is an older version of an Internet-Draft that was ultimately published as RFC 8201.
Authors: Jack McCann, Dr. Steve E. Deering, Jeffrey Mogul, Bob Hinden
Last updated: 2016-03-16 (latest revision 2016-03-03)
RFC stream: Internet Engineering Task Force (IETF)
Reviews:
- GENART Telechat review (of -06) by Stewart Bryant: Ready w/issues
- GENART Last Call review (of -04) by Stewart Bryant: Almost ready
- INTDIR Early review (of -03) by Donald Eastlake: Ready w/nits
WG state: WG Document
Document shepherd: (None)
IESG state: Became RFC 8201 (Internet Standard)
Consensus boilerplate: Unknown
Telechat date: (None)
Responsible AD: (None)
Send notices to: (None)
"reserved" and is older than the timeout interval:

- The PMTU estimate is set to the MTU of the first hop link.

- The timestamp is set to the "reserved" value.

- Packetization layers using this path are notified of the increase.

5.4. TCP layer actions

The TCP layer must track the PMTU for the path(s) in use by a connection; it should not send segments that would result in packets larger than the PMTU. A simple implementation could ask the IP layer for this value each time it created a new segment, but this could be inefficient. Moreover, TCP implementations that follow the "slow-start" congestion-avoidance algorithm [CONG] typically calculate and cache several other values derived from the PMTU. It may be simpler to receive asynchronous notification when the PMTU changes, so that these variables may be updated.

McCann, et al. Expires September 4, 2016 [Page 10]
Internet-Draft IPv6 Path MTU Disc March 2016

A TCP implementation must also store the MSS value received from its peer, and must not send any segment larger than this MSS, regardless of the PMTU. In 4.xBSD-derived implementations, this may require adding an additional field to the TCP state record.

The value sent in the TCP MSS option is independent of the PMTU. This MSS option value is used by the other end of the connection, which may be using an unrelated PMTU value. See [I-D.ietf-6man-rfc2460bis] sections "Packet Size Issues" and "Maximum Upper-Layer Payload Size" for information on selecting a value for the TCP MSS option.

When a Packet Too Big message is received, it implies that a packet was dropped by the node that sent the ICMP message. It is sufficient to treat this as any other dropped segment, and wait until the retransmission timer expires to cause retransmission of the segment. If the Path MTU Discovery process requires several steps to find the PMTU of the full path, this could delay the connection by many round-trip times.

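As an illustration of the bookkeeping described in this section, the following Python sketch derives the largest TCP payload from the cached PMTU and the peer's MSS, and refreshes that cached value when an asynchronous PMTU notification arrives. It is a minimal sketch, not a definitive implementation: the names max_segment_payload, Connection, and on_pmtu_change are hypothetical, and the header sizes assume the fixed 40-octet IPv6 header and a 20-octet TCP header with no extension headers or TCP options.

```python
IPV6_HEADER_LEN = 40  # fixed IPv6 header in octets (assumes no extension headers)
TCP_HEADER_LEN = 20   # TCP header in octets (assumes no TCP options)


def max_segment_payload(pmtu: int, peer_mss: int) -> int:
    """Largest TCP payload that respects both the PMTU and the peer's MSS."""
    pmtu_limit = pmtu - IPV6_HEADER_LEN - TCP_HEADER_LEN
    # The peer's MSS caps the segment size regardless of the PMTU.
    return max(0, min(pmtu_limit, peer_mss))


class Connection:
    """Hypothetical per-connection state holding the values cached from the PMTU."""

    def __init__(self, peer_mss: int, pmtu: int) -> None:
        self.peer_mss = peer_mss  # MSS received from the peer, stored separately
        self.send_size = max_segment_payload(pmtu, peer_mss)

    def on_pmtu_change(self, new_pmtu: int) -> None:
        # Called by the (hypothetical) IP layer when the PMTU changes,
        # instead of querying the IP layer for every new segment.
        self.send_size = max_segment_payload(new_pmtu, self.peer_mss)
```

Under these assumptions, a connection whose peer advertised an MSS of 1440 would send 1440-octet payloads on a 1500-octet path and 1220-octet payloads once the PMTU drops to 1280.
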
Alternatively, the retransmission could be done in immediate response to a notification that the Path MTU has changed, but only for the specific connection specified by the Packet Too Big message. The packet size used in the retransmission should be no larger than the new PMTU.

Note: A packetization layer must not retransmit in response to every Packet Too Big message, since a burst of several oversized segments will give rise to several such messages and hence several retransmissions of the same data. If the new estimated PMTU is still wrong, the process repeats, and there is an exponential growth in the number of superfluous segments sent.

This means that the TCP layer must be able to recognize when a Packet Too Big notification actually decreases the PMTU that it has already used to send a packet on the given connection, and should ignore any other notifications.

Many TCP implementations incorporate "congestion avoidance" and "slow-start" algorithms to improve performance [CONG]. Unlike a retransmission caused by a TCP retransmission timeout, a retransmission caused by a Packet Too Big message should not change the congestion window. It should, however, trigger the slow-start mechanism (i.e., only one segment should be retransmitted until acknowledgements begin to arrive again).

TCP performance can be reduced if the sender's maximum window size is not an exact multiple of the segment size in use (this is not the congestion window size, which is always a multiple of the segment size). In many systems (such as those derived from 4.2BSD), the segment size is often set to 1024 octets, and the maximum window size (the "send space") is usually a multiple of 1024 octets, so the proper relationship holds by default.
If Path MTU Discovery is used, however, the segment size may not be a submultiple of the send space, and it may change during a connection; this means that the TCP layer may need to change the transmission window size when Path MTU Discovery changes the PMTU value. The maximum window size should be set to the greatest multiple of the segment size that is less than or equal to the sender's buffer space size.

5.5. Issues for other transport protocols

Some transport protocols (such as ISO TP4 [ISOTP]) are not allowed to repacketize when doing a retransmission. That is, once an attempt is made to transmit a segment of a certain size, the transport cannot split the contents of the segment into smaller segments for retransmission. In such a case, the original segment can be fragmented by the IP layer during retransmission. Subsequent segments, when transmitted for the first time, should be no larger than allowed by the Path MTU.

The Sun Network File System (NFS) uses a Remote Procedure Call (RPC) protocol [RPC] that, when used over UDP, in many cases will generate payloads that must be fragmented even for the first-hop link. This might improve performance in certain cases, but it is known to cause reliability and performance problems, especially when the client and server are separated by routers.

It is recommended that NFS implementations use Path MTU Discovery whenever routers are involved. Most NFS implementations allow the RPC datagram size to be changed at mount-time (indirectly, by changing the effective file system block size), but might require some modification to support changes later on.

Also, since a single NFS operation cannot be split across several UDP datagrams, certain operations (primarily, those operating on file names and directories) require a minimum payload size that if sent in a single packet would exceed the PMTU. NFS implementations should not reduce the payload size below this threshold, even if Path MTU Discovery suggests a lower value. In this case the payload will be fragmented by the IP layer.

5.6. Management interface

It is suggested that an implementation provide a way for a system utility program to:

- Specify that Path MTU Discovery not be done on a given path.

- Change the PMTU value associated with a given path.

The former can be accomplished by associating a flag with the path; when a packet is sent on a path with this flag set, the IP layer does not send packets larger than the IPv6 minimum link MTU.

These features might be used to work around an anomalous situation, or by a routing protocol implementation that is able to obtain Path MTU values.

The implementation should also provide a way to change the timeout period for aging stale PMTU information.

6. Security Considerations

This Path MTU Discovery mechanism makes possible two denial-of-service attacks, both based on a malicious party sending false Packet Too Big messages to a node.

In the first attack, the false message indicates a PMTU much smaller than reality. This should not entirely stop data flow, since the victim node should never set its PMTU estimate below the IPv6 minimum link MTU. It will, however, result in suboptimal performance.

In the second attack, the false message indicates a PMTU larger than reality. If believed, this could cause temporary blockage as the victim sends packets that will be dropped by some router. Within one round-trip time, the node would discover its mistake (receiving Packet Too Big messages from that router), but frequent repetition of this attack could cause lots of packets to be dropped. A node, however, should never raise its estimate of the PMTU based on a Packet Too Big message, so should not be vulnerable to this attack.

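The two safeguards described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name apply_packet_too_big is hypothetical, the value 1280 for the IPv6 minimum link MTU is taken from the IPv6 specification rather than from this section, and discarding a message that reports an MTU below that minimum follows the change listed in Appendix B.

```python
IPV6_MIN_LINK_MTU = 1280  # octets; value from the IPv6 specification (assumption here)


def apply_packet_too_big(current_pmtu: int, reported_mtu: int) -> int:
    """Return the PMTU estimate to use after a Packet Too Big message."""
    if reported_mtu < IPV6_MIN_LINK_MTU:
        # First attack: never let the estimate fall below the minimum link MTU.
        return current_pmtu
    if reported_mtu >= current_pmtu:
        # Second attack: a Packet Too Big message must never raise the estimate.
        return current_pmtu
    # A legitimate decrease: adopt the smaller value.
    return reported_mtu
```

With a current estimate of 1500, a report of 1400 lowers the estimate to 1400, while reports of 9000 or 576 leave it unchanged.
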
A malicious party could also cause problems if it could stop a victim from receiving legitimate Packet Too Big messages, but in this case there are simpler denial-of-service attacks available.

7. Acknowledgements

We would like to acknowledge the authors of and contributors to [RFC1191], from which the majority of this document was derived. We would also like to acknowledge the members of the IPng working group for their careful review and constructive criticisms.

8. IANA Considerations

This document does not have any IANA actions.

9. References

9.1. Normative References

[I-D.ietf-6man-rfc2460bis] Deering, S. and B. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", draft-ietf-6man-rfc2460bis-03 (work in progress), January 2016.

[ICMPv6] Conta, A., Deering, S., and M. Gupta, Ed., "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", RFC 4443, DOI 10.17487/RFC4443, March 2006, <http://www.rfc-editor.org/info/rfc4443>.

9.2. Informative References

[CONG] Jacobson, V., "Congestion Avoidance and Control", Proc. SIGCOMM '88 Symposium on Communications Architectures and Protocols, August 1988.

[FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", In Proc. SIGCOMM '87 Workshop on Frontiers in Computer Communications Technology, August 1987.

[ISOTP] "ISO Transport Protocol specification ISO DP 8073", RFC 905, DOI 10.17487/RFC0905, April 1984, <http://www.rfc-editor.org/info/rfc905>.

[ND] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, DOI 10.17487/RFC4861, September 2007, <http://www.rfc-editor.org/info/rfc4861>.

[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, DOI 10.17487/RFC1191, November 1990, <http://www.rfc-editor.org/info/rfc1191>.

[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, <http://www.rfc-editor.org/info/rfc4821>.

[RPC] Sun Microsystems, "RPC: Remote Procedure Call Protocol specification: Version 2", RFC 1057, DOI 10.17487/RFC1057, June 1988, <http://www.rfc-editor.org/info/rfc1057>.

Appendix A. Comparison to RFC 1191

This document is based in large part on RFC 1191, which describes Path MTU Discovery for IPv4. Certain portions of RFC 1191 were not needed in this document:

router specification    Packet Too Big messages and corresponding router behavior are defined in [ICMPv6]

Don't Fragment bit      there is no DF bit in IPv6 packets

TCP MSS discussion      selecting a value to send in the TCP MSS option is discussed in [I-D.ietf-6man-rfc2460bis]

old-style messages      all Packet Too Big messages report the MTU of the constricting link

MTU plateau tables      not needed because there are no old-style messages

Appendix B. Changes Since RFC 1981

This document has the following changes from RFC1981. Numbers identify the Internet-Draft version in which the change was made.

Working Group Internet Drafts

00) Added text to discard an ICMP Packet Too Big message containing an MTU less than the IPv6 minimum link MTU.

00) Revision of text regarding RFC4821.

00) Added R. Hinden as Editor to facilitate ID submission.

00) Editorial changes.

Individual Internet Drafts

01) Remove Note about a Packet Too Big message reporting a next-hop MTU that is less than the IPv6 minimum link MTU. This was removed from [I-D.ietf-6man-rfc2460bis].

01) Include a link to RFC4821 along with a short summary of what it does.

01) Assigned references to informative and normative.

01) Editorial changes.

00) Establish a baseline from RFC1981.
The only intended changes are formatting (XML is slightly different from .nroff), differences between an RFC and Internet Draft, fixing a few ID Nits, updating references, and updates to the authors' information. There should not be any content changes to the specification.

Authors' Addresses

Jack McCann
Digital Equipment Corporation

Stephen E. Deering
Retired
Vancouver, British Columbia
Canada

Jeffrey Mogul
Digital Equipment Corporation

Robert M. Hinden (editor)
Check Point Software
959 Skyway Road
San Carlos, CA 94070
USA

Email: bob.hinden@gmail.com