RFC 2544 Applicability Statement: Use on Production Networks Considered Harmful
draft-ietf-bmwg-2544-as-02
The information below is for an old version of the document.
Document | Type | This is an older version of an Internet-Draft that was ultimately published as RFC 6815. | |
---|---|---|---|
Authors | Scott O. Bradner , Kevin Dubray , Jim McQuaid , Al Morton | ||
Last updated | 2012-04-25 (Latest revision 2012-03-12) | ||
Replaces | draft-chairs-bmwg-2544-as | ||
RFC stream | Internet Engineering Task Force (IETF) | ||
Additional resources | Mailing list discussion | ||
Stream | WG state | WG Document | |
Document shepherd | (None) | ||
IESG | IESG state | Became RFC 6815 (Informational) | |
Consensus boilerplate | Unknown | ||
Telechat date | (None) | ||
Responsible AD | Ron Bonica | ||
Send notices to | bmwg-chairs@tools.ietf.org, draft-ietf-bmwg-2544-as@tools.ietf.org, wcerveny@wjcerveny.com |
Network Working Group                                         S. Bradner
Internet-Draft                                        Harvard University
Intended status: Informational                                 K. Dubray
Expires: September 13, 2012                             Juniper Networks
                                                              J. McQuaid
                                                            Turnip Video
                                                               A. Morton
                                                               AT&T Labs
                                                          March 12, 2012

          RFC 2544 Applicability Statement: Use on Production
                      Networks Considered Harmful
                       draft-ietf-bmwg-2544-as-02

Abstract

   The Benchmarking Methodology Working Group (BMWG) has been developing
   key performance metrics and laboratory test methods since 1990, and
   continues this work at present.  Recent application of the methods
   beyond their intended scope is cause for concern.  This memo
   clarifies the scope of RFC 2544 and other benchmarking work for the
   IETF community.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 13, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Bradner, et al.        Expires September 13, 2012               [Page 1]
Internet-Draft                 RFC 2544 AS                    March 2012

   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Scope and Goals
   3.  The Concept of an Isolated Test Environment
   4.  Why RFC 2544 Methods are intended for ITE
     4.1.  Experimental Control, Repeatability, and Accuracy
     4.2.  Containment of Implementation Failure Impact
   5.  Advisory on RFC 2544 Methods in Real-world Networks
   6.  What to do without RFC 2544?
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   This memo clarifies the scope of RFC 2544 [RFC2544], and other
   benchmarking work for the IETF community.

   Benchmarking Methodologies (beginning with [RFC2544]) have always
   relied on test conditions that can only be produced and replicated
   reliably in the laboratory.  Thus it was surprising to find that this
   foundation methodology was being cited in several unintended
   applications, such as:

   1.  Validation of telecommunication service configuration, such as
       the Committed Information Rate (CIR).

   2.  Validation of performance metrics in a telecommunication Service
       Level Agreement (SLA), such as frame loss and latency.

   3.  As an integral part of telecommunication service activation
       testing, where traffic that shares network resources with the
       test might be adversely affected.

   Above, we distinguish "telecommunication service" (where a network
   service provider contracts with a customer to transfer information
   between specified interfaces at different geographic locations) from
   the generic term "service".  Also, we use the adjective "production"
   to refer to networks carrying live user traffic.  [RFC2544] used the
   term "real-world" to refer to production networks and to
   differentiate them from test networks.

   Although RFC 2544 is held up as the standard reference for such
   testing, we believe that the actual methods used vary from RFC 2544
   in significant ways.  Since the only citation is to RFC 2544, the
   modifications are opaque to the standards community and to users in
   general (an undesirable situation).

   To directly address this situation, the past and present Chairs of
   the IETF Benchmarking Methodology Working Group (BMWG) have prepared
   this Applicability Statement for RFC 2544.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Scope and Goals

   This memo clarifies the scope of [RFC2544], with the goal of
   providing guidance to the community on its applicability, which is
   limited to laboratory testing.

3.  The Concept of an Isolated Test Environment

   An Isolated Test Environment (ITE) used with [RFC2544] methods (as
   illustrated in Figures 1 through 3 of [RFC2544]) has the ability to:

   o  contain the test streams to paths within the desired set-up

   o  prevent non-test traffic from traversing the test set-up

   These features allow unfettered experimentation, while at the same
   time protecting equipment management LANs and other production
   networks from the unwanted effects of the test traffic.

4.  Why RFC 2544 Methods are intended for ITE

   The following sections discuss some of the reasons why RFC 2544
   [RFC2544] methods were intended only for isolated laboratory use, and
   the difficulties of applying these methods outside the lab
   environment.

4.1.  Experimental Control, Repeatability, and Accuracy

   All of the tests described in RFC 2544 assume that the tester and
   device under test are the only devices on the networks that are
   transmitting data.  The presence of other unwanted traffic on the
   network would mean that the specified test conditions have not been
   achieved.

   Assuming that the unwanted traffic appears in variable amounts over
   time, the repeatability of any test result will likely depend to some
   degree on the unwanted traffic.

   The presence of unwanted or unknown traffic makes accurate,
   repeatable, and consistent measurements of the performance of the
   device under test very unlikely, since the actual test conditions
   will not be reported.

   For example, the RFC 2544 Throughput Test attempts to characterize a
   maximum reliable load, thus there will be testing above the maximum
   that causes packet/frame loss.  Any other sources of traffic on the
   network will cause packet loss to occur at a tester data rate lower
   than the rate that would be achieved without the extra traffic.

4.2.  Containment of Implementation Failure Impact

   RFC 2544 methods, specifically to determine Throughput as defined in
   [RFC1242] and other benchmarks, may overload the resources of the
   device under test, and may cause failure modes in the device under
   test.  Since failures can become the root cause of more widespread
   failure, it is clearly desirable to contain all test traffic within
   the ITE.

   In addition, such testing can have a negative effect on any traffic
   that shares resources with the test stream(s) since, in most cases,
   the traffic load will be close to the capacity of the network links.

   Appendix C.2.2 of [RFC2544] (as adjusted by errata) gives the private
   IPv4 address range for testing:

      "...The network addresses 198.18.0.0 through 198.19.255.255 have
      been assigned to the BMWG by the IANA for this purpose.  This
      assignment was made to minimize the chance of conflict in case a
      testing device were to be accidentally connected to part of the
      Internet.  The specific use of the addresses is detailed below."

   In other words, devices operating on the Internet may be configured
   to discard any traffic they observe in this address range, as it is
   intended for laboratory ITE use only.  Thus, testers using the
   assigned testing address ranges MUST NOT be connected to the
   Internet.

   We note that a range of IPv6 addresses has been assigned to BMWG for
   laboratory test purposes, in [RFC5180].
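
   As an illustration only, the discard check that a device might apply
   to the benchmarking ranges could be sketched as below.  The IPv4
   block is the range quoted above, written as the prefix 198.18.0.0/15.
   The IPv6 prefix 2001:2::/48 is an assumption taken from the IANA
   special-purpose registry entry for RFC 5180 as amended by errata;
   this memo cites the RFC but does not state the prefix.  The function
   name is hypothetical.

```python
import ipaddress

# IPv4 block quoted from RFC 2544 Appendix C.2.2:
# 198.18.0.0 through 198.19.255.255, i.e. 198.18.0.0/15.
BENCHMARK_V4 = ipaddress.ip_network("198.18.0.0/15")

# Assumption: IPv6 benchmarking prefix associated with RFC 5180
# (as amended by errata); not stated in this memo itself.
BENCHMARK_V6 = ipaddress.ip_network("2001:2::/48")


def is_benchmark_address(addr: str) -> bool:
    """Return True if addr lies inside a benchmarking-only range."""
    ip = ipaddress.ip_address(addr)
    # Pick the network that matches the address family before testing.
    net = BENCHMARK_V4 if ip.version == 4 else BENCHMARK_V6
    return ip in net
```

   With this sketch, is_benchmark_address("198.18.0.1") is true, while
   an address just outside the block, such as 198.20.0.0, is not.  A
   production device could use a check of this kind to drop stray test
   traffic; it is not a substitute for keeping the tester disconnected
   from the Internet.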
   The strong statements in the Security Considerations section of this
   memo also make the scope even clearer; this is now a standard fixture
   of all BMWG memos.

5.  Advisory on RFC 2544 Methods in Real-world Networks

   The tests in [RFC2544] were designed to measure the performance of
   network devices, not of networks, and certainly not production
   networks carrying user traffic on shared resources.  There will be
   unanticipated difficulties when applying these methods outside the
   lab environment.

   Operating test equipment on production networks according to the
   methods described in [RFC2544], where overload is a possible outcome,
   would no doubt be harmful to user traffic performance.  These tests
   MUST NOT be used on production networks and, as discussed above, the
   tests will never produce a reliable or accurate benchmarking result
   on a production network.

   [RFC2544] methods have never been validated on a network path, even
   when that path is not part of a production network and carries no
   other traffic.  It is unknown whether the tests can be used to
   measure valid and reliable performance of a multi-device,
   multi-network path.  It is possible that some of the tests may prove
   valid in some path scenarios, but that work has not been done or has
   not been shared with the IETF community.  Thus, such testing is
   contraindicated by the BMWG.

6.  What to do without RFC 2544?

   The IETF has addressed the problem of production network performance
   measurement by chartering a different working group: IP Performance
   Metrics (IPPM).  This working group has developed a set of standard
   metrics to assess the quality, performance, and reliability of
   Internet packet transfer services.  These metrics can be measured by
   network operators, end users, or independent testing groups.
   We note that some IPPM metrics differ from RFC 2544 metrics with
   similar names, and there is likely to be confusion if the details are
   ignored.

   IPPM has not yet standardized methods for raw capacity measurement of
   Internet paths.  Such testing needs to adequately consider the strong
   possibility for degradation to any other traffic that may be present
   due to congestion.  There are no specific methods proposed for
   activation of a packet transfer service in IPPM.

   Other standards may help to fill gaps in telecommunication service
   testing.  For example, the IETF has many standards intended to assist
   with network operation, administration and maintenance (OAM), and
   ITU-T Study Group 12 has a recommendation on service activation test
   methodology.

   The world will not spin off axis while waiting for appropriate and
   standardized methods to emerge from the consensus process.

7.  Security Considerations

   This Applicability Statement is also intended to help preserve the
   security of the Internet by clarifying that the scope of [RFC2544]
   and other BMWG memos is limited to testing in a laboratory ITE, thus
   avoiding accidental Denial of Service attacks or congestion due to
   high traffic volume test streams.

   All benchmarking activities are limited to technology
   characterization using controlled stimuli in a laboratory
   environment, with dedicated address space and the other constraints
   of [RFC2544].

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network, or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the device under
   test/system under test (DUT/SUT).

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
   benchmarking purposes.
   Any implications for network security arising from the DUT/SUT SHOULD
   be identical in the lab and in production networks.

8.  IANA Considerations

   This memo makes no requests of IANA, and hopes that IANA will leave
   it alone as well.

9.  Acknowledgements

   Thanks to Matt Zekauskas, Bill Cerveny, Barry Constantine, Curtis
   Villamizar, and David Newman for reading and suggesting improvements
   to this memo.

10.  References

10.1.  Normative References

   [RFC1242]  Bradner, S., "Benchmarking terminology for network
              interconnection devices", RFC 1242, July 1991.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
              Network Interconnect Devices", RFC 2544, March 1999.

   [RFC5180]  Popoviciu, C., Hamza, A., Van de Velde, G., and D.
              Dugatkin, "IPv6 Benchmarking Methodology for Network
              Interconnect Devices", RFC 5180, May 2008.

10.2.  Informative References

Authors' Addresses

   Scott Bradner
   Harvard University
   29 Oxford St.
   Cambridge, MA  02138
   USA

   Phone: +1 617 495 3864
   Fax:
   Email: sob@harvard.edu
   URI:   http://www.sobco.com

   Kevin Dubray
   Juniper Networks

   Phone:
   Fax:
   Email: kdubray@juniper.net
   URI:

   Jim McQuaid
   Turnip Video
   6 Cobbleridge Court
   Durham, North Carolina  27713
   USA

   Phone: +1 919-619-3220
   Fax:
   Email: jim@turnipvideo.com
   URI:   www.turnipvideo.com

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ  07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/