draft-ietf-bmwg-protection-term-09

 Network Working Group                         S. Poretsky
 Internet Draft                                Allot Communications
 Expires: Jan 2011                             Rajiv Papneja
 Intended Status: Informational                Isocore
                                               J. Karthik
                                               S. Vapiwala
                                               Cisco Systems
                                               July 2010

                    Benchmarking Terminology
                   for Protection Performance
             <draft-ietf-bmwg-protection-term-09.txt >

Status of this Memo
   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.
   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on 7 Jan, 2011.

Copyright Notice
   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described
   in Section 4.e of the Trust Legal Provisions and are provided
   without warranty as described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s)
   controlling the copyright in such materials, this document may not
   be modified outside the IETF Standards Process, and derivative
   works of it may not be created outside the IETF Standards Process,
   except to format it for publication as an RFC or to translate it
   into languages other than English.

Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 1]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

Abstract
  This document provides common terminology and metrics for benchmarking
  the performance of sub-IP layer protection mechanisms. The performance
  benchmarks are measured at the IP-Layer with protection may be
  provided at the Sub-IP layer. The benchmarks and terminology can be
  applied in methodology documents for different sub-IP layer protection
  mechanisms such as Automatic Protection Switching (APS), Virtual Router
  Redundancy  Protocol (VRRP), Stateful High Availability (HA), and
  Multi-Protocol Label Switching Fast Reroute (MPLS-FRR).


Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 2]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance
 Table of Contents
        1. Introduction..............................................3
        2. Existing definitions......................................6
        3. Test Considerations.......................................7
           3.1. Paths................................................7
              3.1.1. Path............................................7
              3.1.2. Working Path....................................8
              3.1.3. Primary Path....................................8
              3.1.4. Protected Primary Path..........................8
              3.1.5. Backup Path.....................................9
              3.1.6. Standby Backup Path.............................10
              3.1.7. Dynamic Backup Path.............................10
              3.1.8. Disjoint Paths..................................10
              3.1.9. Point of Local repair (PLR).....................11
              3.1.10. Shared Risk Link Group (SRLG)..................11
           3.2. Protection Mechanisms................................12
              3.2.1. Link Protection.................................12
              3.2.2. Node Protection.................................12
              3.2.3. Path Protection.................................12
              3.2.4. Backup Span.....................................13
              3.2.5. Local Link Protection...........................13
              3.2.6. Redundant Node Protection.......................14
              3.2.7  State Control Interface.........................14
              3.2.8. Protected Interface.............................15
           3.3. Protection Switching.................................15
              3.3.1. Protection Switching System.....................15
              3.3.2. Failover Event..................................15
              3.3.3. Failure Detection...............................16
              3.3.4. Failover........................................17
              3.3.5. Restoration.....................................17
              3.3.6. Reversion.......................................18
           3.4. Nodes................................................18
              3.4.1. Protection-Switching Node.......................18
              3.4.2. Non-Protection Switching Node...................19
              3.4.3. Headend Node....................................19
              3.4.4. Backup Node.....................................19
              3.4.5. Merge Node......................................20
              3.4.6. Primary Node....................................20
              3.4.7. Standby Node....................................21
           3.5. Benchmarks...........................................21
              3.5.1. Failover Packet Loss............................21
              3.5.2. Reversion Packet Loss...........................22
              3.5.3. Failover Time...................................22
              3.5.4. Reversion Time..................................23
              3.5.5. Additive Backup Delay...........................23
           3.6 Failover Time Calculation Methods.....................24
              3.6.1 Time-Based Loss Method...........................24
              3.6.2 Packet-Loss Based Method.........................25
              3.6.3 Timestamp-Based Method...........................25
        4. Acknowledgments...........................................26
        5. IANA Considerations.......................................26
        6. Security Considerations...................................26
        7. References................................................26
        8. Authors' Addresses........................................27

Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 3]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

1. Introduction

   The IP network layer provides route convergence to protect data
   traffic against planned and unplanned failures in the internet. Fast
   convergence times are critical to maintain reliable network
   connectivity and performance.  Convergence Events [6] are recognized
   at the IP Layer so that Route Convergence [6] occurs.  Technologies
   that function at sub-IP layers can be enabled to provide further
   protection of IP traffic by providing the failure recovery at the
   sub-IP layers so that the outage is not observed at the IP-layer.
   Such sub-IP protection technologies include, but are not limited to,
   High Availability (HA) stateful failover, Virtual Router Redundancy
   Protocol (VRRP) [8], Automatic Link Protection (APS) for SONET/SDH,
   Resilient Packet Ring (RPR) for Ethernet, and Fast Reroute for
   Multi-Protocol Label Switching (MPLS-FRR) [9].

   1.1 Scope
   Benchmarking terminology was defined for IP-layer convergence in
   [6].  Different terminology and methodologies specific to
   benchmarking sub-IP layer protection mechanisms are required.  The
   metrics for benchmarking the performance of sub-IP protection
   mechanisms are measured at the IP layer, so that the results are
   always measured in reference to IP and independent of the specific
   protection mechanism being used. The purpose of this document is
   to provide a single terminology for benchmarking sub-IP protection
   mechanisms.

   A common terminology for Sub-IP layer protection mechanism
   benchmarking enables different implementations of a protection
   mechanism to be benchmarked and evaluated.  In addition,
   implementations of different protection mechanisms can be
   benchmarked and evaluated.  It is intended that there can exist
   unique methodology documents for each sub-IP protection mechanism
   based upon this common terminology document.  The terminology
   can be applied to methodologies that benchmark sub-IP protection
   mechanism performance with a single stream of traffic or
   multiple streams of traffic.  The traffic flow may be
   uni-directional or bi-directional as to be indicated in the
   methodology.

   1.2 General Model
   The sequence of events to benchmark the performance of Sub-IP
   Protection Mechanisms is as follows:

   1. Failover Event - Primary Path fails
   2. Failure Detection-  Failover Event is detected
   3. Failover - Backup Path becomes the Working Path due to Failover
                 Event
   4. Restoration - Primary Path recovers from a Failover Event
   5. Reversion (optional) - Primary Path becomes the Working Path

   These terms are further defined in this document.

Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 4]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

   Figures 1 through 5 show models that MAY be used when benchmarking
   Sub-IP Protection mechanisms, which MUST use a Protection Switching
   System that consists of a minimum of two Protection-Switching Nodes,
   an Ingress Node known as the Headend Node and an Egress Node known
   as the Merge Node.  The Protection Switching System MUST include
   either a Primary Path and Backup Path, as shown in Figures 1 through
   4, or a Primary Node and Standby Node, as shown in Figure 5.  A
   Protection Switching System may provide link protection, node
   protection, path protection, local link protection, and high
   availability, as shown in Figures 1 through 5 respectively.  A
   Failover Event occurs along the Primary Path or at the Primary Node.
   The Working Path is the Primary Path prior to the Failover Event and
   the Backup Path after the Failover Event.  A Tester is set outside
   the two paths or nodes as it sends and receives IP traffic along the
   Working Path.  The tester MUST record the IP packet sequence numbers,
   departure time, and arrival time so that the metrics of Failover
   Time, Additive Latency, Packet Reordering, Duplicate Packets, and
   Reversion Time can be measured.  The Tester may be a single device
   or a test system.  If Reversion is supported then the Working Path is
   the Primary Path after Restoration (Failure Recovery) of the Primary
   Path.

   Link Protection, as shown in Figure 1, provides protection when a
   Failover Event occurs on the link between two nodes along the Primary
   Path.  Node Protection, as shown in Figure 2, provides protection
   when a Failover Event occurs at a Node along the Primary Path.
   Path Protection, as shown in Figure 3, provides protection for link
   or node failures for multiple hops along the Primary Path.  Local
   Link Protection, as shown in Figure 4, provides Sub-IP Protection of
   a link between two nodes, without a Backup Node.  An example of such
   a Sub-IP Protection mechanism is SONET APS.  High Availability
   Protection, as shown in Figure 5, provides protection of a Primary
   Node with a redundant Standby Node.  State Control is provided
   between the Primary and Standby Nodes.  Failure of the Primary Node
   is detected at the Sub-IP layer to force traffic to switch to the
   Standby Node, which has state maintained for zero or minimal packet
   loss.

                      +-----------+
       +--------------|  Tester   |<-----------------------+
       |              +-----------+                        |
       | IP Traffic        | Failover           IP Traffic |
       |                   |  Event                        |
       |     ------------  |                 ----------    |
       +--->|  Ingress/  | V                | Egress/  |---+
            |Headend Node|------------------|Merge Node|  Primary
             ------------                    ----------    Path
                |                                ^
                |         ---------              |  Backup
                +--------| Backup  |-------------+   Path
                         |  Node   |
                          ---------
   Figure 1. System Under Test (SUT) for Sub-IP Link Protection

Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 5]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

                            +-----------+
       +--------------------|  Tester   |<-----------------+
       |                    +-----------+                  |
       | IP Traffic               | Failover    IP Traffic |
       |                          | Event                  |
       |                          V                        |
       |     ------------      --------      ----------    |
       +--->|  Ingress/  |    |MidPoint|    | Egress/  |---+
            |Headend Node|----|  Node  |----|Merge Node|  Primary
             ------------      --------      ----------    Path
                |                                ^
                |         ---------              |  Backup
                +--------| Backup  |-------------+   Path
                         |  Node   |
                          ---------

   Figure 2. System Under Test (SUT) for Sub-IP Node Protection

                               +-----------+
    +---------------------------|  Tester   |<----------------------+
    |                           +-----------+                       |
    | IP Traffic                      | Failover         IP Traffic |
    |                                 | Event                       |
    |                Primary Path     |                             |
    |     ------------      --------  |  --------     ----------    |
    +--->|  Ingress/  |    |MidPoint| V |Midpoint|   | Egress/  |---+
         |Headend Node|----|  Node  |---|  Node  |---|Merge Node|
          ------------      --------     --------     ----------
                |                                         ^
                |         ---------      --------         | Backup
                +--------| Backup  |----| Backup |--------+  Path
                         |  Node   |    |  Node  |
                          ---------      --------

   Figure 3. System Under Test (SUT) for Sub-IP Path Protection

                                  +-----------+
             +--------------------|  Tester   |<-------------------+
             |                    +-----------+                    |
             | IP Traffic               | Failover      IP Traffic |
             |                          | Event                    |
             |              Primary     |                          |
             |    +--------+  Path      v            +--------+    |
             |    |        |------------------------>|        |    |
             +--->| Ingress|                         | Egress |----+
                  |  Node  |- - - - - - - - - - - - >|  Node  |
                  +--------+      Backup Path        +--------+
                  |                                           |
                  |            IP-Layer Forwarding            |
                  +<----------------------------------------->+

   Figure 4. System Under Test (SUT) for Sub-IP Local Link Protection

Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 6]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance


                         +-----------+
       +-----------------|  Tester   |<--------------------+
       |                 +-----------+                     |
       | IP Traffic            | Failover       IP Traffic |
       |                       | Event                     |
       |                       V                           |
       |     ---------      --------      ----------       |
       +--->| Ingress |    |Primary |    | Egress/  |------+
            |   Node  |----|  Node  |----|Merge Node|  Primary
             ---------      --------      ----------    Path
                |        State |Control       ^
                |    Interface |(Optional)    |
                |          ---------          |
                +---------| Standby |---------+
                          |  Node   |
                           ---------

Figure 5. System Under Test (SUT) for Sub-IP Redundant Node Protection

   Some protection switching technologies may use a series of
   steps that differ from the general model. The specific differences
   SHOULD be highlighted in each technology-specific methodology.
   Note that some protection switching technologies are endowed
   with the ability to re-optimize the working path after a
   node or link failure.

2. Existing definitions
   This document uses existing terminology defined in other BMWG
   work.  Examples include, but are not limited to:

          Latency                   [Ref.[2], section 3.8]
          Frame Loss Rate           [Ref.[2], section 3.6]
          Throughput                [Ref.[2], section 3.17]
          Device Under Test (DUT)   [Ref.[3], section 3.1.1]
          System Under Test (SUT)   [Ref.[3], section 3.1.2]
          Offered Load              [Ref.[3], section 3.5.2]
          Out-of-order Packet       [Ref.[4], section 3.3.2]
          Duplicate Packet          [Ref.[4], section 3.3.3]
          Forwarding Delay          [Ref.[4], section 3.2.4]
          Jitter                    [Ref.[4], section 3.2.5]
          Packet Loss               [Ref.[6], Section 3.5]
          Packet Reordering         [Ref.[7], section 3.3]

   This document has the following frequently used acronyms:
      DUT  Device Under Test
      SUT  System Under Test

   This document adopts the definition format in Section 2 of RFC 1242
   [2].  Terms defined in this document are capitalized when used
   within this document.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119 [5].


Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 7]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

   RFC 2119 defines the use of these key words to help make the
   intent of standards track documents as clear as possible.  While this
   document uses these keywords, this document is not a standards track
   document.

3. Test Considerations

  3.1. Paths

    3.1.1 Path

    Definition:
       A unidirectional sequence of nodes, <R1, ..., Rn>, and links
      <L12,... L(n-1)n> with the following properties:

       a. R1 is the ingress node and forwards IP packets, which input
       into DUT/SUT, to R2 as sub-IP frames over link L12.

       b. Ri is a node which forwards data frames to R(i+1) over Link
       Li(i+1) for all i, 1<i<n-1, based on information in the sub-IP
       layer.

       c. Rn is the egress node and it outputs sub-IP frames from
       DUT/SUT as IP packets. L(n-1)n is the link between the R(n-1)
       and Rn.

    Discussion:
       The path is defined in the sub-IP layer in this document, unlike
       an IP path in RFC 2026 [1].   One path may be regarded as being
       equivalent to one IP link between two IP nodes, i.e., R1 and Rn.
       The two IP nodes may have multiple paths for protection.  A
       packet will travel on only one path between the nodes.  Packets
       belonging to a microflow [10] will traverse one or more paths.
       The path is unidirectional.  For example, the link between R1
       and R2 in the direction from R1 to R2 is L12.  For traffic
       flowing in the reverse direction from R2 to R1, the link is L21.
       Example paths are the SONET/SDH path and the label switched path
       for MPLS.

    Measurement units:
       n/a

    Issues:
       "A bidirectional path", which transmits traffic in both
       directions along the same nodes, consists of two unidirectional
       paths.  Therefore, the two unidirectional paths belonging to
       "one bidirectional path" will be treated independently when
       benchmarking for "a bidirectional path".

    See Also:
       Working Path
       Primary Path
       Backup Path


Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 8]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

    3.1.2. Working Path

    Definition:
       The path that the DUT/SUT is currently using to forward
       packets.

    Discussion:
       A Primary Path is the Working Path before occurrence of a
       Failover Event.  A Backup Path shall become the Working Path
       after a Failover Event.

    Measurement units:
       n/a

    Issues:

    See Also:
       Path
       Primary Path
       Backup Path

   3.1.3.  Primary Path

       Definition:
       The preferred point to point path for forwarding traffic
       between two or more nodes.

       Discussion:
       The Primary Path is the Path that traffic traverses
       prior to a Failover Event.

       Measurement units:
          n/a

        Issues:
          None

        See Also:
          Path
          Failover Event

    3.1.4.  Protected Primary Path

       Definition:
       A Primary Path that is protected with a Backup Path.

       Discussion:
       A Protected Primary Path must include at least one Protection
       Switching Node.


Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 9]


Internet-Draft         Benchmarking Terminology for    July 2010
                          Protection Performance

       Measurement units:
          n/a

       Issues: None

       See Also:
          Path
          Primary Path


    3.1.5.  Backup Path

       Definition:
       A path that exists to carry data traffic only if a Failover
       Event occurs on a Primary Path.

       Discussion:
       The Backup Path shall become the Working Path upon a Failover
       Event.  A Path may have one or more Backup Paths.  A Backup
       Path may protect one or more Primary Paths.  There are various
       types of Backup Paths:

          a. dedicated recovery Backup Path (1+1) or (1:1), which has
          100% redundancy for a specific ordinary path,

          b. shared Backup Path (1:N), which is dedicated to the
          protection for more than one specific Primary Path

          c. associated shared Backup Path (M:N) for which a specific
          set of Backup Paths protects a specific set of more than one
          Primary Path.

       A Backup Path may be signaled or unsignaled.  The Backup Path
       must be created prior to the Failover Event.  The backup path
       generally originates at the point of local repair (PLR), and
       terminates at a node along a primary path.

       Measurement units:
          n/a

       Issues:

       See Also:
          Path
          Working Path
          Primary Path





Poretsky, Papneja, Karthik, Vapiwala    Expires Jan 2011   [Page 10]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

     3.1.6.  Standby Backup Path

        Definition:
        A Backup Path that is established prior to a Failover Event
        to protect a Primary Path.

        Discussion:
        The Standby Backup Path and Dynamic Backup Path provide
        protection, but are established at different times.

        Measurement units: n/a

        Issues: None

        See Also:
           Backup Path
           Primary Path
           Failover Event

     3.1.7. Dynamic Backup Path

        Definition:
        A Backup Path that is established upon occurrence of a
        Failover Event.

        Discussion:
        The Standby Backup Path and Dynamic Backup Path provide
        protection, but are established at different times.

        Measurement units: n/a

        Issues: None

        See Also:
            Backup Path
            Standby Backup Path
            Failover Event

     3.1.8. Disjoint Paths

        Definition:
        A pair of paths that do not share a common link or nodes.

        Discussion:
        Two paths are disjoint if they do not share a common node or link
        other than the ingress and egress.



Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 11]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

        Measurement units: n/a

        Issues: None

        See Also:
           Path
           Primary Path
           SRLG

     3.1.9. Point of Local Repair (PLR)
         Definition:
         A node capable of Failover along the Primary Path that is
         also the ingress node for the Backup Path to protect another
         node or link.

         Discussion:
         Any node along the Primary Path from the ingress node to
         the penultimate node may be a PLR.  The PLR may use
         a single Backup Path for protecting one or more Primary
         Paths.  There can be multiple PLRs along a Primary Path.
         The PLR must be an ingress to a Backup Path.  The PLR can
         be any node along the Primary Path except the egress node
         of the Primary Path.  The PLR may simultaneously be a
         Headend Node when it is serving the role as ingress to
         the Primary Path and the Backup Path.  If the PLR is
         also the Headend Node, then the Backup Path is a Disjoint
         Path from the ingress to the Merge Node.

         Measurement units: n/a

         Issues: None

         See Also:
             Primary Path
             Backup Path
             Failover

     3.1.10. Shared Risk Link Group (SRLG)
         Definition:
         SRLG is a set of links which share the same risk (physical
         or logical) within a network.

         Discussion:
         SRLG is considered the set of links to be avoided when
         the primary and secondary paths are considered disjoint.
         The SRLG will fail as a group if the shared resource
         (physical or anything abstract such as software version)
         fails.

         Measurement units: n/a

         Issues: None

         See Also:
             Path
             Primary Path

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 12]


Internet-Draft         Benchmarking Terminology for    July 2010
                          Protection Performance

   3.2. Protection
     3.2.1. Link Protection
         Definition:
         A Backup Path that is signaled to at least one Backup Node
         to protect for failure of interfaces and links along a
         Primary Path.

         Discussion:
         Link Protection may or may not protect the entire Primary
         Path.  Link protection is shown in Figure 1.

         Measurement units: n/a

         Issues: None

         See Also:
             Primary Path
             Backup Path

     3.2.2. Node Protection
         Definition:
         A Backup Path that is signaled to at least one Backup Node
         to protect for failure of interfaces, links, and nodes
         along a Primary Path.

         Discussion:
         Node Protection may or may not protect the entire Primary
         Path.  Node Protection also provides Link Protection.
         Node Protection is shown in Figure 2.

         Measurement units: n/a

         Issues: None

         See Also:
             Link Protection

     3.2.3. Path Protection
        Definition:
        A Backup Path that is signaled to at least one Backup Node
        to provide protection along the entire Primary Path.

        Discussion:
        Path Protection provides Node Protection and Link Protection
        for every node and link along the Primary Path.  A Backup
        Path providing Path Protection may have the same ingress
        node as the Primary Path.  Path Protection is shown in
        Figure 3.

        Measurement units: n/a

        Issues: None
Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 13]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

        See Also:
             Primary Path
             Backup Path
             Node Protection
             Link protection

     3.2.4. Backup Span

        Definition:
        The number of hops used by a Backup Path.

        Discussion:
        The Backup Span is an integer obtained by counting the
        number of nodes along the Backup Path.

        Measurement units:
             number of nodes

        Issues:
             None

        See Also:
             Primary Path
             Backup Path

     3.2.5. Local Link Protection

        Definition:
        A Backup Path that is a redundant path between two nodes
        which does not use a Backup Node.

        Discussion:
        Local Link Protection must be provided as a Backup Path
        between two nodes along the Primary Path without the use
        of a Backup Node.  Local Link Protection is provided by
        Protection Switching Systems such as SONET APS.  Local
        Link Protection is shown in Figure 4.

        Measurement units: None

        Issues: None

        See Also:
        Backup Path
        Backup Node



Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 14]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

     3.2.6. Redundant Node Protection

        Definition:
        A Protection Switching System with a Primary Node
        protected by a Standby Node along the Primary Path.

        Discussion:
        Redundant Node Protection is provided by Protection
        Switching Systems such as VRRP and HA.  The protection
        mechanisms occur at Sub-IP layers to switch traffic from
        a Primary Node to Backup Node upon a Failover Event at
        the Primary Node.  Traffic continues to traverse the
        Primary Path through the Standby Node.  The failover may
        be stateful, in which the state information may be
        exchanged in-band or over an out-of-band state control
        interface.  The Standby Node may be active or passive.
        Redundant Node Protection is shown in Figure 5.

        Measurement units: None

        Issues: None

        See Also:
        Primary Path
        Primary Node
        Standby Node

     3.2.7. State Control Interface

        Definition:
        An out-of-band control interface used to exchange state
        information between the Primary Node and Standby Node.

        Discussion:
        The State Control Interface may be used for Redundant Node
        Protection. The State Control Interface should be out-of-band.
        It is possible to have Redundant Node Protection in which
        there is no state control or state control is provided
        in-band.  The State Control Interface between the Primary
        and Standby Node may be one or more hops.

        Measurement units: None

        Issues: None

        See Also:
        Primary Node
        Standby Node

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 15]


Internet-Draft         Benchmarking Terminology for    July 2010
                          Protection Performance

     3.2.8. Protected Interface

        Definition:
        An interface along the Primary Path that is protected by
        a Backup Path.

        Discussion:
        A Protected Interface is an interface protected by a
        Protection Switching System that provides Link
        Protection, Node Protection, Path Protection, Local
        Link Protection, and Redundant Node Protection.

        Measurement units: None

        Issues: None

        See Also:
           Primary Path
           Backup Path

   3.3. Protection Switching

     3.3.1.  Protection Switching System

        Definition:
          A DUT/SUT that is capable of Failure Detection and Failover
          from a Primary Path to a Backup Path or Standby Node when a
          Failover Event occurs.

        Discussion:
          The Protection Switching System must include either a
          Primary Path and Backup Path, as shown in Figures 1 through
          4, or a Primary Node and Standby Node, as shown in Figure
          5.  The Backup Path may be a Standby Backup Path or a
          dynamic Backup Path.  The Protection Switching System
          includes the mechanisms for both Failure Detection and
          Failover.

        Measurement units: n/a

        Issues: None

        See Also:
             Primary Path
             Backup Path
             Failover

     3.3.2.  Failover Event

       Definition:
       The occurrence of a planned or unplanned action in the network
       that results in a change in the Path that data traffic traverses.

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 16]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

       Discussion:
       Failover Events include, but are not limited to, link failure
       and router failure.  Routing changes are considered Convergence
       Events [6] and are not Failover Events.  This restricts
       Failover Events to sub-IP layers. Failover may be at the PLR or
       at the ingress. If the failover is at the ingress it is
       generally on a disjoint path from the ingress to egress.

       Failover Events may results from failures such as link failure
       or router failure.  The change in path after Failover may have
       a Backup Span of one or more nodes.  Failover Events are
       distinguished from routing changes and Convergence Events [6]
       by the detection of the failure and subsequent protection
       switching at a sub-IP layer.  Failover occurs at a Point of
       Local Repair (PLR) or Primary Node.

       Measurement units:
          n/a

       Issues: None

       See Also:
          Path
          Failure Detection
          Disjoint Path

     3.3.3.  Failure Detection

       Definition:
       The process to identify at a sub-IP layer a Failover Event
       at a Primary Node or along the Primary Path.

       Discussion:
       Failure Detection occurs at the Primary Node or ingress node
       of the Primary  Path.  Failure Detection occurs via a sub-IP
       mechanism such as detection of a link down event or timeout for
       receipt of a control packet. A failure may be completely
       isolated. A failure may affect a set of links which share a
       single SRLG (e.g. port with many sub-interfaces). A failure may
       affect multiple links that are not part of SRLG.

       Measurement units: n/a

       Issues:

       See Also:
         Primary Path


Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 17]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

     3.3.4.  Failover

        Definition:
        The process to switch data traffic from the protected Primary
        Path to the Backup Path upon Failure Detection of a Failover
        Event.

        Discussion:
        Failover to a Backup Path provides Link Protection, Node
        Protection, or Path Protection.  Failover is complete when
        Packet Loss [6], Out-of-order Packets [4], and Duplicate
        Packets [4] are no longer observed.  Forwarding Delay [4]
        may continue to be observed.

        Measurement units:
            n/a

        Issues:

        See Also:
             Primary Path
             Backup Path
             Failover Event

     3.3.5.  Restoration

        Definition:
        The state of failover recovery in which the Primary Path
        has recovered from a Failover Event, but is not yet
        forwarding packets because the Backup Path remains the
        Working Path.

        Discussion:
        Restoration must occur while the Backup Path is the
        Working Path.  The Backup Path is maintained as the
        Working Path during Restoration.  Restoration produces
        a Primary Path that is recovered from failure, but is
        not yet forwarding traffic.  Traffic is still being
        forwarded by the Backup Path functioning as the Working
        Path.

        Measurement units:
            n/a

        Issues:

        See Also:
            Primary Path
            Failover Event
            Failure Recovery
            Working Path
            Backup Path

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 18]


Internet-Draft         Benchmarking Terminology for    July 2010
                          Protection Performance

     3.3.6.  Reversion

        Definition:
        The state of failover recovery in which the Primary Path has
        become the Working Path so that it is forwarding packets.

        Discussion:
        Protection Switching Systems may or may not support Reversion.
        Reversion, if supported, must occur after Restoration.
        Packet forwarding on the Primary Path resulting from Reversion
        may occur either fully or partially over the Primary Path.  A
        potential problem with Reversion is the discontinuity in end to
        end delay when the Forwarding Delays [4] along the Primary Path
        and Backup Path are different, possibly causing Out of Order
        Packets [4], Duplicate Packets [4], and increased Jitter [4].

        Measurement units: n/a

        Issues: None

        See Also:
            Protection Switching System
            Working Path
            Primary Path

   3.4. Nodes

     3.4.1.  Protection-Switching Node

         Definition:
         A node that is capable of participating in a Protection
         Switching System.

         Discussion:
         The Protection Switching Node may be an ingress or egress for
         a Primary Path or Backup Path, such as used for MPLS Fast
         Reroute configurations.  The Protection Switching Node may
         provide Redundant Node Protection as a Primary Node in a
         Redundant chassis configuration with a Standby Node, such as
         used for VRRP and HA configurations.

         Measurement units:
             n/a

         Issues:

         See Also:
             Protection Switching System


Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 19]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

     3.4.2.  Non-Protection Switching Node

         Definition:
         A node that is not capable of participating in a Protection
         Switching System, but may exist along the Primary Path or
         Backup Path.

         Discussion:

         Measurement units:
             n/a

         Issues:

         See Also:
             Protection Switching System
             Primary Path
             Backup Path

     3.4.3.  Headend Node
        Definition:
        The ingress node of the Primary Path.

        Discussion:
        The Headend Node may also be a PLR when it is serving in
        the dual role as the ingress to the Backup Path.

        Measurement units: n/a

        Issues:

        See Also:
             Primary Path
             Point of Local Repair (PLR)
             Failover

     3.4.4.  Backup Node
        Definition:
        A node along the Backup Path.

        Discussion:
        The Backup Node can be any node along the Backup Path.
        There may be one or more Backup Nodes along the Backup Path.
        A Backup Node may be the ingress, mid-point, or egress of
        the Backup Path.  If the Backup Path has only one Backup
        Node, then that Backup Node is the ingress and egress of the
        Backup Path.

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 20]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

        Measurement units: n/a

        Issues:

        See Also:
             Backup Path

       3.4.5.  Merge Node
         Definition:
             A node along the Primary Path where Backup Path terminates.

         Discussion:
             The Merge Node can be any node along the Primary Path
             except the ingress node of the Primary Path.  There can be
             multiple Merge Nodes along a Primary Path.  A Merge Node
             can be the egress node for a single or multiple Backup
             Paths.  The Merge Node must be the egress to the Backup
             Path.  The Merge Node may also be the egress of the
             Primary Path or Point of Local Repair (PLR).

         Measurement units:
             n/a

         Issues:

         See Also:
             Primary Path
             Backup Path
             PLR
             Failover

     3.4.6. Primary Node

        Definition:
        A node along the Primary Path that is capable of Failover to a
        redundant Standby Node.

        Discussion:
        The Primary Node may be used for Protection Switching Systems
        that provide Redundant Node Protection, such as VRRP and HA

        Measurement units: n/a

        Issues:

        See Also:
             Protection Switching System
             Redundant Node Protection
             Standby Node

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 21]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

     3.4.7.  Standby Node

        Definition:
        A redundant node to a Primary Node that forwards traffic along
        the Primary Path upon Failure Detection of the Primary Node.

        Discussion:
        The Standby Node must be used for Protection Switching
        Systems that provide Redundant Node Protection, such as VRRP
        and HA.  The Standby Node must provide protection along the
        same Primary Path.  If the failover is to a Disjoint Path then
        it is a Backup Node.  The Standby Node may be configured
        for 1:1 or N:1 protection.

        The communication between the Primary Node and Standby Node
        may be in-band or across an out-of-band State Control
        interface.  The Standby Node may be geographically dispersed
        from the Primary Node.  When geographically dispersed, the
        number of hops of separation may increase failover time.

        The Standby Node may be passive or active.  The Passive Standby
        Node is not offered traffic and does not forward traffic until
        Failure Detection of the Primary Node.  Upon Failure Detection
        of the Primary Node, traffic offered to the Primary Node is
        instead offered to the Passive Standby Node.  The Active
        Standby Node is offered traffic and forwards traffic along the
        Primary Path while the Primary Node is also active.  Upon
        Failure Detection of the Primary Node, traffic offered to the
        Primary Node is switched to the Active Standby Node.

        Measurement units: n/a

        Issues:

        See Also:
             Primary Node
             State Control Interface

     3.5.  Benchmarks

      3.5.1.  Failover Packet Loss
        Definition:
        The amount of packet loss produced by a Failover Event until
        Failover completes, where the measurement begins when the last
        unimpaired packet is received by the Tester on the Protected
        Primary Path and ends when the first unimpaired packet is
        received by the Tester on the Backup Path.

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 22]


Internet-Draft         Benchmarking Terminology for    July 2010
                          Protection Performance

        Discussion:
        Packet loss can be observed as a reduction of forwarded
        traffic from the maximum forwarding rate.  Failover Packet
        Loss includes packets that were lost, reordered, or delayed.
        Failover Packet Loss may reach 100% of the offered load.

        Measurement units:
          Number of Packets

        Issues:  None

        See Also:
           Failover Event
           Failover

      3.5.2.   Reversion Packet Loss

        Definition:
        The amount of packet loss produced by Reversion, where the
        measurement begins when the last unimpaired packet is received
        by the Tester on the Backup Path and ends when the first
        unimpaired packet is received by the Tester on the Protected
        Primary Path .

        Discussion:
        Packet loss can be observed as a reduction of forwarded
        traffic from the maximum forwarding rate.  Reversion Packet
        Loss includes packets that were lost, reordered, or delayed.
        Reversion Packet Loss may reach 100% of the offered load.

         Measurement units: Number of Packets

         Issues:  None

         See Also:
           Reversion

      3.5.3. Failover Time

        Definition:
         The amount of time it takes for Failover to successfully
         complete.

        Discussion:
        Failover Time can be calculated using the Time-Based Loss
        Method (TBLM), Packet-Loss Based Method (PLBM), or
        Timestamp-Based Method (TBM).  It is RECOMMENDED that the
        TBM is used.

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 23]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

        Measurement units:
           milliseconds

        Issues: None

        See Also:
           Failover
           Failover Time
           Time-Based Loss Method (TBLM)
           Packet-Loss Based Method (PLBM)
           Timestamp-Based Method (TBM)

      3.5.4.  Reversion Time

        Definition:
        The amount of time it takes for Reversion to complete so
        that the Primary Path is restored as the Working Path.

        Discussion:
        Reversion Time can be calculated using the Time-Based Loss
        Method (TBLM), Packet-Loss Based Method (PLBM), or
        Timestamp-Based Method (TBM).  It is RECOMMENDED that the
        TBM is used.

        Measurement units:
           milliseconds

        Issues: None

        See Also:
           Reversion
           Primary Path
           Working Path
           Reversion Packet Loss
           Time-Based Loss Method (TBLM)
           Packet-Loss Based Method (PLBM)
           Timestamp-Based Method (TBM)

      3.5.5.  Additive Backup Delay

        Definition:
        The amount of increased Forwarding Delay [4] resulting
        from data traffic traversing the Backup Path instead of
        the Primary Path.

        Discussion:
        Additive Backup Delay is calculated using Equation 1 as
        shown below:

        (Equation 1)
        Additive Backup Delay =
                  Forwarding Delay(Backup Path) -
                  Forwarding Delay(Primary Path).

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 24]


Internet-Draft         Benchmarking Terminology for    July 2010
                          Protection Performance

        Measurement units:
           milliseconds

        Issues:
        Additive Backup Latency may be a negative result.
        This is theoretically possible, but could be indicative
        of a sub-optimum network configuration .

        See Also:
           Primary Path
           Backup Path
           Primary Path Latency
           Backup Path Latency

     3.6 Failover Time Calculation Methods
        The following Methods may be assessed on a per-flow basis using
        at least 16 flows spread over the routing table (more flows is
        better). Otherwise, the impact of a prefix-dependency in the
        implementation of a particular protection technology could be
        missed. However, the test designer must be aware of the number
        of packets per second sent to each prefix, as this establishes
        sampling of the path and the time resolution for measurement
        of Failover time on a per-flow basis.

     3.6.1 Time-Based Loss Method (TBLM)

      Definition:
      The method to calculate Failover Time (or Reversion Time) using a
      time scale on the Tester to measure the interval of Failover
      Packet Loss.

      Discussion:
      The Tester must provide statistics which show the duration of
      failure on a time scale based on occurrence of packet loss on
      a time scale.  This is indicated by the duration of non-zero
      packet loss.  The TBLM includes failure detection time and
      time for data traffic to begin traversing the Backup Path.
      Failover Time and Reversion Time are calculated using the
      TBLM as shown in Equation 2:

      (Equation 2)
          (Equation 2a)
          TBLM Failover Time = Time(Failover) - Time(Failover Event)

          (Equation 2b)
          TBLM Reversion Time = Time(Reversion) - Time(Restoration)

      Where as Time(Failover)= Time on the tester at the receipt of the
      first unimpaired packet at egress node after the backup path
      became the working path

      Time(Failover Event)= Time on the tester at the receipt of the
      last unimpaired packet at egress node on the primary path
      before failure

      Measurement units:
         milliseconds

      Issues:
         None

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 25]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance
      See Also:
         Failover
         Packet-Loss Based Method

    3.6.2 Packet-Loss Based Method (PLBM)

      Definition:
      The method used to calculate Failover Time (or Reversion Time)
      from the amount of Failover Packet Loss.

      Discussion:
      PLBM includes failure detection time and time for data traffic to
      begin traversing the Backup Path.  Failover Time can be
      calculated using PLBM from the amount Failover Packet Loss as
      shown below in Equation 3. Note: If traffic is sent to more than 1
      destination, PLBM gives the average loss over the measured
      destinations

      (Equation 3)
           (Equation 3a)
           PLBM Failover Time =
              (Number of packets lost /
                       Offered Load rate) * 1000)

           (Equation 3b)
           PLBM Restoration Time =
              (Number of packets lost /
                       Offered Load rate) * 1000)

           Units are packets/(packets/second) = seconds

      Measurement units:
         milliseconds

      Issues:
         None

      See Also:
         Failover
         Time-Based Loss Method

     3.6.3 Timestamp-Based Method (TBM)

      Definition:
      The method to calculate Failover Time (or Reversion Time)
      using a time scale to quantify the interval between
      unimpaired packets arriving in the test stream.

      Discussion:
      The purpose of this method is to quantify the duration of
      failure or reversion on a time scale based on the
      observation of unimpaired packets.  The TBM is calculated
      from Equation 2 with the values obtained from the timestamp
      in the packet payload, rather than from the Tester clock as
      is used for the values when using the TBLM.

      Unimpaired packets are normal packets that are not lost,
      reordered, or duplicated.  A reordered packet is defined in

Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 26]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

      [10, section 3.3].  A duplicate packet is defined in
      [4, section 3.3.3].  A lost packet is defined in
      [7, Section 3.5].  Unimpaired packets may be detected by checking
      a sequence number in the payload, where the sequence number equals
      the next expected number for an unimpaired packet.  A sequence gap
      or sequence reversal indicates impaired packets.

      For calculating Failover Time, the TBM includes failure
      detection time and time for data traffic to begin traversing the
      Backup Path.  For calculating Reversion Time, the TBM includes
      Reversion Time and time for data traffic to begin traversing the
      Primary Path.

      Measurement units:
         milliseconds

      Issues: None

      See Also:
         Failover
         Failover Time
         Reversion
         Reversion Time

4. Acknowledgements
     We would like thank the BMWG and particularly Al Morton and Curtis
     Villamizar for their reviews, comments, and contributions to this
     work.

5. IANA Considerations
     This document requires no IANA considerations.

6. Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization using controlled stimuli in a laboratory
   environment, with dedicated address space and the constraints
   specified in the sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network, or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
   benchmarking purposes.  Any implications for network security arising
   from the DUT/SUT SHOULD be identical in the lab and in production
   networks.



Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 27]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

7. References
7.1. Normative References
     [1] Bradner, S., "The Internet Standards Process -- Revision 3",
         RFC 2026, October 1996.

     [2] Bradner, S., Editor, "Benchmarking Terminology for
         Network Interconnection Devices", RFC 1242, July 1991.

     [3] Mandeville, R., "Benchmarking Terminology for LAN
         Switching Devices", RFC 2285, February 1998.
     [4] Poretsky, S., et al., "Terminology for Benchmarking
         Network-layer Traffic Control Mechanisms", RFC 4689,
         November 2006.

     [5] Bradner, S., "Key words for use in RFCs to Indicate
         Requirement Levels", RFC 2119, July 1997.

     [6] Poretsky, S., Imhoff, B., "Benchmarking Terminology for IGP
         Convergence", draft-ietf-bmwg-igp-dataplane-conv-term-21,
         work in progress, May 2010.

     [7] Morton, A., et al, "Packet Reordering Metrics", RFC 4737,
          November 2006.

     [8] Hinden, R., "Virtual Router Redundancy Protocol", RFC 5798,
          March 2010.

7.2. Informative References

     [9] Pan., P. et al, "Fast Reroute Extensions to RSVP-TE for LSP
         Paths", RFC 4090, May 2005.

     [10] Nichols, K., et al, "Definition of the Differentiated
         Services Field (DS Field) in the IPv4 and IPv6 Headers",
         RFC 2474, December 1998.

8.  Authors' Addresses

   Scott Poretsky
   Allot Communications
   67 South Bedford Street, Suite 400
   Burlington, MA 01803
   USA
   Phone: + 1 508 309 2179
   Email: sporetsky@allot.com


Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011   [Page 28]


Internet-Draft         Benchmarking Terminology for     July 2010
                          Protection Performance

   Rajiv Papneja
   Isocore
   12359 Sunrise Valley Drive
   Reston, VA 22102
   USA
   Phone: +1 703 860 9273
   Email: rpapneja@isocore.com

   Jay Karthik
   Cisco Systems
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 0533
   Email: jkarthik@cisco.com

   Samir Vapiwala
   Cisco System
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 1484
   Email: svapiwal@cisco.com























Poretsky, Papneja, Karthik, Vapiwala   Expires Jan 2011  [Page 29]

Document	Document type	This is an older version of an Internet-Draft that was ultimately published as RFC 6414.
	Select version	00 01 02 03 04 05 06 07 08 09 RFC 6414
	Compare versions
	Authors	Jay Karthik , Samir Vapiwala , Rajiv Papneja , Scott Poretsky Email authors
	RFC stream
	Intended RFC status	Informational
	Other formats	txt pdf bibtex bibxml
	Additional resources	Mailing list discussion