Network Working Group                                         Yiqun. Cai
Internet-Draft                                                 Heidi. Ou
Intended status: Standards Track                           Alibaba Group
Expires: December 21, 2018                               Sri. Vallepalli
                                                       Mankamana. Mishra
                                                            Stig. Venaas
                                                           Cisco Systems
                                                             Andy. Green
                                                         British Telecom
                                                           June 19, 2018


                  PIM Designated Router Load Balancing
                         draft-ietf-pim-drlb-08

Abstract

   On a multi-access network, one of the PIM routers is elected as a
   Designated Router (DR).  On the last hop LAN, the PIM DR is
   responsible for tracking local multicast listeners and forwarding
   traffic to these listeners if the group is operating in PIM-SM.  In
   this document, we propose a modification to the PIM-SM protocol that
   allows more than one of these last hop routers to be selected so that
   the forwarding load can be distributed among these routers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 21, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.





Cai, et al.             Expires December 21, 2018               [Page 1]


Internet-Draft    PIM Designated Router Load Balancing         June 2018


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   5
   3.  Applicability . . . . . . . . . . . . . . . . . . . . . . . .   5
   4.  Functional Overview . . . . . . . . . . . . . . . . . . . . .   6
     4.1.  GDR Candidates  . . . . . . . . . . . . . . . . . . . . .   6
     4.2.  Hash Mask and Hash Algorithm  . . . . . . . . . . . . . .   7
     4.3.  Modulo Hash Algorithm . . . . . . . . . . . . . . . . . .   8
     4.4.  PIM Hello Options . . . . . . . . . . . . . . . . . . . .   9
   5.  Hello Option Formats  . . . . . . . . . . . . . . . . . . . .   9
     5.1.  PIM DR Load Balancing Capability (DRLBC) Hello Option . .   9
     5.2.  PIM DR Load Balancing GDR (DRLBGDR) Hello Option  . . . .  10
   6.  Protocol Specification  . . . . . . . . . . . . . . . . . . .  11
     6.1.  PIM DR Operation  . . . . . . . . . . . . . . . . . . . .  11
     6.2.  PIM GDR Candidate Operation . . . . . . . . . . . . . . .  12
       6.2.1.  Router Receives New DRLBGDR . . . . . . . . . . . . .  13
       6.2.2.  Router Receives Updated DRLBGDR . . . . . . . . . . .  13
     6.3.  PIM Assert Modification . . . . . . . . . . . . . . . . .  14
   7.  Compatibility . . . . . . . . . . . . . . . . . . . . . . . .  15
   8.  Manageability Considerations  . . . . . . . . . . . . . . . .  15
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  16
   10. Security Considerations . . . . . . . . . . . . . . . . . . .  16
   11. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . .  16
   12. References  . . . . . . . . . . . . . . . . . . . . . . . . .  16
     12.1.  Normative References . . . . . . . . . . . . . . . . . .  16
     12.2.  Informative References . . . . . . . . . . . . . . . . .  17
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  17

1.  Introduction

   On a multi-access LAN such as an Ethernet, one of the PIM routers is
   elected as a DR.  The PIM DR has two roles in the PIM-SM protocol.
   On the first hop network, the PIM DR is responsible for registering
   an active source with the Rendezvous Point (RP) if the group is
   operating in PIM-SM.  On the last hop LAN, the PIM DR is responsible
   for tracking local multicast listeners and forwarding to these
   listeners if the group is operating in PIM-SM.

   Consider the following last hop LAN in Figure 1:

                            ( core networks )
                              |     |     |
                              |     |     |
                             R1    R2     R3
                              |     |     |
                           --(last hop LAN)--
                                    |
                                    |
                            (many receivers)

                       Figure 1: Last Hop LAN

   Assume R1 is elected as the Designated Router.  According to
   [RFC4601], R1 will be responsible for forwarding traffic to that LAN
   on behalf of any local members.  In addition to keeping track of IGMP
   and MLD membership reports, R1 is also responsible for initiating the
   creation of source and/or shared trees towards the senders or the
   RPs.

   Forcing sole data plane forwarding responsibility on the PIM DR
   uncovers a limitation in the protocol.  In comparison, even though an
   OSPF DR or an IS-IS DIS handles additional duties while running the
   OSPF or IS-IS protocols, they are not required to be solely
   responsible for forwarding packets for the network.  On the other
   hand, on a last hop LAN, only the PIM DR is asked to forward packets
   while the other routers handle only control traffic (and perhaps drop
   packets due to RPF failures).  Hence the forwarding load of a last
   hop LAN is concentrated on a single router.

   This leads to several issues.  One of the issues is that the
   aggregated bandwidth will be limited to what R1 can handle towards
   this particular interface.  It is very common for the last hop LAN
   to consist of switches that run IGMP/MLD or PIM snooping.  This
   allows the forwarding of multicast packets to be restricted only to
   segments leading to receivers who have indicated their interest in
   multicast groups using either IGMP or MLD.  The emergence of
   switched Ethernet allows the aggregated bandwidth to exceed,
   sometimes by a large margin, that of a single link.  For example, let
   us modify Figure 1 and introduce an Ethernet switch in Figure 2.

                           ( core networks )
                             |     |     |
                             |     |     |
                            R1    R2     R3
                             |     |     |
                          +=gi0===gi1===gi2=+
                          +                 +
                          +      switch     +
                          +                 +
                          +=gi4===gi5===gi6=+
                             |     |     |
                            H1    H2     H3


               Figure 2: Last Hop Network with Ethernet Switch




   Let us assume that each individual link is a Gigabit Ethernet link.
   Each router, R1, R2 and R3, and the switch have enough forwarding
   capacity to handle hundreds of Gigabits of data.

   Let us further assume that each of the hosts requests 500 Mbps of
   unique multicast data.  This totals 1.5 Gbps of data, which is less
   than what the switch or the combined uplink bandwidth across the
   routers can handle, even under failure of a single router.

   On the other hand, the link between R1 and the switch, via port gi0,
   can only handle a throughput of 1 Gbps.  If R1 is the only DR (the
   PIM DR elected using the procedure defined by [RFC4601]), at least
   500 Mbps worth of data will be lost because the only link that can be
   used to draw the traffic from the routers to the switch is via gi0.
   In other words, the entire network's throughput is limited by the
   single connection between the PIM DR and the switch (or the last hop
   LAN as in Figure 1).

   The problem may also manifest itself in a different way.  For
   example, suppose R1 happens to forward 500 Mbps worth of unicast data
   to H1 and, at the same time, H2 and H3 each request 300 Mbps of
   different multicast data.  R1 experiences packet drops once again
   while, in the meantime, there is sufficient forwarding capacity left
   on R2 and R3 and unused link capacity between the switch and R2/R3.

   Another important issue is related to failover.  If R1 is the only
   forwarder on a shared last hop LAN, then when R1 goes out of
   service, multicast forwarding for the entire LAN has to be rebuilt by
   the newly elected PIM DR.  However, if there were a way to allow
   multiple routers to forward to the LAN for different groups, failure
   of one of the routers would only lead to disruption to a subset of
   the flows, therefore improving the overall resilience of the network.

   The hash algorithm used in this document has limitations, but this
   draft allows different and more consistent hash algorithms to be
   defined in the future.

   In this document, we propose a modification to the PIM-SM protocol
   that allows more than one of these routers, called Group Designated
   Routers (GDRs), to be selected so that the forwarding load can be
   distributed among a number of routers.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   With respect to PIM, this document follows the terminology that has
   been defined in [RFC4601].

   This document also introduces the following new acronyms:

   o  GDR: GDR stands for "Group Designated Router".  For each multicast
      flow, either a (*,G) for ASM, or an (S,G) for SSM, a hash
      algorithm (described below) is used to select one of the routers
      as a GDR.  The GDR is responsible for initiating the forwarding
      tree building process for the corresponding multicast flow.

   o  GDR Candidate: a last hop router that has the potential to become
      a GDR.  A GDR Candidate must have the same DR priority and must
      run the same GDR election hash algorithm as the DR router.  It
      must send and process new PIM Hello Options as defined in this
      document.  There might be more than one GDR Candidate on a LAN,
      but only one can become GDR for a specific multicast flow.

3.  Applicability

   The proposed change described in this specification applies to PIM-SM
   last hop routers only.

   It does not alter the behavior of a PIM DR on the first hop network.
   This is because the source tree is built using the IP address of the
   sender, not the IP address of the PIM DR that sends the registers
   towards the RP.  The load balancing between first hop routers can be
   achieved naturally if an IGP provides equal cost multiple paths
   (which it usually does in practice).  Also, distributing the
   registering load does not justify the additional complexity required
   to support it.

4.  Functional Overview

   In the existing PIM DR election, when multiple last hop routers are
   connected to a multi-access LAN (for example, an Ethernet), one of
   them is selected to act as PIM DR.  The PIM DR is responsible for
   sending local Join/Prune messages towards the RP or source.  In order
   to elect the PIM DR, each PIM router on the LAN examines the received
   PIM Hello messages and compares its DR priority and IP address with
   those of its neighbors.  The router with the highest DR priority is
   the PIM DR.  If there are multiple such routers, their IP addresses
   are used as the tie-breaker, as described in [RFC4601].
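
   The election rule above can be sketched as follows.  This is only an
   illustration: the Neighbor type, the router names, addresses and
   priority values are hypothetical, and the sketch assumes the
   [RFC4601] convention that the numerically highest IP address wins the
   tie-break.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class Neighbor:
    """Hypothetical record of what a router learned from a PIM Hello."""
    name: str
    address: str
    dr_priority: int


def elect_dr(routers):
    # Highest DR priority wins; the highest IP address breaks ties
    # (assumed tie-break direction, per RFC 4601).
    return max(routers, key=lambda r: (r.dr_priority, ip_address(r.address)))


routers = [
    Neighbor("R1", "203.0.113.3", 10),
    Neighbor("R2", "203.0.113.2", 10),
    Neighbor("R3", "203.0.113.1", 10),
    Neighbor("R4", "203.0.113.4", 5),   # highest address, lower priority
]

print(elect_dr(routers).name)  # R1
```

   R4 has the highest address in this sketch but still loses, because
   the priority comparison is made before the address comparison.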

   In order to share forwarding load among last hop routers, besides the
   normal PIM DR election, the GDR is also elected on the last hop
   multi-access LAN.  There is only one PIM DR on the multi-access LAN,
   but there might be multiple GDR Candidates.

   For each multicast flow, that is, (*,G) for ASM and (S,G) for SSM, a
   hash algorithm is used to select one of the routers to be the GDR.  A
   new DR Load Balancing Capability (DRLBC) PIM Hello Option, which
   contains the hash algorithm type, is announced by routers on
   interfaces where this specification is enabled.  Last hop routers
   that advertise the new DRLBC Option in their Hellos, and that use the
   same GDR election hash algorithm and the same DR priority as the PIM
   DR, are considered GDR Candidates.

   Hash Masks are defined for Source, Group and RP separately, in order
   to handle PIM ASM/SSM.  The masks, as well as a sorted list of GDR
   Candidates' Addresses, are announced by the DR in a new DR Load
   Balancing GDR (DRLBGDR) PIM Hello Option.

   A hash algorithm based on the announced Source, Group, or RP masks
   allows one GDR to be assigned to a corresponding multicast state.
   And that GDR is responsible for initiating the creation of the
   multicast forwarding tree for multicast traffic.

4.1.  GDR Candidates

   The GDR is a new concept introduced by this specification.  GDR
   Candidates are routers eligible for GDR election on the LAN.  To
   become a GDR Candidate, a router MUST support this specification,
   have the same DR priority and run the same GDR election hash
   algorithm as the DR on the LAN.

   For example, assume there are 4 routers on the LAN: R1, R2, R3 and
   R4, which all support this specification.  R1, R2 and R3 have the
   same DR priority while R4's DR priority is less preferred.  In this
   example, R4 will not be eligible for GDR election, because R4 will
   not become a PIM DR unless all of R1, R2 and R3 go out of service.

   Furthermore, assume router R1 wins the PIM DR election, R1 and R2 run
   the same hash algorithm for GDR election, while R3 runs a different
   one.  In this case, only R1 and R2 will be eligible for GDR election,
   while R3 will not.

   As a DR, R1 will include its own Load Balancing Hash Masks and the
   identity of R1 and R2 (the GDR Candidates) in its DRLBGDR Hello
   Option.
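
   The eligibility rules in this section can be sketched as below.  The
   tuple layout, the addresses, the priority values and the hash
   algorithm type 1 are assumptions made for illustration only; this
   document defines only type 0 (Modulo).

```python
from ipaddress import ip_address

# Hypothetical view of the Hellos seen on the LAN:
# (name, address, DR priority, advertised hash algorithm type or None).
hellos = [
    ("R1", "203.0.113.4", 10, 0),   # the elected PIM DR
    ("R2", "203.0.113.3", 10, 0),
    ("R3", "203.0.113.2", 10, 1),   # runs a different hash algorithm
    ("R4", "203.0.113.1", 5, 0),    # less preferred DR priority
]


def gdr_candidates(dr, hellos):
    """Return the GDR Candidates, sorted by address from high to low."""
    _, _, dr_priority, dr_algorithm = dr
    eligible = [
        h for h in hellos
        if h[3] is not None           # advertises the DRLBC Option
        and h[2] == dr_priority       # same DR priority as the DR
        and h[3] == dr_algorithm      # same hash algorithm as the DR
    ]
    return sorted(eligible, key=lambda h: ip_address(h[1]), reverse=True)


candidates = gdr_candidates(hellos[0], hellos)
print([c[0] for c in candidates])  # ['R1', 'R2']
```

   As in the example above, R4 is excluded by its DR priority and R3 by
   its hash algorithm, leaving R1 and R2.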

4.2.  Hash Mask and Hash Algorithm

   A Hash Mask is used to extract a number of bits from the
   corresponding IP address field (32 for v4, 128 for v6) and calculate
   a hash value.  A hash value is used to select a GDR from the GDR
   Candidates advertised by the PIM DR.  For example, 0.0.255.0 defines
   a Hash Mask for an IPv4 address that masks out the first, the second,
   and the fourth octets, leaving only the third octet.

   There are three Hash Masks defined:

   o  RP Hash Mask

   o  Source Hash Mask

   o  Group Hash Mask

   The Hash Masks need to be configured on the PIM routers that can
   potentially become a PIM DR, unless the implementation provides
   default Hash Masks.  An implementation SHOULD provide masks with
   default values 255.255.255.255 (IPv4) and
   FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF (IPv6).

   The hash value used to determine the GDR is chosen as follows:

   o  If the group is ASM and the RP Hash Mask announced by the PIM DR
      is not 0, calculate the value of hashvalue_RP [Section 4.3] to
      determine GDR.

   o  If the group is ASM and the RP Hash Mask announced by the PIM DR
      is 0, obtain the value of hashvalue_Group [Section 4.3] to
      determine GDR.

   o  If the group is SSM, use hashvalue_SG [Section 4.3] to determine
      GDR.
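
   The three cases above amount to a small decision function.  The
   following is a minimal sketch; the function name and its parameters
   are hypothetical and are not part of this specification.

```python
from ipaddress import ip_address


def hash_value_name(is_ssm: bool, rp_hashmask: str) -> str:
    """Return which hash value from Section 4.3 applies to a flow."""
    if is_ssm:
        return "hashvalue_SG"
    if int(ip_address(rp_hashmask)) != 0:
        return "hashvalue_RP"
    return "hashvalue_Group"


print(hash_value_name(False, "0.0.255.0"))  # hashvalue_RP
print(hash_value_name(False, "0.0.0.0"))    # hashvalue_Group
print(hash_value_name(True, "0.0.255.0"))   # hashvalue_SG
```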

   A simple Modulo hash algorithm will be discussed in this document.
   However, to allow other hash algorithms to be used, a 4-byte "Hash
   Algorithm Type" field is included in the DRLBC Hello Option to
   specify the hash algorithm used by a last hop router.

   If different hash algorithm types are advertised among last hop
   routers, only last hop routers running the same hash algorithm as the
   DR (and having the same DR priority as the DR) are eligible for GDR
   election.

4.3.  Modulo Hash Algorithm

   The Modulo hash algorithm is described here in detail for
   hashvalue_RP.  The same algorithm is described in brief for
   hashvalue_Group, which uses the group address instead of the RP
   address for an ASM group with RP_hashmask==0, and for hashvalue_SG,
   which uses the source and group addresses of an (S,G) instead of the
   RP address.

   o  For ASM groups, with a non-zero RP Hash Mask, the hash value is
      calculated as:

         hashvalue_RP = (((RP_address & RP_hashmask) >> N) & 0xFFFF) % M

         RP_address is the address of the RP defined for the group.  N
         is the number of zeros, counted from the least significant bit
         of the RP_hashmask.  M is the number of GDR Candidates.

         For example, Router X with IPv4 address 203.0.113.1 receives a
         DRLBGDR Hello Option from the DR, which announces RP Hash Mask
         0.0.255.0 and a list of GDR Candidates, sorted by IP addresses
         from high to low: 203.0.113.3, 203.0.113.2 and 203.0.113.1.
         The ordinal number assigned to those addresses would be:

         0 for 203.0.113.3; 1 for 203.0.113.2; 2 for 203.0.113.1 (Router
         X)

         Assume there are 2 RPs: RP1 192.0.2.1 for Group1 and RP2
         198.51.100.2 for Group2.  Following the modulo hash algorithm:

         N is 8 for 0.0.255.0, and M is 3 for the total number of GDR
         Candidates.  The hashvalue_RP for RP1 192.0.2.1 is:

         (((192.0.2.1 & 0.0.255.0) >> 8) & 0xFFFF) % 3 = 2 % 3 = 2

         which matches the ordinal number assigned to Router X.  Router
         X will be the GDR for Group1, which uses 192.0.2.1 as the RP.

         The hashvalue_RP for RP2 198.51.100.2 is:

         (((198.51.100.2 & 0.0.255.0) >> 8) & 0xFFFF) % 3 = 100 % 3 = 1

         which is different from Router X's ordinal number (2); hence,
         Router X will not be the GDR for Group2.

   o  If RP_hashmask is 0, the hash value for an ASM group is calculated
      using the group Hash Mask:

         hashvalue_Group = (((Group_address & Group_hashmask) >> N) &
         0xFFFF) % M

         Compare hashvalue_Group with the ordinal number assigned to
         Router X to decide if Router X is the GDR.

   o  For SSM groups, a hash value is calculated using both the source
      and group Hash Mask:

         hashvalue_SG = ((((Source_address & Source_hashmask) >> N_S) &
         0xFFFF) ^ (((Group_address & Group_hashmask) >> N_G) & 0xFFFF))
         % M
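
   The formulas in this section can be checked with a short sketch.
   The function names are hypothetical.  The sketch assumes that N_S
   and N_G are derived from the Source and Group Hash Masks in the same
   way that N is derived from the RP Hash Mask, and that an all-zero
   mask yields a shift of 0; neither point is stated explicitly above.

```python
from ipaddress import ip_address


def trailing_zeros(mask: int) -> int:
    """N: number of zero bits counted from the least significant bit."""
    if mask == 0:
        return 0  # assumption: an all-zero mask needs no shift
    return (mask & -mask).bit_length() - 1


def masked_value(address: str, mask: str) -> int:
    """((address & mask) >> N) & 0xFFFF"""
    a, m = int(ip_address(address)), int(ip_address(mask))
    return ((a & m) >> trailing_zeros(m)) & 0xFFFF


def hashvalue_rp(rp, rp_mask, m):
    return masked_value(rp, rp_mask) % m


def hashvalue_group(group, group_mask, m):
    return masked_value(group, group_mask) % m


def hashvalue_sg(source, source_mask, group, group_mask, m):
    return (masked_value(source, source_mask)
            ^ masked_value(group, group_mask)) % m


# GDR Candidates as announced by the DR, sorted from high to low.
candidates = ["203.0.113.3", "203.0.113.2", "203.0.113.1"]
m = len(candidates)

print(hashvalue_rp("192.0.2.1", "0.0.255.0", m))     # 2
print(hashvalue_rp("198.51.100.2", "0.0.255.0", m))  # 1
```

   The two printed values reproduce the worked example: ordinal 2 is
   Router X (203.0.113.1), so Router X is the GDR for Group1 but not
   for Group2.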

4.4.  PIM Hello Options

   When a last hop PIM router sends a PIM Hello from an interface with
   this specification enabled, it includes a new option, called "DR Load
   Balancing Capability (DRLBC)".

   Besides this DRLBC Hello Option, the elected PIM DR also includes a
   new "DR Load Balancing GDR (DRLBGDR) Hello Option".  The DRLBGDR
   Hello Option consists of three Hash Masks as defined above and also
   the sorted list of all GDR Candidates' Addresses on the last hop LAN.

   The elected PIM DR uses the DRLBC Hello Options advertised by all
   routers on the last hop LAN to compose its DRLBGDR.  The GDR
   Candidates use the DRLBGDR Hello Option advertised by the PIM DR to
   calculate the hash value.

5.  Hello Option Formats

5.1.  PIM DR Load Balancing Capability (DRLBC) Hello Option

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           Type = TBD          |         Length = 4            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                     Hash Algorithm Type                       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


          Figure 3: Capability Hello Option

      Type: TBD.

      Length: 4 octets

      Hash Algorithm Type: 0 for Modulo hash algorithm

   This DRLBC Hello Option SHOULD be advertised by last hop routers from
   interfaces with this specification enabled.
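
   The layout in Figure 3 can be encoded as in the sketch below.  The
   option type is still TBD in this document, so DRLBC_TYPE is a
   made-up placeholder, and the function names are hypothetical.

```python
import struct

DRLBC_TYPE = 0xFFFF          # placeholder only; the real type is TBD
HASH_ALGORITHM_MODULO = 0    # the only algorithm type defined here


def encode_drlbc(hash_algorithm_type=HASH_ALGORITHM_MODULO) -> bytes:
    # 16-bit Type, 16-bit Length (= 4), 32-bit Hash Algorithm Type,
    # all in network byte order.
    return struct.pack("!HHI", DRLBC_TYPE, 4, hash_algorithm_type)


def decode_drlbc(data: bytes):
    opt_type, length, algorithm = struct.unpack("!HHI", data[:8])
    if length != 4:
        raise ValueError("DRLBC Length must be 4")
    return opt_type, algorithm


option = encode_drlbc()
print(option.hex())          # ffff000400000000
print(decode_drlbc(option))  # (65535, 0)
```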

5.2.  PIM DR Load Balancing GDR (DRLBGDR) Hello Option

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           Type = TBD          |         Length                |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                            Group Mask                         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                            Source Mask                        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                            RP Mask                            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                    GDR Candidate Address(es)                  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


          Figure 4: GDR Hello Option


      Type: TBD

      Length: 3 x (4 bytes or 16 bytes) + n x (4 bytes or 16 bytes),
      where n is the number of GDR Candidates.

      Group Mask (32/128 bits): Mask

      Source Mask (32/128 bits): Mask

      RP Mask (32/128 bits): Mask



         All masks MUST be in the same address family as the Hello IP
         header.

      GDR Candidate Address (32/128 bits): Address(es) of GDR
      Candidate(s)

         All addresses must be in the same address family as the Hello
         IP header.  The addresses are sorted in descending order.  The
         position in this order gives the ordinal number associated
         with each GDR Candidate in the hash value calculation.  For
         example, if the addresses advertised are R3, R2, R1, the
         ordinal number assigned to R3 is 0, to R2 is 1 and to R1 is 2.

         If the "Interface ID" option, as described in [RFC6395], is
         present in a GDR Candidate's PIM Hello message, and the "Router
         ID" portion is non-zero:

         +  For IPv4, the "GDR Candidate Address" will be set directly
            to the "Router ID".

         +  For IPv6, the "GDR Candidate Address" will be set to the
            IPv4-IPv6 translated address of the "Router ID", as
            described in [RFC4291], that is, the "Router ID" is appended
            to a prefix of 96 zero bits.

         If the "Interface ID" option is not present in a GDR
         Candidate's PIM Hello message, or if the "Interface ID" option
         is present but the "Router ID" field is zero, the "GDR
         Candidate Address" will be the IPv4 or IPv6 source address from
         PIM Hello message.

         This DRLBGDR Hello Option MUST only be advertised by the
         elected PIM DR.
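
   The address selection rules and the layout in Figure 4 can be
   sketched together.  DRLBGDR_TYPE is a made-up placeholder because the
   type is TBD, and the function names and example addresses are
   hypothetical.

```python
import struct
from ipaddress import IPv4Address, IPv6Address, ip_address

DRLBGDR_TYPE = 0xFFFE  # placeholder only; the real type is TBD


def candidate_address(hello_source: str, router_id: int = 0):
    """Pick the GDR Candidate Address for one neighbor."""
    source = ip_address(hello_source)
    if router_id == 0:
        return source                    # no usable Router ID
    if source.version == 4:
        return IPv4Address(router_id)    # Router ID used directly
    return IPv6Address(router_id)        # 96 zero bits + Router ID


def encode_drlbgdr(group_mask, source_mask, rp_mask, candidates) -> bytes:
    masks = [ip_address(m) for m in (group_mask, source_mask, rp_mask)]
    addrs = sorted((ip_address(c) for c in candidates), reverse=True)
    if len({a.version for a in masks + addrs}) != 1:
        raise ValueError("masks and addresses must share one family")
    body = b"".join(a.packed for a in masks + addrs)
    return struct.pack("!HH", DRLBGDR_TYPE, len(body)) + body


option = encode_drlbgdr(
    "255.255.255.255", "255.255.255.255", "0.0.255.0",
    ["203.0.113.1", "203.0.113.3", "203.0.113.2"],
)
print(len(option))                                # 28
print(candidate_address("fe80::1", 0xC0000201))   # ::c000:201
```

   With three IPv4 masks and three IPv4 candidates the Length field is
   3 x 4 + 3 x 4 = 24, which gives 28 bytes including the 4-byte
   Type/Length header.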

6.  Protocol Specification

6.1.  PIM DR Operation

   The DR election process is still the same as defined in [RFC4601].  A
   DR that has this specification enabled on the interface advertises
   the new DRLBGDR Hello Option, which contains the mask values from
   user configuration, followed by a sorted list of all GDR Candidates'
   Addresses, from the highest value to the lowest value.  Moreover,
   like non-DR routers, the DR also advertises the DRLBC Hello Option to
   indicate its capability of supporting this specification and the type
   of its GDR election hash algorithm.

   If a PIM DR receives a PIM Hello with a DRLBGDR Option, the PIM DR
   SHOULD ignore the TLV.

   If a PIM DR receives a DRLBC Hello Option from a neighbor, which
   contains the same hash algorithm type as the DR, and the neighbor has
   the same DR priority as the DR, the PIM DR SHOULD consider the
   neighbor as a GDR Candidate and insert the GDR Candidate's Address
   into the sorted list of the DRLBGDR Option.

6.2.  PIM GDR Candidate Operation

   When an IGMP/MLD join is received, without this specification, only
   the PIM DR will handle the join and potentially run into the issues
   described earlier.  Using this specification, a hash algorithm is
   used on the GDR Candidates to determine which router is going to be
   responsible for building forwarding trees on behalf of the host.

   If a router supports this specification, it MUST advertise the DRLBC
   Hello Option in its PIM Hello on each of the interfaces where the
   multicast protocol is enabled.  However, advertising the DRLBC Option
   in the PIM Hello does not guarantee that this router will be
   considered a GDR Candidate.  For example, this router may have a
   lower DR priority configured on the shared LAN compared to other PIM
   routers.  Once the DR election is done, a DRLBGDR Hello Option will
   be received from the current PIM DR on the link, containing the list
   of GDR Candidates.

   A GDR Candidate may receive a DRLBGDR Hello Option from PIM DR with
   different Hash Masks from those configured on it.  The GDR Candidate
   must use the Hash Masks advertised by the PIM DR to calculate the
   hash value.

   A GDR Candidate may receive a DRLBGDR Hello Option from a PIM router
   which is not DR.  The GDR Candidate MUST ignore such DRLBGDR Hello
   Option.

   A GDR Candidate may receive a Hello from the elected PIM DR, and the
   PIM DR does not support this specification.  The GDR election
   described by this specification will not take place, that is, only
   the PIM DR joins the multicast tree.

   A router only acts as GDR if it is included in the GDR list of the
   DRLBGDR Hello Option.

6.2.1.  Router Receives New DRLBGDR

   When a router receives a new DRLBGDR from the current PIM DR, it
   needs to process it and check whether the router is in the list of
   GDR Candidates.

   1.  If the router is not listed as a GDR Candidate in the DRLBGDR, no
       action is needed.

   2.  If the router is listed as a GDR Candidate in the DRLBGDR, then
       it needs to process each of the groups in the IGMP/MLD reports.
       The masks are announced in the PIM Hello by the DR in the DRLBGDR
       Hello Option.  For each of the groups in the reports, the router
       needs to run the hash algorithm (described in Section 4.3) based
       on the announced Source, Group or RP masks to determine whether
       it is the GDR for the specified group.  If the hash result
       selects it as the GDR for the multicast flow, it builds the
       multicast forwarding tree.  If it is not the GDR for the
       multicast flow, no action is needed.
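
   The two steps above can be sketched as a single function that
   returns the flows for which the local router should build a
   forwarding tree.  The flow dictionaries, the group addresses and the
   function names are hypothetical, and the sketch assumes the Modulo
   hash algorithm of Section 4.3.

```python
from ipaddress import ip_address


def _masked(address, mask):
    a, m = int(ip_address(address)), int(ip_address(mask))
    n = (m & -m).bit_length() - 1 if m else 0  # trailing zero bits
    return ((a & m) >> n) & 0xFFFF


def flow_hash(flow, masks, m):
    """flow has 'group' plus 'source' (SSM) or 'rp' (ASM)."""
    if flow.get("source"):                       # (S,G) for SSM
        return (_masked(flow["source"], masks["source"])
                ^ _masked(flow["group"], masks["group"])) % m
    if int(ip_address(masks["rp"])) != 0:        # ASM, RP mask non-zero
        return _masked(flow["rp"], masks["rp"]) % m
    return _masked(flow["group"], masks["group"]) % m


def flows_to_build(my_address, gdr_list, masks, flows):
    if my_address not in gdr_list:
        return []                                # step 1: no action
    ordinal = gdr_list.index(my_address)         # list is sorted by the DR
    return [f for f in flows
            if flow_hash(f, masks, len(gdr_list)) == ordinal]


gdr_list = ["203.0.113.3", "203.0.113.2", "203.0.113.1"]
masks = {"group": "255.255.255.255", "source": "255.255.255.255",
         "rp": "0.0.255.0"}
flows = [
    {"name": "Group1", "group": "233.252.0.1", "rp": "192.0.2.1"},
    {"name": "Group2", "group": "233.252.0.2", "rp": "198.51.100.2"},
]

mine = flows_to_build("203.0.113.1", gdr_list, masks, flows)
print([f["name"] for f in mine])  # ['Group1']
```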

6.2.2.  Router Receives Updated DRLBGDR

   If a router (GDR or non-GDR) receives an unchanged DRLBGDR from the
   current PIM DR, no action is needed.

   If a router (GDR or non-GDR) receives a new or modified DRLBGDR from
   the current PIM DR, it requires processing as described below:

   1.  If it was a GDR and is still included in the current GDR list, it
       needs to process each of the groups and run the hash algorithm to
       check if it is still the GDR for the given group.

          If it was the GDR for group G and the new hash result still
          chooses it as the GDR, then no processing is required.

          If it was the GDR for a group earlier and now it is no longer
          the GDR, then it sets its assert metric for the multicast flow
          to be (PIM_ASSERT_INFINITY - 1), as explained in Section 6.3.

          If it was not the GDR for a group earlier and the new hash
          still does not make it the GDR, then no processing is required
          for that multicast group.

          If it was not the GDR for a group earlier and now becomes the
          GDR, it starts building the multicast forwarding tree for this
          flow.

   2.  If it was not a GDR and the updated DRLBGDR from the current PIM
       DR contains this router as one of the GDR Candidates, then this
       router, being a new GDR Candidate, MUST run the hash algorithm
       for each of the groups (multicast flows).  For a given group:

          If it is not the GDR, no processing is required.

          If it is hashed as the GDR, it needs to build the multicast
          forwarding tree.
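
   The per-flow cases above reduce to comparing the set of flows a
   router was GDR for before the update with the set it is GDR for
   afterwards.  The following is a minimal sketch; the function name,
   the result keys and the flow labels are hypothetical.

```python
def diff_gdr_roles(old_flows, new_flows):
    """Classify flows after a new or modified DRLBGDR is processed."""
    old, new = set(old_flows), set(new_flows)
    return {
        # Was not the GDR, now is: start building the forwarding tree.
        "build_tree": sorted(new - old),
        # Was the GDR, no longer is: use (PIM_ASSERT_INFINITY - 1).
        "lower_assert_metric": sorted(old - new),
        # Was and still is the GDR: no processing required.
        "unchanged": sorted(old & new),
    }


actions = diff_gdr_roles(old_flows=["G1", "G3"], new_flows=["G3", "G4"])
print(actions["build_tree"])           # ['G4']
print(actions["lower_assert_metric"])  # ['G1']
print(actions["unchanged"])            # ['G3']
```

   A router that was not previously a GDR Candidate is simply the case
   where old_flows is empty, so every flow it now hashes to ends up in
   "build_tree".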

6.3.  PIM Assert Modification

   It is possible that the identity of the GDR might change in the
   middle of an active flow.  Examples of when this could happen include:

      When a new PIM router comes up

      When a GDR restarts

   When the GDR changes, existing traffic might be disrupted.
   Duplicates or packet losses might be observed.  To illustrate the
   case, consider the following scenario where there are two streams G1
   and G2.  R1 is the GDR for G1, and R2 is the GDR for G2.  When R3
   comes up online, it is possible that R3 becomes GDR for both G1 and
   G2, hence R3 starts to build the forwarding tree for G1 and G2.  If
   R1 and R2 stop forwarding before R3 completes the process, packet
   loss might occur.  On the other hand, if R1 and R2 continue
   forwarding while R3 is building the forwarding trees, duplicates
   might occur.

   This is not a typical deployment scenario but might still happen.
   Here we describe a mechanism to minimize the impact.  We essentially
   want to minimize packet loss.  Therefore, we would allow a small
   amount of duplicates and depend on PIM Assert to minimize the
   duplication.

   When the role of GDR changes as above, instead of immediately
   stopping forwarding, R1 and R2 continue forwarding to G1 and G2
   respectively, while, at the same time, R3 builds forwarding trees for
   G1 and G2.  This will lead to PIM Asserts.

   With the introduction of GDR, the following modification to the
   Assert packet MUST be done: if a router enables this specification on
   its downstream interface, but it is no longer the GDR (it was the GDR
   before the network event), it adjusts its Assert metric to
   (PIM_ASSERT_INFINITY - 1).

   Using the above example, for G1, assume R1 and R3 agree on the new
   GDR, which is R3.  R1 will set its Assert metric to
   (PIM_ASSERT_INFINITY - 1).  That will make R3, which sends its normal
   metric in its Assert, the Assert winner.
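
   The G1 case can be sketched as follows.  This is an illustration
   under stated assumptions, not a definitive implementation:
   PIM_ASSERT_INFINITY is assumed to be a 32-bit all-ones value, only
   the route metric field is adjusted, the RPT bit is ignored, and the
   class name, function names, addresses, and metric values are
   hypothetical.  The comparison follows the usual PIM-SM order: lower
   metric preference, then lower route metric, then higher address.

```python
import ipaddress
from dataclasses import dataclass

# Assumed value for illustration: a 32-bit all-ones metric.
PIM_ASSERT_INFINITY = 0xFFFFFFFF


@dataclass(frozen=True)
class AssertInfo:
    sender: str
    metric_preference: int
    route_metric: int
    address: str


def advertised_metric(route_metric: int, drlb_enabled: bool,
                      was_gdr: bool, is_gdr: bool) -> int:
    """Route metric to place in an Assert under the modified rule."""
    if drlb_enabled and was_gdr and not is_gdr:
        # Former GDR: deliberately lose the Assert to the new GDR.
        return PIM_ASSERT_INFINITY - 1
    return route_metric


def assert_winner(a: AssertInfo, b: AssertInfo) -> AssertInfo:
    """Lower preference, then lower metric, then higher address wins."""
    def key(x: AssertInfo):
        return (x.metric_preference, x.route_metric,
                -int(ipaddress.ip_address(x.address)))
    return a if key(a) < key(b) else b


# G1: R1 was the GDR and has the better route metric; R3 is the new GDR.
r1 = AssertInfo("R1", 110, advertised_metric(10, True, True, False),
                "192.0.2.1")
r3 = AssertInfo("R3", 110, advertised_metric(20, True, False, True),
                "192.0.2.3")
winner = assert_winner(r1, r3)  # R3, since R1 advertises INFINITY - 1
```

   Without the adjustment, R1 would win this Assert on its lower route
   metric and the new GDR would never take over forwarding for G1.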

   For G2, assume it takes slightly longer for R2 to find out that R3 is
   the new GDR, so R2 still considers itself the GDR while R3 has
   already assumed the role.  Since both R2 and R3 think they are the
   GDR, they further compare the metric and IP address.  If R3 has the
   better routing metric, or the same metric but a better tie-breaker,
   the Assert result will be consistent with the GDR selection.  If,
   unfortunately, R2 has the better metric, or the same metric but a
   better tie-breaker, R2 will become the Assert winner and continue to
   forward traffic.  This will continue until:

      The next PIM Hello option from the DR selects R3 as the GDR.  R3
      will then build the forwarding tree and send an Assert.

   The process continues until R2 agrees to the selection of R3 as the
   GDR and sets its own Assert metric to (PIM_ASSERT_INFINITY - 1),
   which will make R3 the Assert winner.  During the process,
   intermittent duplication of traffic will be seen, but packet loss
   will be minimized.  In the unlikely case that R2 never relinquishes
   its role as GDR (while every other router thinks otherwise), the
   proposed mechanism also helps to keep the duplication to a minimum
   until manual intervention takes place to remedy the situation.
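
   The convergence for G2 can be pictured with a small simulation.  It
   is only a sketch: the notion of discrete "rounds", the function
   name, the metric values, and the assumed value of
   PIM_ASSERT_INFINITY are hypothetical, equal metric preference is
   assumed, and R3 is assumed to hold the better tie-breaker.

```python
PIM_ASSERT_INFINITY = 0xFFFFFFFF  # assumed value, for illustration only


def g2_winners(r2_metric: int, r3_metric: int,
               r2_agrees_at: int, rounds: int) -> list:
    """Return the Assert winner for G2 in each round.

    r2_agrees_at is the (hypothetical) round in which R2 accepts R3 as
    the GDR and starts advertising PIM_ASSERT_INFINITY - 1.
    """
    winners = []
    for rnd in range(rounds):
        r2_advertised = (PIM_ASSERT_INFINITY - 1
                         if rnd >= r2_agrees_at else r2_metric)
        # Lower metric wins; on a tie, R3 is assumed to win.
        winners.append("R2" if r2_advertised < r3_metric else "R3")
    return winners


# R2 has the better metric and only agrees with the selection in round 2.
timeline = g2_winners(r2_metric=10, r3_metric=20, r2_agrees_at=2, rounds=4)
```

   In this run, R2 keeps winning the Assert until it agrees with the
   selection, after which R3 wins every round.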

7.  Compatibility

   In the case of a hybrid Ethernet shared LAN (where some PIM routers
   enable the specification defined in this draft and some do not):

   o  If a router which does not support the specification defined in
      this draft becomes the DR on the link, it MUST be the only DR on
      the link, as in [RFC4601], and no router acts as a GDR.

   o  If a router which does not support the specification defined in
      this draft becomes a non-DR on the link, it should act as a
      non-DR as defined in [RFC4601].

8.  Manageability Considerations

   o  All routers on a LAN that support this specification MUST use an
      identical Hash Algorithm Type (described in Section 5.1).  If
      different Hash Algorithm Types are in use on the LAN, the routers
      MUST fall back to the DR election method defined in PIM-SM
      [RFC4601].  Migration between different algorithm types is out of
      the scope of this document.
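
   A consistency check along these lines might look as follows.  This
   is a sketch only: the function name and the integer type values used
   in the usage lines are hypothetical, and how the advertised types
   are collected from neighbors is left out.

```python
from typing import Iterable, Optional


def common_hash_algorithm_type(advertised: Iterable[int]) -> Optional[int]:
    """Return the single Hash Algorithm Type in use, or None.

    None means the routers disagree (or none advertise a type), so the
    RFC 4601 DR election method is used instead of GDR load balancing.
    """
    types = set(advertised)
    if len(types) == 1:
        return next(iter(types))
    return None


agreed = common_hash_algorithm_type([0, 0, 0])   # all routers agree
mixed = common_hash_algorithm_type([0, 1, 0])    # hybrid: fall back
```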






9.  IANA Considerations

   IANA has temporarily assigned type 34 for the PIM DR Load Balancing
   Capability (DRLBC) Hello Option, and type 35 for the PIM DR Load
   Balancing GDR (DRLBGDR) Hello Option.  IANA is requested to make
   these assignments permanent when this document is published as an
   RFC.  The string TBD should be replaced by the assigned values
   accordingly.
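
   As an illustration of how a receiver might pick these two options
   out of a Hello message, the sketch below walks the generic PIM Hello
   option encoding (16-bit type, 16-bit length, then the value).  The
   function and constant names are hypothetical, the option values are
   treated as opaque bytes because their layout is defined elsewhere in
   this document, and the type codes are the temporary assignments
   stated above.

```python
import struct
from typing import Dict, Iterator, Tuple

DRLBC_OPTION_TYPE = 34    # temporary assignment
DRLBGDR_OPTION_TYPE = 35  # temporary assignment


def iter_hello_options(payload: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (option type, option value) pairs from Hello option bytes."""
    offset = 0
    while offset + 4 <= len(payload):
        opt_type, opt_len = struct.unpack_from("!HH", payload, offset)
        offset += 4
        value = payload[offset:offset + opt_len]
        if len(value) != opt_len:
            raise ValueError("truncated Hello option")
        yield opt_type, value
        offset += opt_len
    # Fewer than 4 trailing bytes are ignored in this sketch.


def drlb_options(payload: bytes) -> Dict[str, bytes]:
    """Return the raw values of the DRLBC and DRLBGDR options, if any."""
    names = {DRLBC_OPTION_TYPE: "DRLBC", DRLBGDR_OPTION_TYPE: "DRLBGDR"}
    return {names[t]: v for t, v in iter_hello_options(payload)
            if t in names}


# A Holdtime option (type 1) followed by a placeholder DRLBC value.
payload = struct.pack("!HHH", 1, 2, 105) + struct.pack("!HH", 34, 4) + bytes(4)
found = drlb_options(payload)
```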

10.  Security Considerations

   Security of the new DR Load Balancing PIM Hello Options is only
   guaranteed by the security of the PIM Hello message, so the security
   considerations for PIM Hello messages as described in PIM-SM
   [RFC4601] apply here.

11.  Acknowledgement

   The authors would like to thank Steve Simlo and Taki Millonis for
   helping with the original idea; Bill Atwood and Bharat Joshi for
   review comments; and Toerless Eckert and Rishabh Parekh for helpful
   conversations on the document.

   Special thanks to Anish Kachinthaya, Anvitha Kachinthaya and Jake
   Holland for reviewing the document and providing comments.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
              Architecture", RFC 4291, DOI 10.17487/RFC4291, February
              2006, <https://www.rfc-editor.org/info/rfc4291>.

   [RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
              "Protocol Independent Multicast - Sparse Mode (PIM-SM):
              Protocol Specification (Revised)", RFC 4601,
              DOI 10.17487/RFC4601, August 2006,
              <https://www.rfc-editor.org/info/rfc4601>.

   [RFC6395]  Gulrajani, S. and S. Venaas, "An Interface Identifier (ID)
              Hello Option for PIM", RFC 6395, DOI 10.17487/RFC6395,
              October 2011, <https://www.rfc-editor.org/info/rfc6395>.

12.2.  Informative References

   [HELLO-OPT]
              IANA, "PIM Hello Options", IANA PIM-HELLO-OPTIONS, March
              2007.

Authors' Addresses

   Yiqun Cai
   Alibaba Group

   Email: yiqun.cai@alibaba-inc.com


   Heidi Ou
   Alibaba Group


   Sri Vallepalli
   Cisco Systems
   3625 Cisco Way,
   San Jose, CALIFORNIA 95134
   UNITED STATES

   Email: svallepa@cisco.com


   Mankamana Mishra
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: mankamis@cisco.com


   Stig Venaas
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: stig@cisco.com


   Andy Green
   British Telecom
   Adastral Park
   Ipswich  IP5 2RE
   United Kingdom

   Email: andy.da.green@bt.com