Network Working Group                                      Daniel Walton
Internet Draft                                                David Cook
Expiration Date: November 2002                             Alvaro Retana
File name: draft-walton-bgp-add-paths-00.txt                John Scudder
                                                           Cisco Systems
                                                                May 2002

                 Advertisement of Multiple Paths in BGP
                   draft-walton-bgp-add-paths-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet Drafts are working documents of the Internet Engineering
   Task Force (IETF), its Areas, and its Working Groups. Note that other
   groups may also distribute working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months. Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time. It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a "working
   draft" or "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   The BGP specification [1,2] defines an "Update-Send Process" to
   advertise the routes chosen by the Decision Process to other BGP
   speakers.  No provisions are made to facilitate the advertisement of
   multiple paths to the same destination.  In fact, a route with the
   same NLRI as a previously advertised route implicitly replaces the
   original advertisement.

   This document proposes a mechanism that will allow the advertisement
   of multiple paths for the same prefix without the new paths
   implicitly replacing any previous ones.  The essence of the mechanism
   is that each path is identified by an arbitrary identifier in
   addition to its prefix.






Walton, et al                                                   [Page 1]


INTERNET DRAFT           Multiple Paths in BGP                  May 2002


1. Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [3].


2. Advertisement of Multiple Paths in BGP

   This section describes an extension to the attributes developed for
   multiprotocol transport [5] that allows the advertisement of multiple
   paths in BGP.


2.1. Capability Advertisement

   This specification defines the capability [4] ADD_PATH.  This
   capability MUST NOT be advertised unless multiprotocol support [5] is
   also advertised.  The ADD_PATH capability has code TBD.  Its length
   is zero; it carries no data.

   Capability code 4 defined in [6] MUST NOT be advertised if ADD_PATH
   is advertised (see also the section below entitled 'Modifications to
   "Carrying Label Information in BGP-4"').
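
   The two rules above can be sketched informally in Python.  The sketch
   is illustrative only and not part of this specification.  The names
   Cap, check_capabilities and add_path_in_effect are hypothetical, and
   capabilities are modelled symbolically because the ADD_PATH code is
   still TBD.  The second function reflects the condition, stated in
   Section 2.2, that both speakers advertise both capabilities.

```python
from enum import Enum, auto


class Cap(Enum):
    """Symbolic capability names; no numeric codes are assumed here."""
    MULTIPROTOCOL = auto()      # multiprotocol support [5]
    MULTIPLE_ROUTES = auto()    # capability code 4 defined in [6]
    ADD_PATH = auto()           # this specification (code TBD)


def check_capabilities(advertised):
    """Return a list of rule violations for the capabilities to advertise."""
    problems = []
    if Cap.ADD_PATH in advertised:
        if Cap.MULTIPROTOCOL not in advertised:
            problems.append("ADD_PATH requires multiprotocol support")
        if Cap.MULTIPLE_ROUTES in advertised:
            problems.append("capability code 4 of [6] conflicts with ADD_PATH")
    return problems


def add_path_in_effect(local, remote):
    """True only if both speakers advertised ADD_PATH and multiprotocol."""
    needed = {Cap.ADD_PATH, Cap.MULTIPROTOCOL}
    return needed <= set(local) and needed <= set(remote)
```

   An empty list from check_capabilities means that the set of
   capabilities does not violate either rule.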


2.2. NLRI Encoding

   If two BGP speakers advertise the ADD_PATH and multiprotocol
   capabilities to each other, the NLRI encoding is modified to add two
   new fields at the beginning of the NLRI -- a "bestpath" flag
   indicating if the NLRI has been selected for installation in the
   advertiser's FIB, and an identifier to distinguish the NLRI from
   other NLRI with the same prefix but different path attributes and/or
   nexthop.

   In many BGP operations the prefix serves as the key that identifies
   a route.  For example, when a route is withdrawn using the
   procedures of [1,2], only the prefix needs to be specified in order
   to withdraw the entire route.  For such purposes, the identifier
   field introduced by this specification is treated as part of the
   key.

   The following subsections specify the necessary modifications to
   existing encodings.  We recommend that future documents which specify
   NLRI encodings for BGP include an encoding (possibly the sole
   encoding) compatible with this specification.


2.2.1. Modifications to "Multiprotocol Extensions for BGP-4"

   "Multiprotocol Extensions for BGP-4" [5], section 7 is replaced by
   the following:

   The Network Layer Reachability information is encoded as one or more
   4-tuples of the form <bestpath, identifier, length, prefix>, whose
   fields are described below:

                       +---------------------------+
                       |   Bestpath (1 bit)        |
                       +---------------------------+
                       |   Identifier (15 bits)    |
                       +---------------------------+
                       |   Length (1 octet)        |
                       +---------------------------+
                       |   Prefix (variable)       |
                       +---------------------------+

   The use and the meaning of these fields are as follows:

      a) Bestpath:

         If set to one, the bestpath bit indicates that the path
         associated with the NLRI has been selected by the BGP speaker
         for installation into its FIB.  If set to zero, the path has
         not been selected.  The bestpath bit MUST NOT be used for
         identifying the path.  In other words, it does not form part of
         the key used to identify the path.

         If a route which was advertised with the bestpath bit set to
         one is removed from the advertiser's FIB, the route MUST be
         re-advertised with the bestpath bit set to zero, or withdrawn.
         Likewise, if a route which was advertised with the bestpath bit
         set to zero is selected for installation in the advertiser's
         FIB, the route MUST be re-advertised with the bestpath bit set
         to one, or withdrawn.

      b) Identifier:

         The Identifier field allows the address prefix and its
         associated path attributes ("path") to be distinguished from
         other paths for the same prefix.  The selection of identifier
         values is a local implementation decision.

      c) Length:

         The Length field indicates the length in bits of the address
         prefix.  A length of zero indicates a prefix that matches all
         addresses of the given address family; the prefix itself then
         occupies zero octets.

      d) Prefix:

         The Prefix field contains an address prefix followed by enough
         trailing bits to make the end of the field fall on an octet
         boundary.  Note that the value of trailing bits is irrelevant.
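
   As an informal illustration of the encoding above, the following
   Python sketch packs and unpacks NLRI tuples.  It is not part of this
   specification.  It assumes that the bestpath bit is the most
   significant bit of the first two octets and that the identifier
   occupies the remaining 15 bits in network byte order.  The function
   names encode_nlri and decode_nlri are hypothetical.

```python
import struct


def encode_nlri(bestpath, identifier, prefix_len, prefix):
    """Pack one <bestpath, identifier, length, prefix> tuple into bytes."""
    if not 0 <= identifier < 1 << 15:
        raise ValueError("identifier must fit in 15 bits")
    if not 0 <= prefix_len <= 255:
        raise ValueError("length must fit in one octet")
    n_octets = (prefix_len + 7) // 8          # round up to whole octets
    if len(prefix) < n_octets:
        raise ValueError("prefix shorter than its stated length")
    # Assumption: bestpath is the high-order bit of the 16-bit field.
    head = (0x8000 if bestpath else 0) | identifier
    return struct.pack("!HB", head, prefix_len) + bytes(prefix[:n_octets])


def decode_nlri(data):
    """Unpack every tuple in an NLRI field and return them as a list."""
    out, pos = [], 0
    while pos < len(data):
        if len(data) - pos < 3:
            raise ValueError("truncated NLRI header")
        head, prefix_len = struct.unpack_from("!HB", data, pos)
        pos += 3
        n_octets = (prefix_len + 7) // 8
        if len(data) - pos < n_octets:
            raise ValueError("truncated prefix")
        prefix = data[pos:pos + n_octets]
        pos += n_octets
        out.append((bool(head & 0x8000), head & 0x7FFF, prefix_len, prefix))
    return out
```

   Under these assumptions, encode_nlri(True, 7, 24, bytes([192, 0, 2]))
   yields the six octets 80 07 18 c0 00 02 (hexadecimal).  Trailing
   prefix bits are passed through unchanged, since their value is
   irrelevant.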


2.2.2. Modifications to "Carrying Label Information in BGP-4"

   "Carrying Label Information in BGP-4" [6] is modified as follows.
   Section 4 ("Advertising Multiple Routes to a Destination") is
   deleted, as the procedures of this specification allow multiple
   routes to be advertised, so no other procedures are required.  For
   the same reason, the final paragraph of Section 5 (which specifies
   capability code 4) is deleted.  Section 3 is replaced by the
   following:

   Label mapping information is carried as part of the Network Layer
   Reachability Information (NLRI) in the Multiprotocol Extensions
   attributes.  The AFI indicates, as usual, the address family of the
   associated route.  The fact that the NLRI contains a label is
   indicated by using SAFI value 4.

   The Network Layer Reachability information is encoded as one or more
   5-tuples of the form <bestpath, identifier, length, label, prefix>,
   whose fields are described below:

                       +---------------------------+
                       |   Bestpath (1 bit)        |
                       +---------------------------+
                       |   Identifier (15 bits)    |
                       +---------------------------+
                       |   Length (1 octet)        |
                       +---------------------------+
                       |   Label (3 octets)        |
                       +---------------------------+
                       |   Prefix (variable)       |
                       +---------------------------+

   The use and the meaning of these fields are as follows:

      a) Bestpath:

         If set to one, the bestpath bit indicates that the path
         associated with the NLRI has been selected by the BGP speaker
         for installation into its FIB.  If set to zero, the path has
         not been selected.  The bestpath bit MUST NOT be used for
         identifying the path.  In other words, it does not form part of
         the key used to identify the path.

         If a route which was advertised with the bestpath bit set to
         one is removed from the advertiser's FIB, the route MUST be
         re-advertised with the bestpath bit set to zero, or withdrawn.
         Likewise, if a route which was advertised with the bestpath bit
         set to zero is selected for installation in the advertiser's
         FIB, the route MUST be re-advertised with the bestpath bit set
         to one, or withdrawn.

      b) Identifier:

         The Identifier field allows the address prefix and its
         associated path attributes ("path") to be distinguished from
         other paths for the same prefix.  The selection of identifier
         values is a local implementation decision.

      c) Length:

         The Length field indicates the length in bits of the address
         prefix plus the label(s).

      d) Label:

         The Label field carries one or more labels (corresponding to
         the stack of labels [7]).  Each label is encoded as 3 octets,
         where the high-order 20 bits contain the label value, and the
         low-order bit contains "Bottom of Stack" (as defined in [7]).

      e) Prefix:

         The Prefix field contains an address prefix followed by enough
         trailing bits to make the end of the field fall on an octet
         boundary.  Note that the value of trailing bits is irrelevant.
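
   The labeled encoding can be sketched informally in Python as follows.
   The sketch is not part of this specification.  It assumes that the
   bestpath bit is the most significant bit of the first two octets,
   that the three bits between the label value and the "Bottom of
   Stack" bit are sent as zero, and that only the last label in the
   stack has "Bottom of Stack" set.  The function names encode_label
   and encode_labeled_nlri are hypothetical.

```python
import struct


def encode_label(value, bottom_of_stack):
    """Encode one label as 3 octets: 20-bit value on top, S bit lowest."""
    if not 0 <= value < 1 << 20:
        raise ValueError("label value must fit in 20 bits")
    # Assumption: the three bits between value and S bit are left zero.
    word = (value << 4) | (1 if bottom_of_stack else 0)
    return word.to_bytes(3, "big")


def encode_labeled_nlri(bestpath, identifier, labels, prefix_len, prefix):
    """Pack <bestpath, identifier, length, label, prefix> into bytes."""
    if not labels:
        raise ValueError("at least one label is required")
    if not 0 <= identifier < 1 << 15:
        raise ValueError("identifier must fit in 15 bits")
    # Length counts the bits of the label(s) plus the bits of the prefix.
    total_bits = 24 * len(labels) + prefix_len
    if prefix_len < 0 or total_bits > 255:
        raise ValueError("length must fit in one octet")
    n_octets = (prefix_len + 7) // 8
    if len(prefix) < n_octets:
        raise ValueError("prefix shorter than its stated length")
    last = len(labels) - 1
    stack = b"".join(encode_label(v, i == last) for i, v in enumerate(labels))
    head = (0x8000 if bestpath else 0) | identifier
    return struct.pack("!HB", head, total_bits) + stack + bytes(prefix[:n_octets])
```

   For a single label with value 16 and a 24-bit prefix, the Length
   octet in this sketch is 48 (24 label bits plus 24 prefix bits).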

   The label(s) specified for a particular route (and associated with
   its address prefix) must be assigned by the LSR which is identified
   by the value of the Next Hop attribute of the route.

   When a BGP speaker redistributes a route, the label(s) assigned to
   that route must not be changed (except by omission), unless the
   speaker changes the value of the Next Hop attribute of the route.

   A BGP speaker can withdraw a previously advertised route (as well as
   the binding between this route and a label) by either (a) advertising
   a new route (and, optionally, a label) with the same NLRI as the
   previously advertised route (keeping in mind that the identifier
   comprises part of the NLRI for this purpose), or (b) listing the NLRI
   (again keeping in mind the inclusion of the identifier as part of the
   NLRI for this purpose) of the previously advertised route in the
   Withdrawn Routes field of an Update message.  In the latter case, no
   label information need be included.


2.3. Operation

   Using the identifier specified in the previous subsection, the same
   prefix can be advertised multiple times without subsequent
   advertisements replacing previous ones.  Apart from the fact that
   this is possible, the route advertisement rules of [1,2] are not
   changed.  In particular, a new advertisement of a given NLRI
   (remembering that the identifier is part of the NLRI's definition)
   replaces a previous advertisement of the given NLRI.
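
   The replacement rule can be illustrated with a small Python sketch
   of a per-peer table of received paths.  It is not part of this
   specification.  The class PathTable and its methods are
   hypothetical, and attributes are represented by an opaque value.
   The bestpath flag is stored with the path but is not part of the
   key.

```python
class PathTable:
    """Paths received from one peer, keyed by (identifier, prefix)."""

    def __init__(self):
        self._paths = {}

    def advertise(self, identifier, prefix, attributes, bestpath=False):
        """Store a path; an existing entry with the same key is replaced."""
        self._paths[(identifier, prefix)] = (attributes, bestpath)

    def withdraw(self, identifier, prefix):
        """Remove one path; other paths for the same prefix are untouched."""
        self._paths.pop((identifier, prefix), None)

    def paths_for(self, prefix):
        """Return {identifier: (attributes, bestpath)} for one prefix."""
        return {i: v for (i, p), v in self._paths.items() if p == prefix}
```

   Advertising the same prefix with identifiers 1 and 2 leaves two
   entries in the table, whereas advertising identifier 1 again only
   overwrites the first entry.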

   This specification requires the use of multiprotocol encodings [5].
   When two BGP speakers have advertised the ADD_PATH and multiprotocol
   capabilities to each other, IPv4 unicast NLRI MUST be sent using the
   MP encoding of [5].  IPv4 unicast NLRI MUST NOT be sent using the
   encoding of [1,2].  Similarly, when two BGP speakers have advertised
   the ADD_PATH, multiprotocol and MP_CAP [6] capabilities to each
   other, the encoding of [6] MUST NOT be used; the encoding of this
   specification MUST be used instead.


3. Deployment Considerations

   This extension is intended to be used in a controlled fashion, for
   applications that require only partial propagation of the routing
   information or propagation to specific individual recipients.

   Care should be taken when deploying this enhancement.  If deployed
   improperly, the presence of extra paths in some parts of the AS and
   not in others can cause inconsistent routing.  One scenario of
   particular concern involves the IGP metric to the address depicted by
   the NEXT_HOP, and the MED attribute.  If this extension is used to
   advertise alternate paths, the best path [1,2] SHOULD also be
   advertised.  As long as the best path is still selected as best, the
   presence of additional paths in some parts of the AS and not others
   will not cause inconsistent routing.  However, if the IGP metric to
   the address depicted by the NEXT_HOP should change such that a
   non-best path is now preferred over the best path, then every router
   in the path to the address depicted by the NEXT_HOP should have the
   additional paths.

   Because the MED is only compared between routes from the same AS
   [1,2], it is possible that an additional path could be selected as
   the best path.  This may cause inconsistent routing unless every
   router in the forwarding path of the affected routers has the
   additional paths.

   In a simple topology, it may be possible to anticipate these
   scenarios and avoid inconsistent routing while still enabling
   appropriate applications. Documents proposing applications of this
   extension SHOULD specify restrictions for propagating additional
   paths and should supply specific deployment guidelines.


4. Security Considerations

   This document introduces no new security concerns to BGP or other
   specifications referenced in this document.


5. Acknowledgments

   We would like to thank Dave Meyer, Srihari Ramachandra, Eric Rosen,
   Dan Tappan, Robert Raszuk and Mark Turner for their comments and
   suggestions.


6. References

      [1]  Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP-4),"
           RFC 1771, March 1995.

      [2]  Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP-4),"
           Work in Progress (draft-ietf-idr-bgp4-17.txt), January 2002.

      [3]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels," RFC 2119, March 1997.

      [4]  Chandra, R. and J. Scudder, "Capabilities Advertisement with
           BGP-4," RFC 2842, May 2000.

      [5]  Bates, T., R. Chandra, D. Katz and Y. Rekhter, "Multiprotocol
           Extensions for BGP-4," RFC 2858, June 2000.

      [6]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
           BGP-4," RFC 3107, May 2001.

      [7]  Rosen, E., D. Tappan, G. Fedorkow, Y. Rekhter, D. Farinacci,
           T. Li and A. Conta, "MPLS Label Stack Encoding," RFC 3032,
           January 2001.


7. Authors' Addresses

         Daniel Walton
         Cisco Systems, Inc.
         7025 Kit Creek Rd.
         Research Triangle Park, NC 27709
         Email: dwalton@cisco.com

         Alvaro Retana
         Cisco Systems, Inc.
         7025 Kit Creek Rd.
         Research Triangle Park, NC 27709
         Email: aretana@cisco.com

         David Cook
         Cisco Systems, Inc.
         7025 Kit Creek Rd.
         Research Triangle Park, NC 27709
         Email: dacook@cisco.com

         John G. Scudder
         Cisco Systems, Inc.
         100 S. Main Suite 200
         Ann Arbor, MI 48104
         Email: jgs@cisco.com