Network Working Group                                      Pedro Marques
Internet Draft                                             Robert Raszuk
Expiration Date: April 15, 2005                         Juniper Networks
                                                              Dan Tappan
                                                            Luca Martini
                                                      Cisco Systems Inc.
                                                            October 2004


       RFC2547bis networks using internal BGP as PE-CE protocol.


                    draft-marques-l3vpn-ibgp-00.txt

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of section 3 of RFC 3667.  By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 15, 2005.


Copyright Notice

   Copyright (C) The Internet Society (2004).






Marques, et al.                                                 [Page 1]


Internet Draft       draft-marques-l3vpn-ibgp-00.txt        October 2004


Abstract

   This document defines protocol extensions and procedures for BGP
   PE-CE router interaction in RFC2547bis networks. Their objective is
   to make the use of an RFC2547bis VPN transparent to the customer
   network, as far as routing information is concerned.



Table of Contents

 1      Introduction
 2      Route Reflection
 2.1    Carrying internal BGP routes
 3      Next-hop handling
 4      Exchanging routes between different VPN customer networks
 5      Acknowledgments
 6      References
 7      Authors' Addresses





1. Introduction

   In current deployments, when BGP is used as the PE-CE routing
   protocol, the peering sessions are typically configured as external
   peerings between the VPN provider AS and the customer network AS.

   A PE router advertising a route received from a remote PE often
   remaps the customer network autonomous-system number to its own.
   Otherwise, the customer network can use different autonomous-system
   numbers at different sites, or configure its CE routers to accept
   routes containing their own AS number.

   While this technique works well in situations where there are no BGP
   routing exchanges between the client network and other networks, it
   does have drawbacks for customer networks that use BGP internally for
   purposes other than interaction between CE and PE routers.

   In order to make the usage of RFC2547bis VPN services as transparent
   as possible to any external interaction, it is desirable to define a
   mechanism by which PE-CE routers can exchange BGP routes by means
   other than external BGP.

   One can consider an RFC2547bis VPN as a provider-managed backbone
   service interconnecting several customer-managed sites. This model is
   not universal but it is thought to be common enough to justify
   special attention.

   Independently of the presence of a VPN service, networks that use a
   hierarchical design are typically modeled such that the top-level
   core or backbone participates in a full iBGP mesh, which distributes
   routing information between sites via BGP route reflection [BGP-RR]
   or confederations [CONFED].


2. Route Reflection

   In a typical backbone/area hierarchical design, routers that attach
   an area (or site) to the core use BGP route reflection to distribute
   routes between the top-level core iBGP mesh and the local area iBGP
   cluster.

   To provide equivalent functionality in a network using a provider
   provisioned backbone, one can consider the VPN network as the equiva-
   lent of an Internal BGP route server which multiplexes information
   from N VPN attachment points.

   A PE router then acts as a route reflector to local CE routers. Note
   that route reflection can be used hierarchically in order to avoid
   direct communication between the PE and non-directly connected CEs
   that may exist in the site.

   BGP path attributes are manipulated in order to isolate the VPN
   network from the customer network. A new BGP path attribute is
   defined that can act as a path attribute stack. At the ingress to
   the VPN network, the BGP attributes of the received routes are pushed
   onto the stack. The stack is popped by a remote PE before performing
   route selection on the VRF's Adj-RIB-In.


                  --> push path attributes --> vrf-export --> 2547
        VRF route                                             PE-PE route
                                                              advertisement
                 <--  pop path attributes <--  vrf-import <--

                            Route processing

   The diagram above shows the BGP path attribute stack processing in
   relation to existing 2547 route processing procedures. BGP path
   attributes received from a customer network are pushed onto the
   stack before the Export Route Targets are added to the BGP path
   attributes.  Conversely, the stack is popped after the Import Target
   processing step that identifies the VRF table in which a PE-received
   route is accepted.
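
   As a rough illustration of the processing order in the diagram, the
   following Python sketch models the push and pop steps. It is not part
   of this specification: the Route class, the dictionary representation
   of attributes, and the "ROUTE_TARGETS" key are assumptions made for
   the example.

```python
from dataclasses import dataclass, field

ATTR_SET = "ATTR_SET"  # the stack-like attribute defined by this document


@dataclass
class Route:
    # Hypothetical in-memory route: a prefix plus a dict of path attributes.
    prefix: str
    attrs: dict = field(default_factory=dict)


def push_and_export(vrf_route, customer_as, export_targets):
    """CE-to-PE direction: push customer attributes, then apply vrf-export."""
    # NEXT_HOP is left out of the stack (it SHOULD NOT be carried in ATTR_SET).
    saved = {k: v for k, v in vrf_route.attrs.items() if k != "NEXT_HOP"}
    vpn_attrs = {
        ATTR_SET: {"origin_as": customer_as, "attrs": saved},
        "ROUTE_TARGETS": set(export_targets),
    }
    return Route(vrf_route.prefix, vpn_attrs)


def import_and_pop(vpn_route, import_targets):
    """PE-to-CE direction: apply vrf-import, then pop the saved attributes."""
    targets = vpn_route.attrs.get("ROUTE_TARGETS", set())
    if not targets & set(import_targets):
        return None  # no matching Import Target: not accepted into this VRF
    stacked = vpn_route.attrs[ATTR_SET]
    return Route(vpn_route.prefix, dict(stacked["attrs"]))
```

   In this sketch a route that carries no matching Route Target is
   rejected before the stack is examined, mirroring the order shown in
   the diagram.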

   When a PE-received route is imported into a VRF, its IGP metric, as
   far as BGP path selection is concerned, should be the metric to the
   remote PE address, expressed in terms of the service provider metric
   domain.

   For the purposes of VRF route selection performed at the PE, between
   routes received from local CEs and remote PEs, VPN network IGP
   metrics should always be considered higher (and thus less preferred)
   than local site metrics.
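
   One possible way to realize this ordering is sketched below in
   Python. The sketch covers only the IGP-metric step of route
   selection, for candidates that tied on all earlier steps. The
   Candidate type and the "local-ce" and "remote-pe" labels are
   assumptions made for the example, not names defined by this document.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    source: str      # "local-ce" or "remote-pe" (hypothetical labels)
    igp_metric: int  # site IGP metric for local-ce, provider IGP metric for remote-pe


def igp_metric_key(c):
    # Remote-PE routes sort after every local-CE route, whatever the metric value;
    # within each group the lower metric wins.
    return (c.source == "remote-pe", c.igp_metric)


def prefer_by_igp_metric(candidates):
    """Pick the preferred candidate using only the IGP-metric comparison."""
    return min(candidates, key=igp_metric_key)
```

   Sorting on a (is-remote, metric) pair avoids having to map the two
   metric domains onto a common scale.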

   When backdoor links are present, this would tend to direct the
   traffic between two sites through the backdoor link for BGP routes
   originated by a remote site. However, BGP already has policy
   mechanisms to address this type of situation, such as the LOCAL_PREF
   attribute.

   When a given CE is connected to more than one PE, it will not adver-
   tise the route that it receives from a PE to another PE unless con-
   figured as a route reflector, due to the standard BGP route adver-
   tisement rules.

   When a CE reflects a PE-received route to another PE, the fact that
   the original attributes of a route are preserved across the VPN
   network prevents the formation of routing loops due to mutual
   redistribution between the two networks.


2.1. Carrying internal BGP routes

   In order to carry the original BGP attributes of a route received
   from a CE, this document defines a new BGP path attribute:

      ATTR_SET (type code 128)

         ATTR_SET is an optional transitive attribute that carries a set
         of BGP path attributes. An attribute set (ATTR_SET) can include
         any BGP attribute that can occur in a BGP UPDATE message,
         except the MP_REACH and MP_UNREACH attributes.

         This attribute consists of a 4-byte autonomous system number
         plus a variable length sequence of BGP path attributes.
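
   The layout described above could be encoded roughly as in the
   following Python sketch. It assumes the generic BGP path attribute
   header from [BGP-BASE] (flags, type code, and a one- or two-octet
   length) and takes the nested attributes as already-encoded byte
   strings. The function names are hypothetical.

```python
import struct

ATTR_SET_TYPE = 128          # type code given in this document
FLAG_OPTIONAL = 0x80
FLAG_TRANSITIVE = 0x40
FLAG_EXTENDED_LENGTH = 0x10  # set when the value is longer than 255 octets


def encode_attr_set(origin_as, encoded_attrs):
    """Build an ATTR_SET from a 4-byte AS number and already-encoded attributes."""
    value = struct.pack("!I", origin_as) + b"".join(encoded_attrs)
    flags = FLAG_OPTIONAL | FLAG_TRANSITIVE
    if len(value) > 255:
        header = struct.pack("!BBH", flags | FLAG_EXTENDED_LENGTH,
                             ATTR_SET_TYPE, len(value))
    else:
        header = struct.pack("!BBB", flags, ATTR_SET_TYPE, len(value))
    return header + value


def decode_attr_set(data):
    """Return (origin_as, raw bytes of the nested path attributes)."""
    flags, type_code = struct.unpack_from("!BB", data, 0)
    if type_code != ATTR_SET_TYPE:
        raise ValueError("not an ATTR_SET attribute")
    if flags & FLAG_EXTENDED_LENGTH:
        (length,) = struct.unpack_from("!H", data, 2)
        offset = 4
    else:
        length = data[2]
        offset = 3
    value = data[offset:offset + length]
    (origin_as,) = struct.unpack_from("!I", value, 0)
    return origin_as, value[4:]
```

   The sketch does not check that the nested attributes exclude MP_REACH
   and MP_UNREACH; a real implementation would need to enforce that.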

   This attribute is used by a PE router to store the original set of
   BGP attributes it receives from a CE. When a PE router advertises a
   PE-received route to a CE, it will use the path attributes carried in
   the ATTR_SET attribute.

   In other words, the BGP Path Attributes are "pushed" into this stack-
   like attribute when the route is received by the VPN network and
   "popped" when the route is advertised in the PE to CE direction.

   Using this mechanism isolates the customer network from the
   attributes used in the VPN network and vice versa. Attributes such as
   the route reflection CLUSTER_LIST attribute are segregated such that
   customer network cluster identifiers won't be considered by the VPN
   network route reflectors and vice versa.

   The autonomous system number present in the ATTR_SET attribute is
   designed to prevent a route originating in the iBGP of a given
   autonomous system from being leaked into a different autonomous
   system. It should contain the autonomous system number of the
   customer network that originates the given set of attributes.

   The NEXT_HOP attribute SHOULD NOT be included in an ATTR_SET.


3. Next-hop handling

   When RFC2547bis VPNs are not in use, the NEXT_HOP attribute in iBGP
   routes carries the address of the border router advertising the route
   into the domain.

   An important component of BGP route selection is the IGP distance to
   the NEXT_HOP of the route.

   When a VPN service is used to provide interconnection between differ-
   ent sites, since the VPN network runs a different IGP domain, metrics
   between the VPN and customer networks are not comparable.

   However, the most important component of a metric is the inter-area
   metric, which is known to the VPN network. The intra-area metric is
   typically negligible.

   The use of route reflection, for instance, requires metrics to be
   configured so that inter-cluster/area metrics are always greater than
   intra-cluster metrics.

   The approach taken by this document is to rewrite the NEXT_HOP
   attribute at the PE-CE boundary. PE routers take into account the PE-
   PE IGP distance calculated by the VPN network IGP, when selecting
   between routes advertised from different PEs.

   An advantage of the proposed method is that the customer network can
   run independent IGPs at each site.


4. Exchanging routes between different VPN customer networks

   A given VPN customer network SHOULD use internal or external BGP ses-
   sions consistently for peering sessions where the same autonomous
   system is used.

   In scenarios such as what is commonly referred to as an "extranet"
   VPN, routes MAY be advertised to both internal and external VPN
   attachments, belonging to different autonomous systems.

                    +-----+                 +-----+
                    | PE1 |-----------------| PE2 |
                    +-----+                 +-----+
                   /       \                  |
            +-----+         +-----+         +-----+
            | CE1 |         | CE2 |         | CE3 |
            +-----+         +-----+         +-----+
             AS 1            AS 2             AS 1

   Consider the example given above where (PE1, CE1) and (PE2, CE3) ses-
   sions are iBGP.  In RFC2547 VPNs, a route received from CE1 above may
   be distributed to the VRFs corresponding to the attachment points for
   CEs 2 and 3.

   The desired result, in such a scenario, is to present the internal
   peer (CE3) with a BGP advertisement that contains the same BGP Path
   Attributes received from CE1 and to present the external peer (CE2)
   with a BGP advertisement that would correspond to a situation where
   AS 1 and AS 2 have an external BGP session between them.

   In order to achieve this goal, the following set of rules applies:

      When advertising an iBGP originated route to iBGP, a PE router
      MUST check that the autonomous-system contained in the ATTR_SET
      attribute matches the autonomous system of the CE to which the
      route is being advertised.

      In case the autonomous-systems do match, the route is advertised
      with the attributes contained in the ATTR_SET attribute.  Other-
      wise, in the case of an autonomous-system mismatch, the set of
      attributes to be advertised to the CE in question shall be con-
      structed as follows:

         1. The path attributes are set to the attributes contained in
         the ATTR_SET attribute.

         2. Internal BGP specific attributes are discarded (LOCAL_PREF,
         ORIGINATOR_ID, CLUSTER_LIST, etc).

         3. The autonomous-system contained in the ATTR_SET attribute is
         prepended to the AS_PATH following the rules that would apply
         to an external BGP peering between the source and destination
         ASes.

         4. Internal BGP specific attributes corresponding to the
         configuration of the destination AS (LOCAL_PREF) are added.

      When advertising an iBGP originated route to eBGP, a PE router
      shall apply steps 1 to 3 defined above and subsequently prepend
      its own autonomous-system number to the AS_PATH attribute (i.e.
      both the originator and VPN network AS numbers are prepended).

      When advertising an eBGP originated route to iBGP, a PE router
      MUST prepend its own AS number to the AS_PATH before adding
      iBGP-only path attributes (LOCAL_PREF).
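
   The rules above for iBGP originated routes can be sketched in Python
   as follows. This is an illustration under simplifying assumptions,
   not a normative procedure: attributes are modeled as a dictionary,
   AS_PATH as a flat list of AS numbers, and the function name, the
   "session" parameter and the default LOCAL_PREF value are
   hypothetical. The eBGP originated case is not covered.

```python
IBGP_ONLY = {"LOCAL_PREF", "ORIGINATOR_ID", "CLUSTER_LIST"}


def advertise_ibgp_originated(attr_set_as, attr_set_attrs, ce_as, pe_as,
                              session, local_pref=100):
    """Build the attributes sent to a CE for a route whose ATTR_SET came from iBGP.

    session is "ibgp" or "ebgp" and describes the PE-CE session toward the CE.
    """
    attrs = dict(attr_set_attrs)              # step 1: start from the ATTR_SET contents
    if session == "ibgp" and attr_set_as == ce_as:
        return attrs                          # same AS over iBGP: advertise unchanged
    for name in IBGP_ONLY:                    # step 2: drop iBGP-specific attributes
        attrs.pop(name, None)
    # step 3: prepend the AS carried in ATTR_SET
    as_path = [attr_set_as] + list(attrs.get("AS_PATH", []))
    if session == "ibgp":
        attrs["LOCAL_PREF"] = local_pref      # step 4: destination-AS iBGP attributes
    else:
        as_path = [pe_as] + as_path           # eBGP: also prepend the PE's own AS
    attrs["AS_PATH"] = as_path
    return attrs
```

   With the topology shown earlier, a route from CE1 (AS 1) would reach
   CE3 unchanged, while an eBGP advertisement to CE2 would carry an
   AS_PATH that begins with the VPN network AS followed by AS 1.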

   In all cases where an iBGP originated route is processed, attributes
   present on the VPN route other than the NEXT_HOP attribute are
   ignored, both from the point of view of route selection in the VRF
   Adj-RIB-In and route advertisement to a CE router.


5. Acknowledgments

   We would like to thank Yakov Rekhter for his comments and sugges-
   tions. We would also like to acknowledge Luyuan Fang who provided
   valuable input into this work.


6. References

   [BGP-BASE] Y. Rekhter, T. Li, S. Hares, "A Border Gateway Protocol 4
        (BGP-4)", draft-ietf-idr-bgp4-20.txt, 03/03.

   [RFC2547bis] E. Rosen et al., "BGP/MPLS VPNs",
        draft-ietf-ppvpn-rfc2547bis-03.txt, 10/02.

   [BGP-RR] T. Bates, R. Chandra, E. Chen, "BGP Route Reflection: An
        alternative to full mesh IBGP", RFC 2796.

   [CONFED] P. Traina, D. McPherson, J. Scudder,
        "Autonomous System Confederations for BGP", RFC 3065.


7. Authors' Addresses

Pedro Marques
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
Email: roque@juniper.net


Robert Raszuk
Cisco Systems, Inc.
170 West Tasman Dr
San Jose, CA 95134
Email: raszuk@cisco.com


Dan Tappan
Cisco Systems, Inc.
300 Beaver Brook Rd.
Boxborough, MA 01719
Email: tappan@cisco.com


Luca Martini
Cisco Systems, Inc.
9155 East Nichols Avenue, Suite 400
Englewood, CO 80112
Email: lmartini@cisco.com



Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any assur-
   ances of licenses to be made available, or the result of an attempt
   made to obtain a general license or permission for the use of such
   proprietary rights by implementers or users of this specification can
   be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFOR-
   MATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES
   OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.



Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.