Inter-Domain Routing P. Lapukhov
Internet-Draft Facebook
Intended status: Standards Track E. Aries, Ed.
Expires: August 6, 2016 P. Marques
Juniper Networks
E. Nkposong
Salesforce.com Inc
February 3, 2016
Use of BGP for Opaque Signaling
draft-lapukhov-bgp-opaque-signaling-01
Abstract
Border Gateway Protocol with multi-protocol extensions (MP-BGP)
enables the use of the protocol for dissemination of virtually any
information. This document proposes a new Address Family/Subsequent
Address Family along with a new optional transitive attribute to be
used for distribution of opaque data. This functionality is intended
to be used by applications other than BGP for exchange of their own
data on top of a BGP mesh. The structure of such data is not meant to
be interpreted by regular BGP speakers; rather, the goal is to use
BGP purely as a convenient and scalable communication system.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 6, 2016.
Lapukhov, et al. Expires August 6, 2016 [Page 1]
Internet-Draft Use of BGP for Opaque Signaling February 2016
Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. BGP Opaque Data AFI
3. BGP Key-Value SAFI
4. Capability Advertisement
5. Disseminating Key-Value bindings
5.1. Publishing a Key-Value binding
5.2. Removing a Key-Value binding
5.3. Propagating multiple values for the same key
6. Message filtering
6.1. Automated filtering
6.2. Filtering via policy
7. IANA Considerations
8. Manageability Considerations
9. Security Considerations
10. Acknowledgements
11. References
11.1. Normative References
11.2. Informative References
Authors' Addresses
1. Introduction
Implementation of Multiprotocol Extensions for BGP-4 [RFC4760] gives
the ability to pass arbitrary data in BGP protocol messages. This
capability has been leveraged by many for dissemination of non-
routing related information over BGP (e.g. "Dissemination of Flow
Specification Rules" [RFC5575] as well as "North-Bound Distribution
of Link-State and TE Information using BGP"
[I-D.ietf-idr-ls-distribution]). However, there has been no channel
defined explicitly to disseminate data with arbitrary payload. The
intended use case is for applications other than BGP to leverage the
protocol machinery for distribution (broadcasting) of their own state
in the network domain. Publishers and consumers will use BGP UPDATE
messages to submit and receive opaque data. It is up to the BGP
implementation to provide a custom API for message producers or
consumers if needed.
One application of this extension could be auto-discovery of various
services in the data-center network that uses BGP as the routing
protocol of choice ([I-D.ietf-rtgwg-bgp-routing-large-dc]).
Another application is building and testing new routing protocols or
BGP extensions within an existing BGP implementation. The new protocol/
extension may influence routing either by directly communicating with
the RIB/FIB of the router it runs on, or by overriding BGP paths via
BGP route injection. An example of such a BGP extension could be
[WISER].
2. BGP Opaque Data AFI
This document introduces a new AFI known as a "BGP Opaque Data AFI"
with the actual value to be assigned by IANA. The purpose of this
AFI is to exchange opaque information within a BGP network.
3. BGP Key-Value SAFI
This document introduces a new SAFI known as the "BGP Key-Value SAFI"
with the actual value to be assigned by IANA. The purpose of this
SAFI is the exchange of opaque information structured as a Key-Value
binding.
4. Capability Advertisement
A BGP speaker that wishes to exchange Opaque Data MUST use the
Multiprotocol Extensions Capability Code, as defined in [RFC4760], to
advertise the corresponding AFI/SAFI pair.
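As an illustration, the capability described above could be encoded as in the following minimal sketch. It follows the Multiprotocol Extensions capability layout of [RFC4760] (capability code 1, then AFI, a reserved octet, and SAFI). The constants OPAQUE_AFI and KEY_VALUE_SAFI are hypothetical placeholders, since the real code points are yet to be assigned by IANA, and the function name is likewise an assumption.

```python
import struct

# Hypothetical placeholder code points: the real AFI and SAFI values are
# to be assigned by IANA, so these numbers are for illustration only.
OPAQUE_AFI = 0xFFFE
KEY_VALUE_SAFI = 0xF1

MP_EXT_CAPABILITY_CODE = 1  # Multiprotocol Extensions capability (RFC 4760)


def encode_mp_capability(afi: int, safi: int) -> bytes:
    """Encode one capability TLV: code, length, then AFI, reserved, SAFI."""
    value = struct.pack("!HBB", afi, 0, safi)  # reserved octet is zero
    return struct.pack("!BB", MP_EXT_CAPABILITY_CODE, len(value)) + value


capability = encode_mp_capability(OPAQUE_AFI, KEY_VALUE_SAFI)
```

The resulting six octets would be carried in the Capabilities Optional Parameter of the OPEN message, alongside any other capabilities the speaker advertises.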
5. Disseminating Key-Value bindings
This document proposes to implement a distributed, eventually
consistent Key-Value store on top of existing BGP protocol mechanics.
The "Key" portion is encoded as the NLRI part of the MP_REACH_NLRI
attribute, and the "Value" portion is encoded using a new optional
transitive attribute.
o Publishers, acting as BGP speakers, advertise keys along with
associated values into the routing domain. The BGP network
synchronizes that state by propagating the encoded data following
regular BGP protocol operations.
o Consumers, acting as BGP speakers, receive the information via BGP
protocol UPDATE messages. Only publishers and consumers of the
opaque data are supposed to interpret its contents - the rest of
the BGP network acts merely as a dissemination system.
Multiple publishers can advertise the same key (NLRI) bound to
different values. It is also possible for the advertised binding to
have the same Key-Value pairs but differ in some other BGP
attributes. In that case, BGP would follow the best-path selection
logic to prevent duplicate information in the network. A consumer
will receive the value created by the publisher "closest" in terms of
BGP best-path selection logic, based on the policies that exist in
the routing domain. This document does not propose any method of
achieving global consensus for all published values for a given key.
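The effect of best-path selection on competing publishers can be sketched as follows. This is a deliberately simplified model, not the BGP decision process of [RFC4271]: it assumes that "closest" means the shortest AS path, with a hypothetical peer identifier as the tie-breaker. The Binding type and its field names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    key: bytes         # opaque key carried as NLRI
    value: bytes       # opaque value carried in the new attribute
    as_path_len: int   # simplified stand-in for the full decision process
    peer_id: str       # hypothetical tie-breaker, e.g. a router ID


def select_best(bindings):
    """Keep one binding per key, preferring the "closest" publisher."""
    best = {}
    for b in bindings:
        current = best.get(b.key)
        rank = (b.as_path_len, b.peer_id)
        if current is None or rank < (current.as_path_len, current.peer_id):
            best[b.key] = b
    return best


received = [
    Binding(b"svc/db", b"far", as_path_len=3, peer_id="10.0.0.2"),
    Binding(b"svc/db", b"near", as_path_len=1, peer_id="10.0.0.9"),
]
print(select_best(received)[b"svc/db"].value)  # prints b'near'
```

Two consumers at different points in the network may therefore see different values for the same key, which is consistent with the absence of any global consensus mechanism.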
5.1. Publishing a Key-Value binding
The encoding scheme proposed below follows the semantics of Key-Value
bindings. The "Key" is stored in the NLRI section of the
MP_REACH_NLRI attribute, as shown in Figure 1.
+---------------------------------------------------------+
| Address Family Identifier (2 octets) |
+---------------------------------------------------------+
| Subsequent Address Family Identifier (1 octet) |
+---------------------------------------------------------+
| Length of Next Hop Address (1 octet), must be zero |
+---------------------------------------------------------+
| Reserved (1 octet), must be zero |
+---------------------------------------------------------+
| Opaque Key Length (1 octet) |
+---------------------------------------------------------+
| Opaque Key Data (variable) |
+---------------------------------------------------------+
Figure 1: MP_REACH_NLRI Layout
o The AFI/SAFI values are to be allocated by IANA.
o Length of Next Hop Address: must be zero, since no information is
encoded in the next-hop address field.
o Opaque Key Length: identifies the size of the Key field. If this
field is set to zero, the implementation MUST ignore the advertisement.
o Opaque Key Data: the byte string representing the opaque key
contents. This portion SHOULD NOT be interpreted by the BGP
implementation.
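The layout in Figure 1 can be sketched in code as follows. This is an illustrative sketch only: the AFI and SAFI constants are hypothetical placeholders pending IANA assignment, the function names are assumptions, and rejecting a non-zero next-hop length with an error is an assumed behavior that this document does not specify.

```python
import struct

# Hypothetical placeholder code points, pending IANA assignment.
OPAQUE_AFI = 0xFFFE
KEY_VALUE_SAFI = 0xF1

HEADER = "!HBBBB"  # AFI, SAFI, next-hop length, reserved, key length
HEADER_LEN = struct.calcsize(HEADER)  # 6 octets


def encode_mp_reach_key(key: bytes) -> bytes:
    """Build the MP_REACH_NLRI value for one opaque key (Figure 1)."""
    if not 1 <= len(key) <= 255:
        raise ValueError("key must be 1 to 255 octets long")
    return struct.pack(HEADER, OPAQUE_AFI, KEY_VALUE_SAFI, 0, 0, len(key)) + key


def decode_mp_reach_key(data: bytes):
    """Return the opaque key, or None when the advertisement is ignored."""
    if len(data) < HEADER_LEN:
        raise ValueError("truncated MP_REACH_NLRI value")
    _afi, _safi, nh_len, _reserved, key_len = struct.unpack_from(HEADER, data)
    if nh_len != 0:
        # Assumed handling: the draft only states that this must be zero.
        raise ValueError("next-hop length must be zero")
    if key_len == 0:
        return None  # zero-length key: the advertisement MUST be ignored
    key = data[HEADER_LEN:HEADER_LEN + key_len]
    if len(key) != key_len:
        raise ValueError("truncated key data")
    return key
```

The key is treated as an uninterpreted byte string throughout, in line with the requirement that BGP implementations not interpret its contents.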
The "Value" portion of a published binding is to be encoded in a new
optional transitive attribute, as shown in Figure 2:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |0 0 0 0|  Opaque Value Length  |               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
~                                                               ~
|                 Opaque Value Data (variable)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+..........................
Figure 2: OPAQUE_VALUE attribute layout
o Type: Identifies the new OPAQUE_VALUE attribute, with the value to
be allocated by IANA.
o Opaque Value Length: Two octets encoding the total length of the
attribute in octets, including the Type and Length fields. The
length is encoded as an unsigned binary integer. The four most
significant bits of this field MUST be set to zero, due to the
limit imposed by the maximum BGP message size. Note that the minimum
length is 3, indicating that no Opaque Value Data field is
present. Such a binding, in the presence of a non-zero-length key, is
still valid, as it informs the consumers that the key "exists".
o Opaque Value Data: A field containing zero or more octets. This
portion SHOULD NOT be interpreted by BGP implementations.
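The fields described above can be sketched in code as follows, reading Figure 2 as a one-octet Type followed by a two-octet length whose four most significant bits are zero. This is an illustrative sketch: OPAQUE_VALUE_TYPE is a hypothetical placeholder for the code point to be allocated by IANA, and the function names and error handling are assumptions.

```python
import struct

OPAQUE_VALUE_TYPE = 0xFE   # hypothetical placeholder, pending IANA allocation
HEADER_LEN = 3             # one Type octet plus two length octets
MAX_TOTAL_LEN = 0x0FFF     # the four most significant length bits stay zero


def encode_opaque_value(data: bytes = b"") -> bytes:
    """Encode Type, total length (header included), and the opaque data."""
    total = HEADER_LEN + len(data)
    if total > MAX_TOTAL_LEN:
        raise ValueError("opaque value does not fit in a 12-bit length")
    return struct.pack("!BH", OPAQUE_VALUE_TYPE, total) + data


def decode_opaque_value(buf: bytes) -> bytes:
    """Return the opaque data; an empty result means the key merely exists."""
    if len(buf) < HEADER_LEN:
        raise ValueError("truncated OPAQUE_VALUE attribute")
    _attr_type, total = struct.unpack_from("!BH", buf)
    if total & 0xF000:
        raise ValueError("four most significant length bits must be zero")
    if total < HEADER_LEN or total > len(buf):
        raise ValueError("invalid Opaque Value Length")
    return buf[HEADER_LEN:total]
```

Calling encode_opaque_value() with no data yields the minimum length of 3, which corresponds to a binding that only announces the existence of its key.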
Even when the OPAQUE_VALUE optional transitive attribute is not
present in a BGP advertisement, the BGP implementation MUST still
retain the Opaque Key (NLRI) in its LocRIB and propagate it further as
usual. This case is to be interpreted as an announcement of the key's
existence.
5.2. Removing a Key-Value binding
The removal procedure follows the regular MP-BGP route withdrawal,
using the MP_UNREACH_NLRI attribute. This section defines the
attribute structure for the new AFI/SAFI.
The message shown in Figure 3 instructs the receiving BGP speaker to
delete the N bindings corresponding to Key 1, Key 2 ... Key N if the
keys have been previously learned from the withdrawing speaker. If
any of the keys is not found in the LocRIB or has not been previously
received from the withdrawing BGP peer, the removal request for that
key MUST be ignored.
+---------------------------------------------------------+
| Address Family Identifier (2 octets) |
+---------------------------------------------------------+
| Subsequent Address Family Identifier (1 octet) |
+---------------------------------------------------------+
| Opaque Key 1 Length (1 octet) |
+---------------------------------------------------------+
| Opaque Key 1 Data (variable) |
+---------------------------------------------------------+
~ ~
| Opaque Key N Length (1 octet) |
+---------------------------------------------------------+
| Opaque Key N Data (variable) |
+---------------------------------------------------------+
Figure 3: MP_UNREACH_NLRI attribute layout
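The withdrawal encoding and the rule for ignoring unknown keys can be sketched as follows. This is an illustrative sketch under stated assumptions: the AFI and SAFI constants are hypothetical placeholders, and the per-peer dictionary used to model learned bindings is an assumption made for the example, not a structure defined by this document.

```python
import struct

# Hypothetical placeholder code points, pending IANA assignment.
OPAQUE_AFI = 0xFFFE
KEY_VALUE_SAFI = 0xF1


def encode_mp_unreach_keys(keys) -> bytes:
    """Build the MP_UNREACH_NLRI value for a list of keys (Figure 3)."""
    out = bytearray(struct.pack("!HB", OPAQUE_AFI, KEY_VALUE_SAFI))
    for key in keys:
        if not 1 <= len(key) <= 255:
            raise ValueError("key must be 1 to 255 octets long")
        out.append(len(key))
        out += key
    return bytes(out)


def decode_mp_unreach_keys(data: bytes) -> list:
    """Return the withdrawn keys in the order they appear."""
    keys, offset = [], 3  # skip the AFI (2 octets) and SAFI (1 octet)
    while offset < len(data):
        key_len = data[offset]
        offset += 1
        key = data[offset:offset + key_len]
        if len(key) != key_len:
            raise ValueError("truncated key data")
        keys.append(key)
        offset += key_len
    return keys


def apply_withdrawal(learned: dict, peer: str, keys) -> list:
    """Remove keys learned from `peer`; keys it never sent are ignored."""
    from_peer = learned.get(peer, {})
    removed = []
    for key in keys:
        if key in from_peer:
            del from_peer[key]
            removed.append(key)
    return removed
```

Here `learned` maps each peer to the bindings received from it, so a withdrawal from one peer never removes a binding that was learned only from another.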
5.3. Propagating multiple values for the same key
It is possible to propagate multiple values associated with the same
key using the Add-Path extension defined in [I-D.ietf-idr-add-paths].
However, this document recommends using unique keys for this purpose
instead. It is up to the consumers and publishers of the opaque data
to settle on a single unique value using some kind of consensus
protocol.
6. Message filtering
Limiting the scope of opaque information flooding is an important
operational concern. BGP already has the mechanisms needed to
control this process, and these mechanisms are briefly reviewed
below.
6.1. Automated filtering
One can leverage the mechanics presented in [RFC4684] and use the
route-target extended community attribute to identify "channels" where
key-value bindings are published. The consumers would signal their
interest in a particular "channel" by advertising the corresponding
route-target membership. The publications then need to contain the
route-target extended community attribute to constrain information
propagation.
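The resulting forwarding decision can be sketched as a set intersection: a publication is sent to a peer only if it carries at least one route target for which the peer has advertised membership. This is a minimal sketch of that idea, not an implementation of [RFC4684]. The string form of the route targets and the function name are assumptions made for illustration.

```python
def should_forward(publication_rts: frozenset, peer_memberships: frozenset) -> bool:
    """True if the peer subscribed to any "channel" the publication carries."""
    return not publication_rts.isdisjoint(peer_memberships)


# Hypothetical route targets, written as strings for readability.
publication = frozenset({"target:65000:100"})
subscribed_peer = frozenset({"target:65000:100", "target:65000:200"})
other_peer = frozenset({"target:65000:300"})

print(should_forward(publication, subscribed_peer))  # prints True
print(should_forward(publication, other_peer))       # prints False
```

A publication that carries no route target at all is never forwarded in this sketch. Whether that is the desired default is a policy choice left open here.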
6.2. Filtering via policy
Ad-hoc message filtering could be implemented using BGP standard (see
[RFC4271]) or extended community attributes (see [RFC4360]). The
semantics of these attributes are to be determined by the policy and
publishers/consumers. Filtering could be done locally on the receiving
speaker, or on the remote speaker by using the outbound route filtering
feature defined in [RFC5291].
7. IANA Considerations
For the purpose of this work, IANA would be asked to allocate values
for the new AFI and SAFI, as well as a value for the new optional
transitive attribute.
8. Manageability Considerations
TBD
9. Security Considerations
This document does not introduce any changes in terms of BGP
security.
10. Acknowledgements
TBD
11. References
11.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>.
[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
Border Gateway Protocol 4 (BGP-4)", RFC 4271,
DOI 10.17487/RFC4271, January 2006,
<http://www.rfc-editor.org/info/rfc4271>.
11.2. Informative References
[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
February 2006, <http://www.rfc-editor.org/info/rfc4360>.
[RFC4684] Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
R., Patel, K., and J. Guichard, "Constrained Route
Distribution for Border Gateway Protocol/MultiProtocol
Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
November 2006, <http://www.rfc-editor.org/info/rfc4684>.
[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
"Multiprotocol Extensions for BGP-4", RFC 4760,
DOI 10.17487/RFC4760, January 2007,
<http://www.rfc-editor.org/info/rfc4760>.
[RFC5291] Chen, E. and Y. Rekhter, "Outbound Route Filtering
Capability for BGP-4", RFC 5291, DOI 10.17487/RFC5291,
August 2008, <http://www.rfc-editor.org/info/rfc5291>.
[RFC5575] Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J.,
and D. McPherson, "Dissemination of Flow Specification
Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009,
<http://www.rfc-editor.org/info/rfc5575>.
[I-D.ietf-idr-add-paths]
Walton, D., Retana, A., Chen, E., and J. Scudder,
"Advertisement of Multiple Paths in BGP", draft-ietf-idr-
add-paths-13 (work in progress), December 2015.
[I-D.ietf-idr-ls-distribution]
Gredler, H., Medved, J., Previdi, S., Farrel, A., and S.
Ray, "North-Bound Distribution of Link-State and TE
Information using BGP", draft-ietf-idr-ls-distribution-13
(work in progress), October 2015.
[I-D.ietf-rtgwg-bgp-routing-large-dc]
Lapukhov, P., Premji, A., and J. Mitchell, "Use of BGP for
routing in large-scale data centers", draft-ietf-rtgwg-
bgp-routing-large-dc-07 (work in progress), August 2015.
[WISER] Mahajan, R., Wetherall, D., and T. Anderson, "Mutually
Controlled Routing with Independent ISPs", 2007,
<http://research.microsoft.com/en-
us/um/people/ratul/papers/nsdi2007-wiser.pdf>.
Authors' Addresses
Petr Lapukhov
Facebook
1 Hacker Way
Menlo Park, CA 94025
US
Email: petr@fb.com
Ebben Aries (editor)
Juniper Networks
1133 Innovation Way
Sunnyvale, CA 94089
US
Email: exa@juniper.net
Pedro Marques
Juniper Networks
1194 N. Mathilda Ave
Sunnyvale, CA 94089
US
Email: roque@juniper.net
Edet Nkposong
Salesforce.com Inc
The Landmark @ One Market, ST 300
San Francisco, CA 94105
US
Email: enkposong@salesforce.com