Scaling the Address Resolution Protocol for Large Data Centers (SARP)
draft-nachum-sarp-00

The information below is for an old version of the document.
Document type: This is an older version of an Internet-Draft that was ultimately published as RFC 7586.
Authors: Youval Nachum, Tal Mizrahi, Ilan Yerushalmi
Last updated: 2012-03-04
IESG state: Became RFC 7586 (Experimental)

Network Working Group                                      Youval Nachum
Internet Draft                                               Tal Mizrahi
Intended status: Informational                           Ilan Yerushalmi
Expires: September 2012                                          Marvell
                                                           March 4, 2012

       Scaling the Address Resolution Protocol for Large Data Centers
                                  (SARP)
                         draft-nachum-sarp-00.txt

Abstract

   This document provides a recommended architecture and network
   operation named SARP. SARP is based on fast proxies that
   significantly reduce broadcast domains and ARP/ND broadcast
   transmissions. SARP supports smooth and fast virtual machine (VM)
   mobility without any modification to the VM, while keeping the
   connection up and running efficiently. SARP is targeted at massively
   scaled data centers with a significant number of VMs using the ARP
   and ND protocols.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 4, 2012.

Nachum, et al.        Expires September 4, 2012               [Page 1]
Internet-Draft              Informational                   March 2012

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. SARP Motivation
      1.2. SARP Overview
      1.3. SARP Deployment Options
   2. Abbreviations Used in this Document
   3. SARP Description
      3.1. Control Plane: ARP/ND
         3.1.1. ARP/ND Request for a Local VM
         3.1.2. ARP/ND Request for a Remote VM
      3.2. Data Plane: Packet Transmission
         3.2.1. Local Packet Transmission
         3.2.2. Packet Transmission Between Sites
      3.3. VM Local Migration
      3.4. VM Migration from One Site to Another
         3.4.1. ARP/ND Table of Mobile VMs
      3.5. Multicast and Broadcast
      3.6. Non IP packet
      3.7. ARP caching
      3.8. SARP Interaction with Overlay networks
   4. Security Considerations
   5. IANA Considerations
   6. References
      6.1. Normative References
      6.2. Informative References
   7. Acknowledgments

1. Introduction

1.1. SARP Motivation

   SARP provides operational recommendations that mitigate the
   performance degradation caused by the data center architecture. SARP
   can be used in large data centers with a large number of VMs, where
   VMs migrate from one system to another while keeping their network
   connections up and running. Data center operators are required to
   allow the VMs to keep their IP and MAC identity while migrating
   between systems. The direct outcome of having VMs keep their
   respective IP and MAC identities is that Layer 2 broadcast domains
   scale up and protocols such as [ARP] and [ND] degrade network
   performance. SARP addresses a scaling problem that is also discussed
   in [ARMD].

1.2. SARP Overview

   SARP uses fast proxies that break down the large Layer 2 broadcast
   domains into small segments. The SARP proxies are located at the
   boundaries where the local Layer 2 infrastructure connects to its
   Layer 2 cloud. Figure 1 depicts an example of two remote data centers
   that are managed as a single flat Layer 2 domain. SARP proxies are
   implemented at the edge devices connecting the data center to the
   transport network. The direct outcome is a significant reduction of
   broadcast domains and ARP/ND transmissions. The large Layer 2
   broadcast domains are bounded by the SARP proxies. ARP/ND
   transmissions are reduced due to the limited broadcast domains and
   the use of ARP/ND proxies and caching.

   SARP proxies enable fast migration of a VM between clouds and data
   centers, keeping its connections up and running while the mobile VM
   retains its IP and MAC addresses.

                         *-------------------*
                         |                   |
                 +-------|     TRANSPORT     |-------+
                 |       |                   |       |
                 |       *-------------------*       |
                 |                                   |
        *-----------------*                  *----------------*
        |  SARP Proxies   |                  |  SARP Proxies  |
        *-----------------*                  *----------------*
           |           |                        |           |
       *-------*   *-------*                *-------*   *-------*
       |  Agg  |   |  Agg  |                |  Agg  |   |  Agg  |
       *-------*   *-------*                *-------*   *-------*
           |
      *----------*
      |Hypervisor|
      *----------*
           |
       *--------*
       |Virtual |
       |Machine |
       *--------*

          (West Site)                          (East Site)

              Figure 1 SARP Networking Architecture Example.

   SARP distributes the Layer 2 Forwarding Information Base (FIB) from
   the edge devices (functioning as SARP proxies) to the VMs. By doing
   so, it significantly reduces table sizes on the edge devices. The
   source VM maintains the mapping of its destination VMs to the
   destination site/cloud in its ARP table: the IP address of the
   destination VM is translated to the MAC address of the SARP proxy at
   the destination site. The SARP proxies maintain a Layer 2 FIB only
   for local VMs and remote edge devices.
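
   The table-size reduction can be illustrated with a rough count. The
   sketch below is not part of SARP itself. It assumes four sites of
   10,000 VMs each (hypothetical numbers), one SARP proxy MAC address
   per remote site, and a worst case in which a flat Layer 2 edge
   device learns every VM MAC address. The function names are
   illustrative only.

```python
def flat_edge_fib_entries(vms_per_site):
    # Assumed worst case in a flat Layer 2 domain: the edge device
    # learns the MAC address of every VM at every site.
    return sum(vms_per_site.values())


def sarp_edge_fib_entries(vms_per_site, site):
    # With SARP: MAC addresses of local VMs, plus one MAC address per
    # remote SARP proxy (assumption: one proxy MAC per remote site).
    remote_proxies = len(vms_per_site) - 1
    return vms_per_site[site] + remote_proxies


# Hypothetical deployment: four sites with 10,000 VMs each.
vms_per_site = {"west": 10_000, "east": 10_000,
                "north": 10_000, "south": 10_000}

flat = flat_edge_fib_entries(vms_per_site)
sarp = sarp_edge_fib_entries(vms_per_site, "west")
print(flat, sarp)  # prints: 40000 10003
```

   Under these assumptions, the edge device table grows with the number
   of local VMs and remote proxies rather than with the total number of
   VMs across all sites.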

   SARP proxies can support fast VM migration and minimize the
   transition phase. When a SARP proxy detects or is informed of a VM
   migration, it can update all its peers and trigger a fast update.

   SARP seamlessly supports Layer 2 network virtualization services over
   the overlay network and significantly reduces their complexity in
   terms of table size and performance. The overlay networks are only
   required to map MAC addresses of the SARP proxies to the correct
   tunnel.

1.3. SARP Deployment Options

   SARP deployment is tightly coupled with the data center architecture.
   SARP proxies are located at the point where the Layer 2
   infrastructure connects to its Layer 2 cloud using overlay networks.
   SARP proxies can be located at the data center edge (as Figure 1
   depicts), the data center core, or the data center aggregation. SARP
   can also be implemented by the hypervisor (as Figure 2 depicts).

   To simplify the description, we will focus on data centers that are
   managed as a single flat Layer 2 network, where SARP proxies are
   located at the boundary where the data center connects to the
   transport network (as Figure 1 depicts).

                         *-------------------*
                         |                   |
                 +-------|     TRANSPORT     |-------+
                 |       |                   |       |
                 |       *-------------------*       |
                 |                                   |
        *-----------------*                  *----------------*
        |   Edge Device   |                  |  Edge Device   |
        *-----------------*                  *----------------*
                 |                                   |
        *-----------------*                  *----------------*
        |       Core      |                  |      Core      |
        *-----------------*                  *----------------*
           |           |                        |           |
       *-------*   *-------*                *-------*   *-------*
       |  Agg  |   |  Agg  |                |  Agg  |   |  Agg  |
       *-------*   *-------*                *-------*   *-------*
           |
      *----------*
      |Hypervisor|
      *----------*

          (West Site)                          (East Site)

                     Figure 2 SARP deployment options.

2. Abbreviations Used in this Document

   ARP: Address Resolution Protocol

   FIB: Forwarding Information Base

   IP-D: IP address of the destination virtual machine

   IP-S: IP address of the source virtual machine

   MAC-D: MAC address of the destination virtual machine

   MAC-E: MAC address of the east SARP proxy

   MAC-S: MAC address of the source virtual machine

   MAC-W: MAC address of the west SARP proxy

   ND: Neighbor Discovery

   SARP Proxy: A component that participates in the SARP protocol

   VM: Virtual Machine

3. SARP Description

3.1. Control Plane: ARP/ND

   This section describes the ARP/ND procedure scenarios. In the first
   scenario, VMs share the same site. In the second scenario, the source
   VM is local and the destination VM is located at the remote site.

   In all scenarios, the VMs (source and destination) share the same L2
   broadcast domain.

3.1.1. ARP/ND Request for a Local VM

   When the source and destination VMs are located at the same site, the
   address resolution process is as described in [ARP]. When a VM sends
   an ARP request to learn the IP-to-MAC mapping of another local VM, it
   receives a reply from that VM with the IP-D to MAC-D mapping.

3.1.2. ARP/ND Request for a Remote VM

   When the source and destination VMs are located at different sites,
   the Address Resolution process is as follows.

   In our example, the source VM is located at the west site and the
   destination VM is located at the east site.

   When the source VM sends an ARP/ND request to find out the IP-to-MAC
   mapping of a remote VM, the request is propagated to the Layer 2
   broadcast domain in all sites, including the east site.

   The destination VM responds to the ARP/ND request and transmits an
   ARP/ND reply carrying the IP-D to MAC-D mapping.

   The east SARP proxy functions as the ARP proxy of its local VMs. The
   east SARP proxy modifies the ARP reply message to carry the IP-D to
   MAC-E mapping and forwards the modified ARP reply message to all the
   SARP proxies.

   The west SARP proxy forwards the modified ARP reply message to the
   source VM.

   The west SARP proxy can also function as an ARP cache for the remote
   VMs. By doing so, it significantly reduces the volume of ARP/ND
   transmissions over the network.
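
   The reply handling described above can be sketched as follows. This
   is a minimal model under stated assumptions, not a definitive
   implementation: the ArpReply and SarpProxy classes, their method
   names, and the use of strings such as "MAC-E" in place of real
   addresses are all hypothetical and serve only to show the rewrite
   and the optional caching step.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ArpReply:
    sender_ip: str   # IP address being resolved (IP-D)
    sender_mac: str  # MAC address advertised for that IP address


class SarpProxy:
    """Hypothetical model of ARP reply handling at a SARP proxy."""

    def __init__(self, proxy_mac):
        self.proxy_mac = proxy_mac
        self.remote_cache = {}  # remote VM IP -> remote SARP proxy MAC

    def rewrite_local_reply(self, reply):
        # Reply from a local VM leaving the site: advertise the proxy
        # MAC address in place of the VM MAC address.
        return replace(reply, sender_mac=self.proxy_mac)

    def receive_remote_reply(self, reply):
        # Modified reply arriving from another site: cache it, then
        # forward it unchanged toward the source VM.
        self.remote_cache[reply.sender_ip] = reply.sender_mac
        return reply

    def answer_from_cache(self, target_ip):
        # Return a cached reply, or None if the request has to be
        # propagated to the other sites.
        mac = self.remote_cache.get(target_ip)
        return ArpReply(target_ip, mac) if mac is not None else None


east = SarpProxy("MAC-E")
west = SarpProxy("MAC-W")

reply_from_vm = ArpReply("IP-D", "MAC-D")           # sent by the destination VM
modified = east.rewrite_local_reply(reply_from_vm)  # IP-D to MAC-E
delivered = west.receive_remote_reply(modified)     # reaches the source VM
print(delivered)
```

   A later request for IP-D from another west-site VM could then be
   answered with west.answer_from_cache("IP-D") without crossing the
   transport network.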

3.2. Data Plane: Packet Transmission

3.2.1. Local Packet Transmission

   When a VM transmits packets to a destination VM that is located at
   the same site, there is no change in the data plane. The packets are
   sent from (IP-S, MAC-S) to (IP-D, MAC-D).

3.2.2. Packet Transmission Between Sites

   Packets that are sent between sites traverse the SARP proxies of both
   sites. In our example, all packets sent from the VM located at the
   west site to the destination VM located at the east site traverse the
   west SARP proxy and the east SARP proxy.

   The source VM follows its ARP table and sends packets with (IP-D,
   MAC-E) as the destination addresses and (IP-S, MAC-S) as the source
   addresses.

   The west SARP proxy replaces the source MAC address of the packet
   with its own MAC address (MAC-W), keeps the destination MAC address
   (MAC-E) unchanged, and forwards the packet to the east SARP proxy.

   When the east SARP proxy receives the packet, it replaces the
   destination MAC address with MAC-D, based on the destination IP
   address of the packet (i.e., IP-D), but it does not change the source
   MAC address.
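
   The two MAC rewrites on the path between sites can be sketched as
   follows. The Frame type and the two function names are hypothetical,
   and the local host table of the east proxy is assumed to be a simple
   IP-to-MAC dictionary; the sketch only mirrors the address changes
   described above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def west_proxy_egress(frame, own_mac="MAC-W"):
    # Replace the source MAC address with the proxy's own MAC address;
    # the destination MAC address (MAC-E) is left unchanged.
    return replace(frame, src_mac=own_mac)


def east_proxy_ingress(frame, local_hosts):
    # Look up the destination IP address among the local VMs and set
    # the destination MAC address to that VM's MAC address. The source
    # MAC address is left unchanged.
    return replace(frame, dst_mac=local_hosts[frame.dst_ip])


sent = Frame("MAC-S", "MAC-E", "IP-S", "IP-D")   # as sent by the source VM
in_transit = west_proxy_egress(sent)             # MAC-W -> MAC-E
delivered = east_proxy_ingress(in_transit, {"IP-D": "MAC-D"})
print(delivered)
```

   In this sketch only MAC-W and MAC-E are visible between the two
   proxies, which is why the transport side does not need to learn VM
   MAC addresses.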

3.3. VM Local Migration

   When a VM migrates locally within its site, the SARP protocol is not
   required to perform any action. VM migration is resolved entirely by
   the Layer 2 mechanisms.

3.4. VM Migration from One Site to Another

   VM migration from one site to another is done seamlessly, without
   any change to VM addressing at any level, while keeping VM
   connections up and running.

   In our example, the VM migrates from the west site to the east site.

   VM migration affects VMs and networking elements differently,
   depending on their location:

   -  Origin site (west site)

   -  Destination site (east site)

   -  Other sites

   Origin site:

   The origin site is the site where the VM started its connections
   before the migration, the west site in our example.

   All VMs at the west site that have an ARP entry for IP-D in their ARP
   table have the (IP-D to MAC-D) mapping. The ARP mapping is updated by
   aging or by a gratuitous ARP message that is sent by the new
   hypervisor of the migrated VM and modified by the SARP proxy of the
   east site to carry the (IP-D to MAC-E) mapping. Until the ARP tables
   are updated, the source VMs at the west site continue sending packets
   to MAC-D. Switches at the west site are still configured with the old
   location of MAC-D. This can be resolved by MAC table aging or by
   redirecting the packets to the SARP proxy of the west site.

   Destination site:

   The destination site is the site to which the VM migrated, the east
   site in our example.

   All VMs at the east site that have an ARP entry for IP-D in their ARP
   table have the (IP-D to MAC-W) mapping. The ARP mapping is updated by
   aging or by a gratuitous ARP message sent by the hypervisor with the
   (IP-D to MAC-D) mapping. Until the ARP tables are updated, the source
   VMs at the east site continue to send packets to MAC-W. This can be
   resolved by redirecting the packets from the SARP proxy of the east
   site to the migrated VM by updating the destination MAC of the
   packets to MAC-D.

   Other sites:

   All VMs at the other sites that have an ARP entry for IP-D in their
   ARP table have the (IP-D to MAC-W) mapping. The ARP mapping is
   updated by aging or by a gratuitous ARP message that is sent by the
   new hypervisor of the migrated VM and modified by the SARP proxy of
   the east site to carry the (IP-D to MAC-E) mapping. Until the ARP
   tables are updated, the source VMs at the other sites continue
   sending packets to MAC-W. This can be resolved by redirecting the
   packets from the SARP proxy of the west site to the SARP proxy of the
   east site by updating the destination MAC of the packets to MAC-E.
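
   The redirection used during the transition period can be sketched
   as a single lookup. This is an illustration under assumptions: the
   function name and the per-proxy table of migrated VMs are
   hypothetical, and the text does not specify how a proxy learns that
   a VM has migrated.

```python
def redirect_dst_mac(dst_mac, dst_ip, stale_mac, current_mac_by_ip):
    """Return the destination MAC address a proxy should use.

    stale_mac: the MAC address that not-yet-updated ARP tables still
        point to for the migrated VM.
    current_mac_by_ip: hypothetical table of migrated VMs, mapping the
        VM IP address to the MAC address to use now.
    """
    if dst_mac == stale_mac and dst_ip in current_mac_by_ip:
        return current_mac_by_ip[dst_ip]
    return dst_mac


# East proxy (destination site): packets still addressed to MAC-W are
# steered to the migrated VM itself.
east_result = redirect_dst_mac("MAC-W", "IP-D", "MAC-W", {"IP-D": "MAC-D"})

# West proxy (traffic arriving from other sites): packets still
# addressed to MAC-W are steered to the east SARP proxy.
west_result = redirect_dst_mac("MAC-W", "IP-D", "MAC-W", {"IP-D": "MAC-E"})

# Traffic for a VM that has not migrated is left untouched.
untouched = redirect_dst_mac("MAC-W", "IP-X", "MAC-W", {"IP-D": "MAC-E"})

print(east_result, west_result, untouched)  # prints: MAC-D MAC-E MAC-W
```

   Once the ARP tables of the sending VMs are updated, no packet
   matches the stale entry and the table entry can be dropped.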

3.4.1. ARP/ND Table of Mobile VMs

   The ARP table of a mobile VM migrating from the west site to the
   east site includes entries for VMs located at the following sites:

   -  Origin site (west site)

   -  Destination site (east site)

   -  Other sites

   The IP-to-MAC mapping of VMs located at the other sites is
   unaffected by the migration.

   The IP-to-MAC mapping of VMs located at the east site can be kept
   with no change until the ARP entries age out, since they are mapped
   to MAC-E. Until then, all traffic from the migrated VM to VMs located
   at the east site traverses the SARP proxy of the east site. This can
   be mitigated by an ARP advertisement sent by the SARP proxy of the
   east site or by the hypervisor.

   The IP-to-MAC mapping of VMs located at the west site can be kept
   with no change until the ARP entries age out. All MAC addresses of
   the VMs located at the west site are unknown at the east site. All
   unknown traffic from the VM is intercepted by the SARP proxy of the
   east site and forwarded to the SARP proxy of the west site (only
   until the ARP entries age out). This can be resolved earlier by the
   east SARP proxy: upon receiving unknown packets, it can update the
   migrated VM with the new IP-to-MAC mapping by sending a modified
   gratuitous ARP with the (IP-D to MAC-W) mapping.
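
   The handling of traffic from the migrated VM toward its origin site
   can be sketched as follows. The function name, its return shape, and
   the assumption that the east proxy already knows the MAC address of
   the origin proxy are hypothetical. Here IP-D stands for the IP
   address of the west-site VM that the migrated VM is sending to.

```python
def handle_frame_from_migrated_vm(dst_mac, dst_ip, local_macs,
                                  origin_proxy_mac):
    """Decide what the east SARP proxy does with one frame.

    Returns (forward_to_mac, gratuitous_arp), where gratuitous_arp is
    either None or an (ip, mac) pair to advertise to the migrated VM.
    """
    if dst_mac in local_macs:
        # Destination MAC address is known at this site: no change.
        return dst_mac, None
    # Destination MAC address is unknown at this site: forward to the
    # origin proxy and correct the migrated VM's stale ARP entry.
    return origin_proxy_mac, (dst_ip, origin_proxy_mac)


# The migrated VM still maps IP-D to MAC-D, a west-site VM address that
# is unknown at the east site.
forward_to, garp = handle_frame_from_migrated_vm(
    dst_mac="MAC-D", dst_ip="IP-D",
    local_macs={"MAC-L1", "MAC-L2"},   # hypothetical east-site VM MACs
    origin_proxy_mac="MAC-W",
)
print(forward_to, garp)  # prints: MAC-W ('IP-D', 'MAC-W')
```

   After the gratuitous ARP is processed, the migrated VM addresses
   such packets to MAC-W directly and the interception path is no
   longer exercised.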

   Note that overlay networks providing Layer 2 network virtualization
   services configure their edge device MAC aging timers to be greater
   than the ARP request interval.

3.5. Multicast and Broadcast

   To be added in a future version of this document

3.6. Non IP packet

   To be added in a future version of this document

3.7. ARP caching

   To be added in a future version of this document

3.8. SARP Interaction with Overlay networks

   SARP interaction with overlay networks providing Layer 2 network
   virtualization (such as IP, VPLS, OTV, NVGRE, and VXLAN) is efficient
   and scalable.

   The mapping of SARP to overlay networks is straightforward. The VM
   maps the destination IP address to the MAC address of the SARP proxy.
   The overlay network maps the proxy MAC address to the correct tunnel.
   SARP significantly scales down the complexity of the overlay networks
   and transport networks by reducing the mapping tables to the number
   of SARP proxies.

4. Security Considerations

   Security considerations will be added in a future version of this
   document.

5. IANA Considerations

   There are no IANA actions required by this document.

   RFC Editor: please delete this section before publication.

6. References

6.1. Normative References

   [ARP]         Plummer, D., "An Ethernet Address Resolution Protocol",
                 RFC 826, November 1982.

   [ND]          Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
                 "Neighbor Discovery for IP version 6 (IPv6)", RFC
                 4861, September 2007.

6.2. Informative References

   [ARMD]        Narten, T., Karir, M., and I. Foo, "Problem Statement
                 for ARMD", draft-ietf-armd-problem-statement, February
                 2012.

7. Acknowledgments

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Youval Nachum
   Marvell
   6 Hamada St.
   Yokneam, 20692 Israel
   Email: youvaln@marvell.com

   Tal Mizrahi
   Marvell
   6 Hamada St.
   Yokneam, 20692 Israel
   Email: talmi@marvell.com

   Ilan Yerushalmi
   Marvell
   6 Hamada St.
   Yokneam, 20692 Israel
   Email: yilan@marvell.com
