Network working group                                         L. Dunbar
Internet Draft                                                 A. Malis
Intended status: Standards Track                                  Huawei
Expires: October 2014


                                                         April 29, 2014



          Framework for Service Function Instances Restoration
           draft-dunbar-sfc-fun-instances-restoration-00.txt


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79. This document may not be modified,
   and derivative works of it may not be created, except to publish it
   as an RFC and to translate it into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on October 30, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.





Dunbar, et al.         Expires October 29, 2014                [Page 1]


Internet-Draft    SF Instances Restoration Framework         April 2014


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   This draft describes a framework for the protection and restoration
   of a Service Chain Instance Path when some instances on the path
   fail or need to be replaced.

Table of Contents


   1. Introduction
   2. Conventions used in this document
   3. Definition of Terms
   4. Background
      4.1. Multiple Instances of one Service Function
      4.2. Multiple ways for expressing Service Chain Instance Path
      4.3. Virtualized Service Function Instances impact to Service
           Chain
   5. Local Restoration of Service Function Instances
   6. Global Restoration of Service Function Instances
      6.1. Encoding the Exact Instance Path in Data Packets
      6.2. In-Band Signaling of an Instance Path change
      6.3. Out-Of-Band Signaling of an Instance Path change
      6.4. Provisioning an Instance Path change
      6.5. Hybrid Method
   7. Regional Restoration of Service Function Instances
   8. Conclusion and Recommendation
   9. Manageability Considerations
   10. Security Considerations
   11. IANA Considerations
   12. References
      12.1. Normative References
      12.2. Informative References
   13. Acknowledgments

1. Introduction

   This draft describes the framework for protection and restoration of
   a Service Chain Instance Path when some instances on the path fail
   or need to be replaced.

   Protection and restoration become more crucial in virtualized
   environments (e.g. ETSI NFV), where there is a higher chance of
   service function instances failing, being decommissioned, or
   becoming over-utilized.



2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance.



3. Definition of Terms

   NFV:     Network Function Virtualization [NFV-Terminology].

   SF:      Service Function [SFC-Problem].

   SFF:     Service Function Forwarder.

   SFIC:    Service Function Instance Component.  One service function
   (e.g. NAT44) could have two different service function
   instantiations, one that applies policy-set-A (NAT44-A) and another
   that applies policy-set-B (NAT44-B). There could be multiple
   "entities" of NAT44-A (e.g. when one "entity" only has 10G
   capability), and many "entities" of NAT44-B. Each entity has its
   own unique address. The "entity" in this context is called a
   "Service Function Instance Component" (SFIC).

   Service Chain: The sequence of service functions at the functional
   level, e.g. Chain#1 {s1, s4, s6} or Chain#2 {s4, s7}. Also see the
   definition of "Service Function Chain" in [SFC-Problem].

   Service Chain Instance Path: The actual Service Function Instance
   Components selected for a service chain.

   VNF:     Virtualized Network Function [NFV-Terminology].






4. Background

4.1. Multiple Instances of one Service Function

   One service function (say, NAT44) could have two different service
   function instantiations, one that applies policy-set-A (NAT44-A)
   and another that applies policy-set-B (NAT44-B). There could be
   multiple "entities" of NAT44-A (e.g. when one "entity" only has 10G
   capability), and many "entities" of NAT44-B. Each entity has its own
   unique address (or Locator in [SFC-Reduction]). The "entity" in this
   context is called a "Service Function Instance Component" (SFIC).

   Identical SFICs could be attached to different Service Function
   Forwarder (SFF) nodes. It is also possible to have multiple
   identical SFICs attached to one Service Function Forwarder (SFF)
   node, especially in a Network Function Virtualization (NFV)
   environment where each SFIC is a virtual service function instance
   with limited capacity.

   At the functional level, the order of service functions, e.g.
   Chain#1 {s1, s4, s6} or Chain#2 {s4, s7}, is important, but very
   often it does not matter which SFIC of Service Function "s1" is
   selected for Chain#1. It is also possible that multiple SFICs of
   one service function can be reached by different network nodes.
   The set of SFICs actually selected for a service chain is called
   the "Service Chain Instance Path".
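
   As a minimal illustration of the distinction above, the following
   Python sketch models a service chain at the functional level and two
   possible Service Chain Instance Paths for it. The class name, field
   names, node names, and addresses are hypothetical and are not
   defined by this draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SFIC:
    """One Service Function Instance Component (hypothetical model)."""
    function: str   # service function it implements, e.g. "s1"
    address: str    # unique address (locator) of this component
    sff: str        # SFF node the component is attached to


# Functional-level chain: only the order of service functions matters.
CHAIN_1 = ["s1", "s4", "s6"]

# Illustrative inventory; all addresses are made-up documentation values.
INVENTORY = [
    SFIC("s1", "192.0.2.1", "SFF-1"),
    SFIC("s1", "192.0.2.2", "SFF-1"),
    SFIC("s4", "192.0.2.10", "SFF-1"),
    SFIC("s6", "192.0.2.20", "SFF-i"),
]


def is_valid_instance_path(chain, path):
    """True if `path` picks exactly one SFIC per function, in chain order."""
    return [c.function for c in path] == list(chain)


# Two different instance paths that realize the same functional chain.
PATH_A = [INVENTORY[0], INVENTORY[2], INVENTORY[3]]
PATH_B = [INVENTORY[1], INVENTORY[2], INVENTORY[3]]
```

   Both PATH_A and PATH_B satisfy Chain#1; they differ only in which
   SFIC of "s1" is used.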

4.2. Multiple ways for expressing Service Chain Instance Path

   How SFICs are selected for a given Service Chain to form the actual
   Service Chain Instance Path is outside the scope of this draft. It
   is assumed that there is an entity (e.g. service chain orchestration
   system) that is responsible for selecting the SFICs for a Service
   Chain.

   This document focuses on how Service Function Forwarder nodes or
   network nodes are informed of the selected SFICs for a particular
   Service Chain, especially when the SFICs on the Service Chain
   change. To simplify the description, the following Service Chain
   reference architecture is used:










                              |1  -----   |n        |21   ---- |2m
                    +---+---+   +---+---+   +-+---+   +--+-----+
                    | SF#1  |   |SF#n   |   |SF#i1|   |SF#im   |
                    |       |   |       |   |     |   |        |
                    +---+---+   +---+---+   +--+--+   +--+--+--+
                        :           :          :         :  :
                        :           :          :         :  :
                         \         /            \       /
       +--------------+   +--------+             +---------+
   -- >| Chain        |   | SFF    |   ------    | SFF     | ---->
       |classifier    |   |Node-1  |             | Node-i  |
       +--------------+   +----+---+             +----+--+-+
                     \         |                     /
                      \        | SFC Encapsulation  /
                       \       |                   /
                ,. ......................................._
              ,-'                                        `-.
             /                                              `.
            |                      Network                   |
             `.                                             /
                `.__.................................. _,-'

                     Figure 1: Framework of Service Chain



   Some head end Service Chain Classifiers can be configured with (or
   have the ability to specify) the exact Service Chain Instance Path
   for a given service chain. Under this scenario, the exact Service
   Chain Instance Path can be expressed by:

     - Being encoded in every data packet;
     - Being signaled in-band via the data path from the head end
       Service Chain Classifier node to all the relevant nodes to
       install the appropriate flow steering policies (similar to MPLS
       traffic engineering signaling);
     - Being sent as out-of-band control messages to all the relevant
       nodes to install the appropriate flow steering policies (similar
       to GMPLS signaling); or
     - Being provisioned into each node by a centralized network
       controller (similar to SDN) or by a network management system.







   The benefit of encoding the exact path in every data packet is less
   contention when the Service Chain Instance Path changes. However,
   there are major drawbacks, such as:

     - extra packet header fields are needed to carry the exact
       instance path, which can increase the likelihood of packet
       fragmentation due to MTU size, and
     - extra encapsulation processing load is placed on the head end
       Service Chain Classifier node.

   Packet fragmentation and reassembly are very processor and memory
   intensive. Good practice is to avoid packet fragmentation and
   reassembly as much as possible. Carrying an exact instance path in
   every packet might be possible if service function instances can be
   represented by compact labels, similar to the MPLS label stack.
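
   The fragmentation concern can be made concrete with some rough
   arithmetic. The sketch below is only illustrative: the 1500-byte
   MTU, the 1440-byte packet, and the six-instance chain are assumed
   values, and the two per-instance sizes (16 bytes, as for an IPv6
   address used as a locator, and 4 bytes, as for an MPLS label stack
   entry) are example encodings rather than anything specified by this
   draft.

```python
def encoded_path_bytes(num_instances, bytes_per_instance):
    """Bytes added to each packet to list every instance on the path."""
    return num_instances * bytes_per_instance


def needs_fragmentation(payload_bytes, num_instances, bytes_per_instance,
                        mtu=1500):
    """True if the packet plus the encoded instance path exceeds the MTU."""
    total = payload_bytes + encoded_path_bytes(num_instances,
                                               bytes_per_instance)
    return total > mtu


# A packet that already fills most of a 1500-byte MTU, on a 6-instance chain.
PAYLOAD = 1440
as_ipv6_addresses = needs_fragmentation(PAYLOAD, 6, 16)  # 1440 + 96 = 1536
as_compact_labels = needs_fragmentation(PAYLOAD, 6, 4)   # 1440 + 24 = 1464
```

   Under these assumptions, the address-based encoding pushes the
   packet over the MTU while the label-based encoding does not.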

  When the in-band or out-of-band signaling methods are used, i.e.
  sending flow steering policies to relevant SFF nodes or network
  nodes, the packets associated with a specific flow can be classified
  with a simple identifier (or Service Chain ID). Packet size is
  smaller and processing at the SC Classifier can be simpler as well.

   The out-of-band method doesn't even require the head end Service
   Chain Classifier to be configured with, or to be capable of
   specifying, the exact Service Chain Instance Path. The out-of-band
   steering policies can be sent from an external entity, such as a
   centralized network controller or service chain orchestration
   system. Under this scenario, the head end Service Chain Classifier
   node does not need to be aware of any change to the instances on
   the chain.

   At times it might not be feasible for the head end Service Chain
   Classifier to be aware of the exact instances selected for a given
   Service Chain because they are managed by different administrative
   entities.

   If each Service Function has a large number of SFICs, it scales
   better if the Service Chain classifier only identifies the service
   chain at the functional level, and there is another entity managing
   the detailed service instance path.



4.3. Virtualized Service Function Instances impact to Service Chain

   When a Service Chain Instance Path consists of virtualized service
   function instances, e.g. in an ETSI NFV environment, the likelihood
   or frequency of changes to the Service Chain Instance Path might be
   higher due to:

     - a higher failure rate of virtualized service function instances,
       because most of them will not have built-in protection
       mechanisms, and
     - the relative ease of replacing over-utilized instances with
       other instances, or of instantiating more instances to take over
       the workload.


5. Local Restoration of Service Function Instances

   When one SF Forwarder (SFF) node has multiple Service Function
   Instance Components (SFICs) of the same service function attached,
   the SFF can make a local decision on which instance is selected for
   a specific service chain.

   For example, in the diagram below, SF Forwarder (SFF) "A" has two
   instances of Service Function #7 (SF7-1 and SF7-2) and three
   instances of Service Function #2 (SF2-2, SF2-4, and SF2-5).

                        +----+  +---+   +---+   +---+
                        | SF2|  |SF2|   |SF2|   |SFx|
                        | -2 |  |-4 |   |-5 |   |-1 |
                        +----+  +---+   +---+   +---+
                           |      |       |       |
                           +------+-------+-------+
                                  |
                    +----+  +---+ | +---+   +---+
                    | SF7|  |SF7| | |SF5|   |SF5|
                    | -1 |  |-2 | | |-2 |   |-4 |
                    +----+  +---+ | +---+   +---+
                        :         / /       /
                        :        / / /-----/
                         \      / / /
       +--------------+   +----------+      +----+
   -- >| Chain        |-- | SFF      |------| SFF| ---->
       |classifier    |   | A        |      | C  |
       +--------------+   +----------+      +----+

         Figure 2: Local Restoration among multiple service instances





   For a service chain that consists of "Service Function #7" followed
   by "Service Function #2", which is represented by SF7->SF2, the
   steering policy to SFF "A" could be:

   {SF7-1, SF7-2} -> {SF2-2, SF2-4, SF2-5}.

   The components within each {} are equivalent function instances
   among which SFF "A" can select locally.

   When one service function instance fails, SFF "A" can locally
   choose another instance without informing the SC Classifier node or
   any other SFF or network node.

   Local protection and restoration are relatively simple and clean.
   ECMP can be used to balance load across all the available service
   function instances attached locally.
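
   One way to picture the local decision is the following Python
   sketch. It is not a specified algorithm: the per-flow CRC32 hash
   stands in for whatever ECMP-style hashing an SFF might use, and the
   function name, flow identifier, and policy layout are hypothetical.

```python
import zlib

# Steering policy for chain SF7->SF2 at SFF "A": one list of equivalent
# instances per service function, in chain order.
POLICY_A = [
    ["SF7-1", "SF7-2"],
    ["SF2-2", "SF2-4", "SF2-5"],
]


def select_instances(policy, flow_id, failed=frozenset()):
    """Pick one live instance per function using a stable per-flow hash."""
    key = zlib.crc32(flow_id.encode())
    chosen = []
    for candidates in policy:
        # Only instances that have not been marked as failed are eligible.
        live = [c for c in candidates if c not in failed]
        if not live:
            raise RuntimeError("no local instance left; local repair fails")
        chosen.append(live[key % len(live)])
    return chosen
```

   Calling select_instances(POLICY_A, "flow-1") and then calling it
   again with the first chosen instance in "failed" yields a different
   SF7 instance without involving any other node. With this simple
   modulo hash, a failure may also move flows that were not using the
   failed instance. If every instance of a function has failed, local
   repair is not possible and a wider restoration is needed.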



6. Global Restoration of Service Function Instances

   Sometimes changing the Service Chain Instance Path involves using
   service function instances at different SF Forwarder (SFF) nodes.

   For example, for a Chain #7 -> #2 -> #3 -> #5 in the figure above:

       - Original instance path: #7 & #2 at SFF "A"; #3 & #5 at SFF
          "C".

       - New instance path: #7 at SFF "A" and #2 & #3 & #5 at SFF "C".

  This section examines possible ways to achieve the restoration when
  the change of instance path involves multiple nodes.

6.1. Encoding the Exact Instance Path in Data Packets

   If the detailed Service Chain Instance Path is encoded in data
   packets, the SC Classifier can be notified of the change and encode
   the new instance path in the data packets of the flow. This method
   won't cause any contention issues among the involved nodes.

   As mentioned in Section 4.2, encoding the exact instance path in
   every packet can cause packet fragmentation, which is very
   processing intensive. Therefore, it's not optimal to require every
   data packet to carry an exact instance path, especially when the
   Service Chain Instance Path changes infrequently, e.g. on a scale
   of minutes or hours.



6.2. In-Band Signaling of an Instance Path change

   A method similar to MPLS RSVP-TE [RSVP-TE] signaling can be
   considered for the head end node to signal a required service
   instance path, and then let the data packets traverse the
   established path.

   The drawback of this approach is that the head end node might
   receive packets belonging to the service chain before the instance
   path has been established. This is very similar to the issue
   encountered by MPLS Fast Reroute [FRR]: when no backup path has
   been pre-established, packets have to be dropped while a
   restoration path is being dynamically signaled.

6.3. Out-Of-Band Signaling of an Instance Path change

  If the out-of-band method is used, i.e. sending the updated flow
  steering policies to indicate the changes of the instance path, there
  could be issues of synchronization and race conditions. For example,
  if the SFF "A" and SFF "C" get flow steering policies at slightly
  different times, some packets of the flow might miss some service
  functions on the chain.
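
   The race can be illustrated with the Chain #7 -> #2 -> #3 -> #5
   example from the start of this section. The Python sketch below is
   a deliberately simplified model (each SFF applies its locally
   configured functions in order, and packets visit SFF "A" and then
   SFF "C"); the function names are hypothetical.

```python
# Functions applied by each SFF for chain #7 -> #2 -> #3 -> #5.
OLD_POLICY = {"A": ["#7", "#2"], "C": ["#3", "#5"]}
NEW_POLICY = {"A": ["#7"], "C": ["#2", "#3", "#5"]}
CHAIN = ["#7", "#2", "#3", "#5"]


def functions_applied(policy_at_a, policy_at_c):
    """Functions a packet receives when it traverses SFF A and then SFF C."""
    return policy_at_a["A"] + policy_at_c["C"]


def missed_functions(policy_at_a, policy_at_c):
    """Chain functions that the packet never received."""
    applied = functions_applied(policy_at_a, policy_at_c)
    return [f for f in CHAIN if f not in applied]


# SFF A has already switched to the new policy, SFF C has not yet.
a_first = missed_functions(NEW_POLICY, OLD_POLICY)
# SFF C has switched, SFF A has not: "#2" is applied twice instead.
c_first = functions_applied(OLD_POLICY, NEW_POLICY)
```

   In this model, packets forwarded while only SFF "A" has been updated
   skip function #2, and packets forwarded while only SFF "C" has been
   updated receive function #2 twice.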

6.4. Provisioning an Instance Path change

   In SDN or SDN-like environments, changes to the Instance Path can be
   provisioned or programmed into network nodes via a central
   controller or Network Management System (NMS). This simplifies the
   nodes, since they are not required to use a signaling protocol, but
   problems (such as loops or dropped packets) may be introduced if
   network nodes are not updated in the proper order or within a short
   time of each other; the nodes should be updated on a time scale
   similar to that of a signaling protocol. In addition, the network
   may have a single point of failure if the controller or NMS is not
   itself redundant.

6.5. Hybrid Method

   For global restoration of service function instances, it is
   worthwhile to explore a hybrid mode: when a change involves using
   service instances at different SFF nodes, the SC Classifier node is
   informed to encode the detailed instance path into data packets
   until all the involved SFF nodes complete the installation of the
   new steering policy for the flow.
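
   A minimal sketch of the classifier side of such a hybrid mode is
   shown below, assuming that each SFF node somehow confirms when its
   policy installation is complete. The class, its methods, and the
   header layout are hypothetical; this draft does not define a
   confirmation mechanism or an encoding.

```python
class HybridClassifier:
    """Hypothetical SC Classifier that falls back to explicit path encoding."""

    def __init__(self, chain_id):
        self.chain_id = chain_id
        self.explicit_path = None   # set only during a path change
        self.pending = set()        # SFF nodes that have not confirmed yet

    def start_path_change(self, new_path, sff_nodes):
        """Begin encoding `new_path` until every SFF in `sff_nodes` confirms."""
        self.pending = set(sff_nodes)
        # With no SFF to wait for, there is nothing to bridge.
        self.explicit_path = list(new_path) if self.pending else None

    def policy_installed(self, sff_node):
        """Record that `sff_node` finished installing the new policy."""
        self.pending.discard(sff_node)
        if not self.pending:
            self.explicit_path = None   # back to the compact chain ID

    def header_for_packet(self):
        """Metadata the classifier attaches to the next packet of the flow."""
        header = {"chain_id": self.chain_id}
        if self.explicit_path is not None:
            header["instance_path"] = list(self.explicit_path)
        return header
```

   In this sketch, packets carry only the chain identifier in steady
   state and carry the full instance path only during the transition
   window, which limits the per-packet overhead to the time when it is
   needed to avoid the race described in Section 6.3.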







7. Regional Restoration of Service Function Instances

   It might not always be feasible for the head end Service Chain
   Classifier to be aware of the exact instances selected for a given
   Service Chain, because the instances are managed by multiple
   administrative entities. In that case, regional restoration should
   be considered.

   Regional restoration can take an approach similar to global
   restoration: choosing a regional ingress node that can take over
   the responsibility of installing the new steering policies on the
   involved SFF nodes or network nodes.

   The Regional ingress node should be:

       - on the data path of the flow of the given service chain;

       - in front of the relevant SFF nodes or network nodes that
          are impacted by the change of the Service Chain Instance
          Path;

       - capable of encoding the detailed Service Chain Instance Path
          into the data packets of the identified flow; and

       - capable of removing the detailed Service Chain Instance Path
          encoding from data packets after all the impacted SFF nodes
          and network nodes have completed the policy installation.
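
   The criteria above can be read as a simple selection procedure. The
   Python sketch below is one possible interpretation, not a specified
   algorithm: it assumes the data path is known as an ordered list of
   node names, treats the two encoding capabilities as a single
   "can_encode" set, and picks the eligible node closest to the
   impacted nodes, which is a design choice this draft does not make.

```python
def pick_regional_ingress(data_path, impacted, can_encode):
    """Return the closest eligible node in front of all impacted nodes.

    data_path  -- node names in forwarding order for the flow
    impacted   -- nodes whose steering policy must change
    can_encode -- nodes able to add and later remove the path encoding
    """
    positions = [data_path.index(n) for n in impacted if n in data_path]
    if not positions:
        return None
    first_impacted = min(positions)
    # Walk backwards from the first impacted node towards the head end.
    for node in reversed(data_path[:first_impacted]):
        if node in can_encode:
            return node
    return None
```

   For a data path of ["classifier", "R1", "R2", "A", "C"] with "A" and
   "C" impacted, the sketch returns "R1" when only "classifier" and
   "R1" can encode the path, and returns nothing when no capable node
   lies in front of the impacted nodes.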





8. Conclusion and Recommendation

   TBD

9. Manageability Considerations

   TBD

10. Security Considerations

   TBD

11. IANA Considerations

   This document requires no IANA actions. RFC Editor: Please remove
   this section before publication.



12. References

12.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

12.2. Informative References

   [SFC-Problem] Quinn, P., et al., "Service Function Chaining Problem
             Statement", draft-ietf-sfc-problem-statement-02, work in
             progress, April 2014.

   [NFV-Terminology] ETSI NFV ISG, "Network Functions Virtualisation
             (NFV); Terminology for Main Concepts in NFV", ETSI GS NFV
             003 V1.1.1, October 2013,
             http://www.etsi.org/deliver/etsi_gs/NFV/001_099/003/
             01.01.01_60/gs_NFV003v010101p.pdf

   [SFC-Reduction] Parker, R., "Service Function Chaining: Chain to
             Path Reduction", draft-parker-sfc-chain-to-path-00, work
             in progress, November 2013.

   [RSVP-TE] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
             and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
             Tunnels", RFC 3209, December 2001.

   [FRR]     Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
             Extensions to RSVP-TE for LSP Tunnels", RFC 4090, May
             2005.

13. Acknowledgments

   Many thanks to Ron Bonica for the discussion in formulating the
   content for the draft.

   This document was prepared using 2-Word-v2.0.template.dot.














Authors' Addresses

   Linda Dunbar
   Huawei Technologies
   5340 Legacy Drive, Suite 175
   Plano, TX 75024, USA
   Phone: (469) 277 5840
   Email: ldunbar@huawei.com


   USA
   Email: rbonica@juniper.net

   Andrew G. Malis
   Huawei Technologies
   USA
   Email: agmalis@gmail.com































