Application-Initiated Flow High Availability Awareness through PCP
draft-vinapamula-flow-ha-07

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft that was ultimately published as RFC 7767.
Authors Suresh Vinapamula, Senthil Sivakumar, Mohamed Boucadair, Tirumaleswar Reddy.K
Last updated 2014-11-10
RFC stream (None)
IETF conflict review conflict-review-vinapamula-flow-ha
Stream Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state Became RFC 7767 (Informational)
Telechat date (None)
Responsible AD (None)
Send notices to (None)
Network Working Group                                      S. Vinapamula
Internet-Draft                                          Juniper Networks
Intended status: Standards Track                            S. Sivakumar
Expires: May 14, 2015                                      Cisco Systems
                                                            M. Boucadair
                                                          France Telecom
                                                                T. Reddy
                                                                   Cisco
                                                       November 10, 2014

   Application-Initiated Flow High Availability Awareness through PCP
                      draft-vinapamula-flow-ha-07

Abstract

   This document specifies a mechanism for a host to signal via Port
   Control Protocol (PCP) which connections should be protected against
   network failures.  These connections will be elected to be subject to
   high availability mechanisms enabled at the network side.

   This approach assumes that applications/users have more visibility
   about sensitive connections rather than any heuristic that can be
   enabled at the network side to guess which connections should be
   secured.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 14, 2015.

Vinapamula, et al.        Expires May 14, 2015                  [Page 1]
Internet-Draft               HA through PCP                November 2014

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Issues with the existing implementations
   3.  CHECKPOINT_REQUIRED PCP Option
     3.1.  Format
     3.2.  Behavior
   4.  Typical Usage Examples
   5.  Signaling HA for other Network Functions
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References

1.  Introduction

   Internet service continuity is critical in Service Providers'
   environments and in Enterprise networks.  To achieve it, most
   Service Providers deploy active-backup systems.  These systems not
   only preserve service continuity during failover, but also allow
   hitless or minimal-hit upgrades of both software and hardware, and
   help achieve the desired level of service continuity compliance.

   Some network functions maintain state for every connection in order
   to process subsequent packets of that connection.  For those
   connections to continue on the backup when the active system fails,
   the corresponding state has to be check-pointed on the backup.  NAPT
   is one such network function, where state is maintained for every
   connection.

   Heuristics based on the protocol, mapping lifetime, etc. are used at
   the network side to elect which connections are subject to High
   Availability (HA) mechanisms.  This document advocates for an
   application-initiated approach that allows applications/users to
   signal to the network which of their connections are critical.

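   PCP-initiated signaling is more deterministic than heuristics
   deployed at the network side, because the criticality of a
   connection is stated by the application/user instead of being
   guessed by the network.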

   This document specifies how PCP can be extended to signal which
   connections should be subject to HA mechanisms.  This document does
   not make any assumption about the PCP-controlled device that will
   make use of the content of signals issued by PCP clients.  These
   devices are likely to be flow-aware.

   The proposed approach is aligned with current networking trends
   advocating for open network APIs to interact with
   applications/services.  The policy decision-making process at the
   network side will be enriched with information signaled by
   applications, for instance using PCP.

2.  Issues with the existing implementations

   In a high availability (HA) deployment, it is expensive in terms of
   memory, CPU, and other resources to checkpoint the state of all
   connections.  Moreover, check-pointing may not be required for all
   connections, as not all connections are critical.  The challenge is
   then to identify which connections to checkpoint.

   Typically, this is addressed by identifying long-lived connections
   and check-pointing to the backup, for service continuity, the state
   of only those connections that have lived long enough.

   However, this approach has the following issues:

   1.  It is hard for a network to identify/guess which connection is
       (business) critical.  This characterization is mainly subscriber-
       specific: a flow can be sensitive for a User#1 while it is not
       for another User#2.  Furthermore, this characterization can vary
       in time: a flow can be sensitive in hour X, while it is not
       later.

   2.  Heuristics are not deterministic.

   3.  A connection that could potentially be long-lived would face
       service disruption if the active system fails before the
       connection has lived long enough to be check-pointed.

   4.  A connection may be short-lived but critical, such as a short
       Voice over IP (VoIP) conversation.

   5.  Similarly, not every long-lived connection needs to be critical:
       for example, a free-service connection of a hosted service need
       not be check-pointed, while a paid-service connection has to be
       check-pointed.

3.  CHECKPOINT_REQUIRED PCP Option

3.1.  Format

   This proposal is based on the assumption that an application or user
   is the best judge of which of its connections are critical.

   An application/user may indicate the desire for check-pointing
   through a PCP client, using the CHECKPOINT_REQUIRED option described
   in Figure 1.

   The entry to be backed up is indicated by the content of a MAP or
   PEER message.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Option Code=TBA|  Reserved     |        Option Length          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Option Name: CHECKPOINT_REQUIRED
            Number: <TBA>
            Purpose:  Indicate if an entry needs to be check-pointed.
            Valid for Opcodes: MAP, PEER
            Length: 0.
            May appear in: request, response.
            Maximum occurrences: 1.

                 Figure 1: CHECKPOINT_REQUIRED PCP Option

   The description of the fields is as follows:

   o  Option Code: To be assigned by IANA.

   o  Reserved: This field is initialized as specified in Section 7.3 of
      [RFC6887].

   o  Option Length: 0.  This means no data is included in the option.
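
   As an illustration of the format in Figure 1, the following is a
   minimal Python sketch of how a PCP client might serialize the option
   header.  The function name and the value used in the usage line are
   assumptions made for this example only; the actual Option Code is to
   be assigned by IANA, so it is passed in by the caller.

```python
import struct


def encode_checkpoint_required(option_code: int) -> bytes:
    """Serialize a CHECKPOINT_REQUIRED option header (no option data).

    option_code is supplied by the caller because the draft leaves the
    value to be assigned by IANA.
    """
    # Optional-to-process options have the most significant bit of the
    # 8-bit Option Code set (RFC 6887, Section 7.3).
    if not 128 <= option_code <= 255:
        raise ValueError("option code is not optional-to-process")
    reserved = 0       # set to 0 on transmission (RFC 6887, Section 7.3)
    option_length = 0  # no data is included in the option
    # Network byte order: 8-bit code, 8-bit reserved, 16-bit length.
    return struct.pack("!BBH", option_code, reserved, option_length)


# Hypothetical placeholder code, for illustration only.
header = encode_checkpoint_required(200)
```

   Since the Option Length is 0, the option occupies exactly four
   octets and needs no padding.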

   It was tempting to include additional fields in the option, but this
   would lead to a more complex design that is not justified, e.g.:

   o  Define a dedicated field to indicate a priority level.  This
      priority is intended to be used by the PCP server as a hint when
      processing a request with a CHECKPOINT_REQUIRED option.
      Nevertheless, an application may systematically choose to set the
      priority level to the highest value so that it increases its
      chance to be serviced.

   o  Return a more granular failure error code to the requesting PCP
      client.  Nevertheless, this would require extra processing at
      both the PCP client and server sides for handling the various
      error codes without any guarantee for the PCP client to have its
      mappings check-pointed.

   An application or user can use this option to indicate that one or
   more of its connections are critical and disruption is not desired.
   Doing so will trigger check-pointing of state to the backup.

   Communication between application/user and PCP client is
   implementation-specific.

3.2.  Behavior

   Support for the CHECKPOINT_REQUIRED option by PCP servers and PCP
   clients is optional.  This option (Code TBA; see Figure 1) MAY be
   included in a PCP MAP/PEER request to indicate a connection is to be
   protected against network failures.

   The PCP client includes a CHECKPOINT_REQUIRED option in a MAP or PEER
   request to signal that the corresponding mapping is to be protected.

   A PCP server MAY ignore the CHECKPOINT_REQUIRED option sent to it by
   a PCP client (e.g., if it does not support the option or if it is
   configured to ignore it).  To signal that it has not accepted the
   option, a PCP server simply does not include the CHECKPOINT_REQUIRED
   option in the response.  If the PCP client does not receive a
   CHECKPOINT_REQUIRED option in a response to a request enclosing a
   CHECKPOINT_REQUIRED option, this means the PCP server does not
   support the option or it is configured to ignore it.
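
   The client-side check described above can be sketched as follows.
   This is only an illustration: the helper names are hypothetical, the
   input is assumed to be the options area of a PCP response (the bytes
   that follow the opcode-specific data), and the Option Code is passed
   in by the caller because it is still to be assigned.

```python
import struct


def option_codes(options: bytes) -> list:
    """Return the option codes found in the options area of a message."""
    codes = []
    offset = 0
    while offset + 4 <= len(options):
        code, _reserved, length = struct.unpack_from("!BBH", options, offset)
        codes.append(code)
        # Option data is padded to a multiple of 4 octets; the length
        # field does not count the padding (RFC 6887, Section 7.3).
        offset += 4 + ((length + 3) // 4) * 4
    return codes


def checkpoint_accepted(response_options: bytes, option_code: int) -> bool:
    """True if the server echoed CHECKPOINT_REQUIRED in its response."""
    return option_code in option_codes(response_options)
```

   A False result only tells the client that the mapping is not
   check-pointed; it cannot distinguish a server that does not support
   the option from one that is configured to ignore it.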

   If the CHECKPOINT_REQUIRED option is not included in the PCP client
   request, the PCP server does not include the CHECKPOINT_REQUIRED
   option in the associated response.  This is mainly because there is
   no valid motivation that would justify a PCP server notifying a PCP
   client about its reliability decision.

   When the PCP server receives a CHECKPOINT_REQUIRED option, the PCP
   server checks if it can honor this request depending on whether
   resources are available for check-pointing.  If there are no
   resources available for check-pointing, but there are resources
   available to honor the MAP/PEER request, a response is sent back to
   the PCP client without including the CHECKPOINT_REQUIRED option
   (i.e., the request is processed as any MAP/PEER request that does not
   convey a CHECKPOINT_REQUIRED option).  If check-pointing resources
   are still available and the quota for this PCP client is not reached,
   the PCP server tags the corresponding entry as eligible for HA
   mechanisms and sends back the CHECKPOINT_REQUIRED option in the
   positive answer to the PCP client.
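
   The decision described above can be sketched as follows.  This is a
   simplified model, not a server implementation: the class, the
   function, and the dictionary keys are hypothetical, and the MAP/PEER
   request itself is assumed to have already been honored.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointPolicy:
    capacity: int          # total entries that may be check-pointed
    per_client_quota: int  # entries allowed per PCP client
    in_use: int = 0
    per_client: dict = field(default_factory=dict)

    def try_reserve(self, client: str) -> bool:
        """Reserve check-pointing resources for one entry, if possible."""
        used = self.per_client.get(client, 0)
        if self.in_use >= self.capacity or used >= self.per_client_quota:
            return False
        self.in_use += 1
        self.per_client[client] = used + 1
        return True


def handle_request(policy: CheckpointPolicy, client: str,
                   wants_checkpoint: bool) -> dict:
    """Decide whether the entry is tagged and whether the option is echoed."""
    honored = wants_checkpoint and policy.try_reserve(client)
    # The option is returned only when the entry is tagged for HA.
    return {"checkpoint": honored, "echo_option": honored}
```

   With a capacity of 2 and a per-client quota of 1, a second request
   from the same client is still mapped but is answered without the
   CHECKPOINT_REQUIRED option.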

   To update the check-pointing behavior of a mapping maintained by the
   PCP server, the PCP client generates a PCP MAP/PEER renewal request
   that either includes a CHECKPOINT_REQUIRED option, to indicate that
   this mapping has to be check-pointed, or omits it, to indicate that
   this mapping no longer needs to be check-pointed.  Upon receipt of
   the PCP request, the PCP server performs the same operations as for
   validating any MAP/PEER request that updates an existing mapping.
   If the validation checks pass, the PCP server updates the
   check-point flag associated with that mapping accordingly (i.e., it
   is set if a CHECKPOINT_REQUIRED option was included in the update
   request, or cleared if no CHECKPOINT_REQUIRED option was included),
   and the PCP server returns the response to the PCP client
   accordingly.
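
   The renewal behavior can be summarized by the following sketch.  The
   function and the "checkpoint" key are hypothetical, and the
   resource and quota checks that apply when the flag is set are
   deliberately left out to keep the example short.

```python
def apply_renewal(mapping: dict, option_present: bool,
                  validated: bool) -> dict:
    """Return the mapping state after a MAP/PEER renewal request."""
    if not validated:
        # A request that fails validation leaves the mapping untouched.
        return mapping
    updated = dict(mapping)
    # Set the flag if the option was included, clear it otherwise.
    updated["checkpoint"] = option_present
    return updated
```

   Note that a client that wants a mapping to remain check-pointed has
   to include the option in every renewal, since omitting it clears the
   flag.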

   What information to checkpoint and how to checkpoint it are out of
   scope of this document and are left to implementations.  Also, the
   decision by users/applications to request check-pointing in a PCP
   request may be automatic, semi-automatic, or made by human
   intervention.  This behavior is also left to application
   implementations.

   It is RECOMMENDED to checkpoint state on backup for honored requests
   before a response is sent to the PCP client.

4.  Typical Usage Examples

   Some examples are provided below for illustration purposes:

   Example 1:  Consider a streaming application that supports check-
      pointing signalling functionality.  Suppose this application is
      installed on three hosts A, B, and C.  For A, streaming is
      critical and interruption is not wanted, while for B it is not
      critical.  For C, only some programs are of interest.  At the
      time of installing this application's software, the corresponding
      preferences can be provisioned.  When the application starts
      streaming:

      *  All the flows associated with the streaming application are
         critical for A.  Limiting the number of flows to be backed up
         will ensure that host doesn't exceed the user's limit.

      *  In case of B, none of these flows are critical for check-
         pointing.  CHECKPOINT_REQUIRED option is not included in the
         PCP requests.

      *  In case of C, the user is invited to interact with the
         application by means of a configuration option that is
         provided to dynamically select which streams to checkpoint,
         based on the user's interest.

   Example 2:  Consider a streaming service offered by a provider.
      Suppose three levels of subscription are offered by that
      provider, e.g., gold, silver, and bronze.  To guarantee a certain
      level of quality of service for each subscription, policies are
      configured such that:

      *  All flows associated with a gold subscription should be check-
         pointed.

      *  Only some flows associated with a silver subscription are
         check-pointed.

      *  None of the flows associated with a bronze subscription are
         check-pointed.

      When a user invokes the streaming service, he/she may fall into
      one of those buckets, and according to the configured policy, his/
      her associated streaming flows are automatically check-pointed.
      Login credentials can be used as a trigger to determine the
      subscription level (and therefore the associated check-pointing
      behavior).

   Example 3:  Consider a VoIP application that is able to request its
      flows to be check-pointed.  No matter what is configured by the
      user, some calls such as emergency calls should be check-pointed.
      The application has to identify such calls.

   Example 4:  In the context of an enterprise network, applications are
      customized by the administrator.  Whether a CHECKPOINT_REQUIRED
      option is to be included is determined by the administrator.
      Only the subset of applications identified by the administrator
      will make use of this option, in conformance with the enterprise
      network management policies.  Any misbehavior can be considered
      as abuse.
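
   As an illustration of Example 2, the subscription-based policy could
   be expressed on the application side as in the sketch below.  The
   tier names come from the example; the flow_is_selected argument is a
   hypothetical stand-in for whatever criterion the provider uses to
   pick the subset of silver flows, which this document does not
   define.

```python
from enum import Enum


class Tier(Enum):
    GOLD = "gold"
    SILVER = "silver"
    BRONZE = "bronze"


def include_checkpoint_option(tier: Tier, flow_is_selected: bool) -> bool:
    """Decide whether a PCP request should carry CHECKPOINT_REQUIRED."""
    if tier is Tier.GOLD:
        return True               # all flows are check-pointed
    if tier is Tier.SILVER:
        return flow_is_selected   # only some flows are check-pointed
    return False                  # bronze: no flow is check-pointed
```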

   To prevent every application from including a CHECKPOINT_REQUIRED
   option in its PCP requests, the following is assumed:

   o  Applications may be delivered with some default settings for
      check-pointing, and these settings should be programmable by the
      end user.

   o  Exposing and enforcing these settings is application-specific.

   o  The end user may customize these settings as needed, based on
      his/her preferences.

5.  Signaling HA for other Network Functions

   In conjunction with NAT, other network functions that maintain state
   for each connection, such as a stateful firewall, may register with
   the PCP server and may be triggered to checkpoint their respective
   state for that connection.

6.  Security Considerations

   PCP-related security considerations are discussed in [RFC6887].

   The CHECKPOINT_REQUIRED option can be used by an attacker to
   identify critical flows.  This issue is mitigated if the network on
   which the PCP messages are sent is fully trusted.  Means to defend
   against attackers who can intercept packets between the PCP server
   and the PCP client should be enabled.  In some deployments, access
   control lists (ACLs) can be installed on the PCP client, the PCP
   server, and the network between them, so that those ACLs allow only
   communications between trusted PCP elements.  If the networking
   environment between the PCP client and the PCP server is not secure,
   means to protect against exposing the content of PCP messages (e.g.,
   DTLS [RFC6347]) are recommended.

   A network device can always override the end-user signalling, i.e.,
   what is signaled by the PCP client, if the instructions are
   conflicting with the network policies.

   There is a risk that every PCP client may wish to checkpoint every
   connection, which can potentially overload the system.
   Administrators SHOULD restrict, on a per-PCP-client basis, the
   number of connections that can be elected to be backed up and the
   rate of check-pointing.
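
   One possible way to enforce both restrictions is sketched below,
   using a per-client counter together with a token bucket.  The token
   bucket is a technique chosen for this illustration, not something
   this document mandates, and all names and parameters are
   hypothetical.  Time is passed in explicitly so that the behavior is
   easy to follow.

```python
from dataclasses import dataclass


@dataclass
class ClientLimiter:
    max_entries: int     # cap on check-pointed entries for this client
    rate_per_sec: float  # sustained check-pointing rate
    burst: float         # maximum number of tokens that can accumulate
    active: int = 0
    tokens: float = 0.0
    last: float = 0.0

    def __post_init__(self) -> None:
        self.tokens = self.burst  # start with a full bucket

    def allow(self, now: float) -> bool:
        """Return True if one more entry may be check-pointed at 'now'."""
        elapsed = now - self.last
        self.tokens = min(self.burst,
                          self.tokens + elapsed * self.rate_per_sec)
        self.last = now
        if self.active >= self.max_entries or self.tokens < 1.0:
            return False
        self.tokens -= 1.0
        self.active += 1
        return True

    def release(self) -> None:
        """Call when a check-pointed entry expires or is deleted."""
        self.active = max(0, self.active - 1)
```

   A request that is refused by such a limiter would simply be
   processed as a regular MAP/PEER request, i.e., without returning the
   CHECKPOINT_REQUIRED option.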

7.  IANA Considerations

   The following PCP Option Code is to be allocated in the
   optional-to-process range (the registry is maintained in
   http://www.iana.org/assignments/pcp-parameters):

      CHECKPOINT_REQUIRED set to TBA (see Section 3.1)

8.  Acknowledgements

   Thanks to Reinaldo Penno, Stuart Cheshire, Dave Thaler, and Prashanth
   Patil for their comments.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6887]  Wing, D., Cheshire, S., Boucadair, M., Penno, R., and P.
              Selkirk, "Port Control Protocol (PCP)", RFC 6887, April
              2013.

9.2.  Informative References

   [RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
              Security Version 1.2", RFC 6347, January 2012.

Authors' Addresses

   Suresh Vinapamula
   Juniper Networks
   1194 North Mathilda Avenue
   Sunnyvale, CA  94089
   USA

   Phone: +1 408 936 5441
   EMail: sureshk@juniper.net

   Senthil Sivakumar
   Cisco Systems
   7100-8 Kit Creek Road
   Research Triangle Park, NC  27760
   USA

   Phone: +1 919 392 5158
   EMail: ssenthil@cisco.com

   Mohamed Boucadair
   France Telecom
   Rennes 35000
   France

   EMail: mohamed.boucadair@orange.com

   Tirumaleswar Reddy
   Cisco Systems, Inc.
   Cessna Business Park, Varthur Hobli
   Sarjapur Marathalli Outer Ring Road
   Bangalore, Karnataka  560103
   India

   EMail: tireddy@cisco.com
