Skip to main content

A Traffic-Based Method of Detecting Dead Internet Key Exchange (IKE) Peers
RFC 3706

Document Type RFC - Informational (February 2004) Errata
Authors Stephane Beaulieu , Geoffrey Huang , Dany Rochefort
Last updated 2015-10-14
RFC stream Internet Engineering Task Force (IETF)
Formats
Additional resources Mailing list discussion
IESG Responsible AD Russ Housley
Send notices to (None)
RFC 3706
Network Working Group                                           G. Huang
Request for Comments: 3706                                   S. Beaulieu
Category: Informational                                     D. Rochefort
                                                     Cisco Systems, Inc.
                                                           February 2004

           A Traffic-Based Method of Detecting Dead Internet
                       Key Exchange (IKE) Peers

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2004).  All Rights Reserved.

Abstract

   This document describes the method detecting a dead Internet Key
   Exchange (IKE) peer that is presently in use by a number of vendors.
   The method, called Dead Peer Detection (DPD) uses IPSec traffic
   patterns to minimize the number of IKE messages that are needed to
   confirm liveness.  DPD, like other keepalive mechanisms, is needed to
   determine when to perform IKE peer failover, and to reclaim lost
   resources.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  2
   2.  Document Roadmap . . . . . . . . . . . . . . . . . . . . . . .  3
   3.  Rationale for Periodic Message Exchange for Proof of
       Liveliness . . . . . . . . . . . . . . . . . . . . . . . . . .  3
   4.  Keepalives vs.  Heartbeats . . . . . . . . . . . . . . . . . .  3
       4.1.  Keepalives . . . . . . . . . . . . . . . . . . . . . . .  3
       4.2.  Heartbeats . . . . . . . . . . . . . . . . . . . . . . .  5
   5.  DPD Protocol . . . . . . . . . . . . . . . . . . . . . . . . .  6
       5.1.  DPD Vendor ID. . . . . . . . . . . . . . . . . . . . . .  7
       5.2.  Message Exchanges. . . . . . . . . . . . . . . . . . . .  7
       5.3.  NOTIFY(R-U-THERE/R-U-THERE-ACK) Message Format . . . . .  8
       5.4.  Impetus for DPD Exchange . . . . . . . . . . . . . . . .  9
       5.5.  Implementation Suggestion. . . . . . . . . . . . . . . .  9
       5.6.  Comparisons. . . . . . . . . . . . . . . . . . . . . . . 10
   6.  Resistance to Replay Attack and False Proof of Liveliness. . . 10
       6.1.  Sequence Number in DPD Messages. . . . . . . . . . . . . 10

Huang, et al.                Informational                      [Page 1]
RFC 3706                Detecting Dead IKE Peers           February 2004

       6.2.  Selection and Maintenance of Sequence Numbers. . . . . . 11
   7.  Security Considerations. . . . . . . . . . . . . . . . . . . . 11
   8.  IANA Considerations. . . . . . . . . . . . . . . . . . . . . . 12
   9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 12
       9.1.  Normative Reference. . . . . . . . . . . . . . . . . . . 12
       9.2.  Informative References . . . . . . . . . . . . . . . . . 12
   10. Editors' Addresses . . . . . . . . . . . . . . . . . . . . . . 12
   11. Full Copyright Statement . . . . . . . . . . . . . . . . . . . 13

1.  Introduction

   When two peers communicate with IKE [2] and IPSec [3], the situation
   may arise in which connectivity between the two goes down
   unexpectedly.  This situation can arise because of routing problems,
   one host rebooting, etc., and in such cases, there is often no way
   for IKE and IPSec to identify the loss of peer connectivity.  As
   such, the SAs can remain until their lifetimes naturally expire,
   resulting in a "black hole" situation where packets are tunneled to
   oblivion.  It is often desirable to recognize black holes as soon as
   possible so that an entity can failover to a different peer quickly.
   Likewise, it is sometimes necessary to detect black holes to recover
   lost resources.

   This problem of detecting a dead IKE peer has been addressed by
   proposals that require sending periodic HELLO/ACK messages to prove
   liveliness.  These schemes tend to be unidirectional (a HELLO only)
   or bidirectional (a HELLO/ACK pair).  For the purpose of this
   document, the term "heartbeat", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in RFC
   2119 [RFC2119].

Maenpaa & Camarillo     Expires December 28, 2014               [Page 3]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   This document uses terminology and definitions from the RELOAD base
   specification [RFC6940].

   numBitsInNodeId:  Specifies the number of bits in a RELOAD Node-ID.

   DHT:  Distributed Hash Tables (DHTs) are a class of decentralized
      distributed systems that provide a lookup service similar to a
      regular hash table.  Given a key, any peer participating in the
      system can retrieve the value associated with that key.  The
      responsibility for maintaining the mapping from keys to values is
      distributed among the peers.

   Chord Ring:  The Chord DHT uses ring topology and orders identifiers
      on an identifier circle of size 2^numBitsInNodeId.  This
      identifier circle is called the Chord ring.  On the Chord ring,
      the responsibility for a key k is assigned to the node whose
      identifier equals to or immediately follows k.

   Finger Table:  A data structure with up to (but typically less than)
      numBitsInNodeId entries maintained by each peer in a Chord-based
      overlay.  The ith entry in the finger table of peer n contains the
      identity of the first peer that succeeds n by at least
      2^(numBitsInNodeId-i) on the Chord ring.  This peer is called the
      ith finger of peer n.  As an example, the first entry in the
      finger table of peer n contains a peer half-way around the Chord
      ring from peer n.  The purpose of the finger table is to
      accelerate lookups.

   n.id:  An abbreviation that is in this document used refer to the
      Node-ID of peer n.

   O(g(n)):  Informally, saying that some equation f(n) = O(g(n)) means
      that f(n) is less than some constant multiple of g(n).  For the
      formal definition, please refer to [weiss1998].

Maenpaa & Camarillo     Expires December 28, 2014               [Page 4]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   Omega(g(n)):  Informally, saying that some equation f(n) =
      Omega(g(n)) means that f(n) is more than some constant multiple of
      g(n).  For the formal definition, please refer to [weiss1998]

   Percentile:  The Pth (0<=P<=100) percentile of N values arranged in
      ascending order is obtained by first calculating the (ordinal)
      rank n=(P/100)*N, rounding the result to the nearest integer, and
      then taking the value corresponding to that rank.

   Predecessor List:  A data structure containing the first r
      predecessors of a peer on the Chord ring.

   Successor List:  A data structure containing the first r successors
      of a peer on the Chord ring.

   Neighborhood Set:  A term used to refer to the set of peers included
      in the successor and predecessor lists of a given peer.

   Routing Table:  Contents of a given peer's routing table include the
      set of peers that the peer can use to route overlay messages.  The
      routing table is made up of the finger table, successor list and
      predecessor list.

3.  Introduction to Stabilization in DHTs

   DHTs use stabilization routines to counter the undesirable effects of
   churn on routing.  The purpose of stabilization is to keep the
   routing information of each peer in the overlay consistent with the
   constantly changing overlay topology.  There are two alternative
   approaches to stabilization: periodic and reactive [rhea2004].
   Periodic stabilization can either use a fixed stabilization rate or
   calculate the stabilization rate in an adaptive fashion.

Maenpaa & Camarillo     Expires December 28, 2014               [Page 5]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

3.1.  Reactive vs. Periodic Stabilization

   In reactive stabilization, a peer reacts to the loss of a peer in its
   neighborhood set or to the appearance of a new peer that should be
   added to its neighborhood set by sending a copy of its neighbor table
   to all peers in the neighborhood set.  Periodic recovery, in
   contrast, takes place independently of changes in the neighborhood
   set.  In periodic recovery, a peer periodically shares its
   neighborhood set with each or a subset of the members of that set.

   The chord-reload algorithm [RFC6940] supports both reactive and
   periodic stabilization.  It has been shown in [rhea2004] that
   reactive stabilization works well for small neighborhood sets (i.e.,
   small overlays) and moderate churn.  However, in large-scale (e.g.,
   1000 peers or more [rhea2004]) or high-churn overlays, reactive
   stabilization runs the risk of creating a positive feedback cycle,
   which can eventually result in congestion collapse.  In [rhea2004],
   it is shown that a 1000-peer overlay under churn uses significantly
   less bandwidth and has lower latencies when periodic stabilization is
   used than when reactive stabilization is used.  Although in the
   experiments carried out in [rhea2004], reactive stabilization
   performed well when there was no churn, its bandwidth use was
   observed to jump dramatically under churn.  At higher churn rates and
   larger scale overlays, periodic stabilization uses less bandwidth and
   the resulting lower contention for the network leads to lower
   latencies.  For this reason, most DHTs such as CAN [CAN], Chord
   [Chord], Pastry [Pastry], Bamboo [rhea2004], etc. use periodic
   stabilization [ghinita2006].  As an example, the first version of
   Bamboo used reactive stabilization, which caused Bamboo to suffer
   from degradation in performance under churn.  To fix this problem,
   Bamboo was modified to use periodic stabilization.

   In Chord, periodic stabilization is typically done both for
   successors and fingers.  An alternative strategy is analyzed in
   [krishnamurthy2008].  In this strategy, called the correction-on-
   change maintenance strategy, a peer periodically stabilizes its
   successors but does not do so for its fingers.  Instead, finger
   pointers are stabilized in a reactive fashion.  The results obtained
   in [krishnamurthy2008] imply that although the correction-on-change
   strategy works well when churn is low, periodic stabilization
   outperforms the correction-on-change strategy when churn is high.

3.2.  Configuring Periodic Stabilization

   When periodic stabilization is used, one faces the problem of
   selecting an appropriate execution rate for the stabilization
   procedure.  If the execution rate of periodic stabilization is high,
   changes in the system can be quickly detected, but at the

Maenpaa & Camarillo     Expires December 28, 2014               [Page 6]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   disadvantage of increased communication overhead.  Alternatively, if
   the stabilization rate is low and the churn rate is high, routing
   tables become inaccurate and DHT performance deteriorates.  Thus, the
   problem is setting the parameters so that the overlay achieves the
   desired reliability and performance even in challenging conditions,
   such as under heavy churn.  This naturally results in high cost
   during periods when the churn level is lower than expected, or
   alternatively, poor performance or even network partitioning in worse
   than expected conditions.

   In addition to selecting an appropriate stabilization interval,
   regardless of whether periodic stabilization is used or not, an
   appropriate size needs to be selected for the neighborhood set and
   for the finger table.

   The current approach is to configure overlays statically.  This works
   in situations where perfect information about the future is
   available.  In situations where the operating conditions of the
   network are known in advance and remain static throughout the
   lifetime of the system, it is possible to choose fixed optimal values
   for parameters such as stabilization rate, neighborhood set size and
   routing table size.  However, if the operating conditions (e.g., the
   size of the overlay and its churn rate) do not remain static but
   evolve with time, it is not possible to achieve both a low lookup
   failure rate and a low communication overhead by using fixed
   parameters [ghinita2006].

   As an example, to configure the Chord DHT algorithm, one needs to
   select values for the following parameters: size of successor list,
   stabilization interval, and size of the finger table.  To select an
   appropriate value for the stabilization interval, one needs to know
   the expected churn rate and overlay size.  According to
   [liben-nowell2002], a Chord network in a ring-like state remains in a
   ring-like state as long as peers send Omega(square(log(N))) messages
   before N new peers join or N/2 peers fail.  Thus, in a 500-peer
   overlay churning at a rate such that one peer joins and one peer
   leaves the network every 30 seconds, an appropriate stabilization
   interval would be on the order of 93s.  According to [Chord], the
   size of the successor list and finger table should be on the order of
   log(N).  Already a successor list of a modest size (e.g., log2(N) or
   2*log2(N), which is the successor list size used in [Chord]) makes it
   very unlikely that a peer will lose all of its successors, which
   would cause the Chord ring to become disconnected.  Thus, in a
   500-peer network each peer should maintain on the order of nine
   successors and fingers.  However, if the churn rate doubles and the
   network size remains unchanged, the stabilization rate should double
   as well.  That is, the appropriate maintenance interval would now be
   on the order of 46s.  On the other hand, if the churn rate becomes

Maenpaa & Camarillo     Expires December 28, 2014               [Page 7]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   quot; will refer to a unidirectional message
   to prove liveliness.  Likewise, the term "keepalive" will refer to a
   bidirectional message.

   The problem with current heartbeat and keepalive proposals is their
   reliance upon their messages to be sent at regular intervals.  In the
   implementation, this translates into managing some timer to service
   these message intervals.  Similarly, because rapid detection of the
   dead peer is often desired, these messages must be sent with some
   frequency, again translating into considerable overhead for message
   processing.  In implementations and installations where managing
   large numbers of simultaneous IKE sessions is of concern, these
   regular heartbeats/keepalives prove to be infeasible.

   To this end, a number of vendors have implemented their own approach
   to detect peer liveliness without needing to send messages at regular
   intervals.  This informational document describes the current
   practice of those implementations.  This scheme, called Dead Peer
   Detection (DPD), relies on IKE Notify messages to query the
   liveliness of an IKE peer.

Huang, et al.                Informational                      [Page 2]
RFC 3706                Detecting Dead IKE Peers           February 2004

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [1].

2.  Document Roadmap

   As mentioned above, there are already proposed solutions to the
   problem of detecting dead peers.  Section 3 elaborates the rationale
   for using an IKE message exchange to query a peer's liveliness.
   Section 4 examines a keepalives-based approach as well as a
   heartbeats-based approach.  Section 5 presents the DPD proposal
   fully, highlighting differences between DPD and the schemes presented
   in Section 4 and emphasizing scalability issues.  Section 6 examines
   security issues surrounding replayed messages and false liveliness.

3.  Rationale for Periodic Message Exchange for Proof of Liveliness

   As the introduction mentioned, it is often necessary to detect that a
   peer is unreachable as soon as possible.  IKE provides no way for
   this to occur -- aside from waiting until the rekey period, then
   attempting (and failing the rekey).  This would result in a period of
   loss connectivity lasting the remainder of the lifetime of the
   security association (SA), and in most deployments, this is
   unacceptable.  As such, a method is needed for checking up on a
   peer's state at will.  Different methods have arisen, usually using
   an IKE Notify to query the peer's liveliness.  These methods rely on
   either a bidirectional "keepalive" message exchange (a HELLO followed
   by an ACK), or a unidirectional "heartbeat" message exchange (a HELLO
   only).  The next section considers both of these schemes.

4.  Keepalives vs. Heartbeats

4.1.  Keepalives:

   Consider a keepalives scheme in which peer A and peer B require
   regular acknowledgements of each other's liveliness.  The messages
   are exchanged by means of an authenticated notify payload.  The two
   peers must agree upon the interval at which keepalives are sent,
   meaning that some negotiation is required during Phase 1.  For any
   prompt failover to be possible, the keepalives must also be sent at
   rather frequent intervals -- around 10 seconds or so.  In this
   hypothetical keepalives scenario, peers A and B agree to exchange
   keepalives every 10 seconds.  Essentially, every 10 seconds, one peer
   must send a HELLO to the other.  This HELLO serves as proof of
   liveliness for the sending entity.  In turn, the other peer must
   acknowledge each keepalive HELLO.  If the 10 seconds elapse, and one
   side has not received a HELLO, it will send the HELLO message itself,
   using the peer's ACK as proof of liveliness.  Receipt of either a

Huang, et al.                Informational                      [Page 3]
RFC 3706                Detecting Dead IKE Peers           February 2004

   HELLO or ACK causes an entity's keepalive timer to reset. Failure to
   receive an ACK in a certain period of time signals an error.  A
   clarification is presented below:

   Scenario 1:
   Peer A's 10-second timer elapses first, and it sends a HELLO to B.
   B responds with an ACK.

   Peer A:                              Peer B:
   10 second timer fires;  ------>
   wants to know that B is alive;
   sends HELLO.
                                      Receives HELLO; acknowledges
                                      A's liveliness;
                            <------   resets keepalive timer, sends
                                      ACK.
   Receives ACK as proof of
   B's liveliness; resets timer.

   Scenario 2:
   Peer A's 10-second timer elapses first, and it sends a HELLO to B.
   B fails to respond.  A can retransmit, in case its initial HELLO is
   lost.  This situation describes how peer A detects its peer is dead.

   Peer A:                              Peer B (dead):

   10 second timer fires;  ------X
   wants to know that B is
   alive; sends HELLO.

   Retransmission timer    ------X
   expires; initial message
   could have been lost in
   transit; A increments
   error counter and
   sends another HELLO.

   ---

   After some number of errors, A assumes B is dead; deletes SAs and
   possibly initiates failover.

   An advantage of this scheme is that the party interested in the other
   peer's liveliness begins the message exchange.  In Scenario 1, peer A
   is interested in peer B's liveliness, and peer A consequently sends

Huang, et al.                Informational                      [Page 4]
RFC 3706                Detecting Dead IKE Peers           February 2004

   the HELLO.  It is conceivable in such a scheme that peer B would
   never be interested in peer A's liveliness.  In such a case, the onus
   would always lie on peer A to initiate the exchange.

4.2.  Heartbeats:

   By contrast, consider a proof-of-liveliness scheme involving
   unidirectional (unacknowledged) messages.  An entity interested in
   its peer's liveliness would rely on the peer itself to send periodic
   messages demonstrating liveliness.  In such a scheme, the message
   exchange might look like this:

   Scenario 3: Peer A and Peer B are interested in each other's
   liveliness.  Each peer depends on the other to send periodic HELLOs.

   Peer A:                              Peer B:
   10 second timer fires;  ------>
   sends HELLO.  Timer also
   signals expectation of
   B's HELLO.
                                         Receives HELLO as proof of A's
                                         liveliness.

                               <------   10 second timer fires; sends
                                         HELLO.
   Receives HELLO as proof
   of B's liveliness.

   Scenario 4:
   Peer A fails to receive HELLO from B and marks the peer dead.  This
   is how an entity detects its peer is dead.

   Peer A:                              Peer B (dead):
   10 second timer fires;  ------X
   sends HELLO.  Timer also
   signals expectation of
   B's HELLO.

   ---

   Some time passes and A assumes B is dead.

   The disadvantage of this scheme is the reliance upon the peer to
   demonstrate liveliness.  To this end, peer B might never be
   interested in peer A&e.g. six-fold and the size of the network grows to 2000 peers, on the
   order of eleven fingers and successors should be maintained and the
   stabilization interval should be on the order of 42s.  If one
   continued using the old values, this could result in inaccurate
   routing tables, network partitioning, and deteriorating performance.

3.3.  Adaptive Stabilization

   A self-tuning DHT takes into consideration the continuous evolution
   of network conditions and adapts to them.  In a self-tuning DHT, each
   peer collects statistical data about the network and dynamically
   adjusts its stabilization rate, neighborhood set size, and finger
   table size based on the analysis of the data [ghinita2006].
   Reference [mahajan2003] shows that by using self-tuning, it is
   possible to achieve high reliability and performance even in adverse
   conditions with low maintenance cost.  Adaptive stabilization has
   been shown to outperform periodic stabilization in terms of both
   lookup failures and communication overhead [ghinita2006].

4.  Introduction to Chord

   Chord [Chord] is a structured P2P algorithm that uses consistent
   hashing to build a DHT out of several independent peers.  Consistent
   hashing assigns each peer and resource a fixed-length identifier.
   Peers use SHA-1 as the base hash fuction to generate the identifiers.
   As specified in RELOAD base, the length of the identifiers is
   numBitsInNodeId=128 bits.  The identifiers are ordered on an
   identifier circle of size 2^numBitsInNodeId.  On the identifier
   circle, key k is assigned to the first peer whose identifier equals
   or follows the identifier of k in the identifier space.  The
   identifier circle is called the Chord ring.

   Different DHTs differ significantly in performance when bandwidth is
   limited.  It has been shown that when compared to other DHTs, the
   advantages of Chord include that it uses bandwidth efficiently and
   can achieve low lookup latencies at little cost [li2004].

   A simple lookup mechanism could be implemented on a Chord ring by
   requiring each peer to only know how to contact its current successor
   on the identifier circle.  Queries for a given identifier could then
   be passed around the circle via the successor pointers until they
   encounter the first peer whose identifier is equal to or larger than
   the desired identifier.  Such a lookup scheme uses a number of
   messages that grows linearly with the number of peers.  To reduce the
   cost of lookups, Chord maintains also additional routing information;
   each peer n maintains a data structure with up to numBitsInNodeId
   entries, called the finger table.  The first entry in the finger
   table of peer n contains the peer half-way around the ring from peer

Maenpaa & Camarillo     Expires December 28, 2014               [Page 8]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   n.  The second entry contains the peer that is 1/4th of the way
   around, the third entry the peer that is 1/8th of the way around,
   etc.  In other words, the ith entry in the finger table at peer n
   contains the identity of the first peer s that succeeds n by at least
   2^(numBitsInNodeId-i) on the Chord ring.  This peer is called the ith
   finger of peer n.  The interval between two consecutive fingers is
   called a finger interval.  The ith finger interval of peer n covers
   the range [n.id + 2^(numBitsInNodeId-i), n.id + 2^(numBitsInNodeId-
   i+1)) on the Chord ring.  In an N-peer network, each peer maintains
   information about O(log(N)) other peers in its finger table.  As an
   example, if N=100000, it is sufficient to maintain 17 fingers.

   Chord needs all peers' successor pointers to be up to date in order
   to ensure that lookups produce correct results as the set of
   participating peers changes.  To achieve this, peers run a
   stabilization protocol periodically in the background.  The
   stabilization protocol of the original Chord algorithm uses two
   operations: successor stabilization and finger stabilization.
   However, the Chord algorithm of RELOAD base defines two additional
   stabilization components, as will be discussed below.

   To increase robustness in the event of peer failures, each Chord peer
   maintains a successor list of size r, containing the peer's first r
   successors.  The benefit of successor lists is that if each peer
   fails independently with probability p, the probability that all r
   successors fail simultaneously is only p^r.

   The original Chord algorithm maintains only a single predecessor
   pointer.  However, multiple predecessor pointers (i.e., a predecessor
   list) can be maintained to speed up recovery from predecessor
   failures.  The routing table of a peer consists of the successor
   list, finger table, and predecessor list.

5.  Extending Chord-reload to Support Self-tuning

   This section describes how the mandatory-to-implement chord-reload
   algorithm defined in RELOAD base [RFC6940] can be extended to support
   self-tuning.

   The chord-reload algorithm supports both reactive and periodic
   recovery strategies.  When the self-tuning mechanisms defined in this
   document are used, the periodic recovery strategy is used.  Further,
   chord-reload specifies that at least three predecessors and three
   successors need to be maintained.  When the self-tuning mechanisms
   are used, the appropriate sizes of the successor list and predecessor
   list are determined in an adaptive fashion based on the estimated
   network size, as will be described in Section 6.

Maenpaa & Camarillo     Expires December 28, 2014               [Page 9]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   As specified in RELOAD base, each peer maintains a stabilization
   timer.  When the stabilization timer fires, the peer restarts the
   timer and carries out the overlay stabilization routine.  Overlay
   stabilization has four components in chord-reload:

   1.  Update the neighbor table.  We refer to this as neighbor
       stabilization.

   2.  Refreshing the finger table.  We refer to this as finger
       stabilization.

   3.  Adjusting finger table size.

   4.  Detecting partitioning.  We refer to this as strong
       stabilization.

   As specified in RELOAD base [RFC6940], a peer sends periodic messages
   as part of the neighbor stabilization, finger stabilization, and
   strong stabilization routines.  In neighbor stabilization, a peer
   periodically sends an Update request to every peer in its Connection
   Table.  The default time is every ten minutes.  In finger
   stabilization, a peer periodically searches for new peers to include
   in its finger table.  This time defaults to one hour.  This document
   specifies how the neighbor stabilization and finger stabilization
   intervals can be determined in an adaptive fashion based on the
   operating conditions of the overlay.  The subsections below describe
   how this document extends the four components of stabilization.

5.1.  Update Requests

   As described in RELOAD base [RFC6940], the neighbor and finger
   stabilization procedures are implemented using Update requests.
   RELOAD base defines three types of Update requests: 'peer_ready',
   'neighbors', and 'full'.  Regardless of the type, all Update requests
   include an 'uptime' field.  The self-tuning extensions require
   information on the uptimes of peers in the routing table.  The sender
   of an Update request includes its current uptime (in seconds) in the
   'uptime' field.  Regardless of the type, all Update requests MUST
   include an 'uptime' field.

   When self-tuning is used, each peer decides independently the
   appropriate size for the successor list, predecessor list and finger
   table.  Thus, the 'predecessors', 'successors', and 'fingers' fields
   included in RELOAD Update requests are of variable length.  As
   specified in RELOAD [RFC6940], variable length fields are on the wire
   preceded by length bytes.  In the case of the successor list,
   predecessor list, and finger table, there are two length bytes
   (allowing lengths up to 2^16-1).  The number of NodeId structures

Maenpaa & Camarillo     Expires December 28, 2014              [Page 10]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   included in each field can be calculated based on the length bytes
   since the size of a single NodeId structure is 16 bytes.  If a peer
   receives more entries than fit into its successor list, predecessor
   list or finger table, the peer MUST ignore the extra entries.  A peer
   may also receive less entries than it currently has in its own data
   structure.  In that case, it uses the received entries to update only
   a subset of the entries in its data structure.  As an example, a peer
   that has a successor list of size 8 may receive a successor list of
   size 4 from its immediate successor.  In that case, the received
   successor list can only be used to update the first few successors on
   the peer's successor list.  The rest of the successors will remain
   intact.

5.2.  Neighbor Stabilization

   In the neighbor stabilization operation of chord-reload, a peer
   periodically sends an Update request to every peer in its Connection
   Table.  In a small, low-churn overlay, the amount of traffic this
   process generates is typically acceptable.  However, in a large-scale
   overlay churning at a moderate or high churn rate, the traffic load
   may no longer be acceptable since the size of the connection table is
   large and the stabilization interval relatively short.  The self-
   tuning mechanisms described in this document are especially designed
   for overlays of the latter type.  Therefore, when the self-tuning
   mechanisms are used, each peer only sends a periodic Update request
   to its first predecessor and first successor on the Chord ring; it
   MUST NOT send Update requests to others.

   The neighbor stabilization routine is executed when the stabilization
   timer fires.  To begin the neighbor stabilization routine, a peer
   sends an Update request to its first successor and its first
   predecessor.  The type of the Update request MUST be 'neighbors'.
   The Update request includes the successor and predecessor lists of
   the sender.  If a peer receiving such an Update request learns from
   the predecessor and successor lists included in the request that new
   peers can be included in its neighborhood set, it sends Attach
   requests to the new peers.

   After a new peer has been added to the predecessor or successor list,
   an Update request of type 'peer_ready' is sent to the new peer.  This
   allows the new peer to insert the sender into its neighborhood set.

5.3.  Finger Stabilization

   Chord-reload specifies two alternative methods for searching for new
   peers to the finger table.  Both of the alternatives can be used with
   the self-tuning extensions defined in this document.

Maenpaa & Camarillo     Expires December 28, 2014              [Page 11]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   Immediately after a new peer has been added to the finger table, a
   Probe request is sent to the new peer to fetch its uptime.  The
   requested_info field of the Probe request MUST be set to contain the
   ProbeInformationType 'uptime's liveliness.  Nonetheless, if A is interested
   B's liveliness, B must be aware of this, and maintain the necessary
   state information to send periodic HELLOs to A.  The disadvantage of

Huang, et al.                Informational                      [Page 5]
RFC 3706                Detecting Dead IKE Peers           February 2004

   such a scheme becomes clear in the remote-access scenario.  Consider
   a VPN aggregator that terminates a large number of sessions (on the
   order of 50,000 peers or so).  Each peer requires fairly rapid
   failover, therefore requiring the aggregator to send HELLO packets
   every 10 seconds or so.  Such a scheme simply lacks scalability, as
   the aggregator must send 50,000 messages every few seconds.

   In both of these schemes (keepalives and heartbeats), some
   negotiation of message interval must occur, so that each entity can
   know how often its peer expects a HELLO.  This immediately adds a
   degree of complexity.  Similarly, the need to send periodic messages
   (regardless of other IPSec/IKE activity), also increases
   computational overhead to the system.

5.  DPD Protocol

   DPD addresses the shortcomings of IKE keepalives- and heartbeats-
   schemes by introducing a more reasonable logic governing message
   exchange.  Essentially, keepalives and heartbeats mandate exchange of
   HELLOs at regular intervals.  By contrast, with DPD, each peer's DPD
   state is largely independent of the other's.  A peer is free to
   request proof of liveliness when it needs it -- not at mandated
   intervals.  This asynchronous property of DPD exchanges allows fewer
   messages to be sent, and this is how DPD achieves greater
   scalability.

   As an elaboration, consider two DPD peers A and B.  If there is
   ongoing valid IPSec traffic between the two, there is little need for
   proof of liveliness.  The IPSec traffic itself serves as the proof of
   liveliness.  If, on the other hand, a period of time lapses during
   which no packet exchange occurs, the liveliness of each peer is
   questionable.  Knowledge of the peer's liveliness, however, is only
   urgently necessary if there is traffic to be sent.  For example, if
   peer A has some IPSec packets to send after the period of idleness,
   it will need to know if peer B is still alive.  At this point, peer A
   can initiate the DPD exchange.

   To this end, each peer may have different requirements for detecting
   proof of liveliness.  Peer A, for example, may require rapid
   failover, whereas peer B's requirements for resource cleanup are less
   urgent.  In DPD, each peer can define its own "worry metric" - an
   interval that defines the urgency of the DPD exchange. Continuing the
   example, peer A might define its DPD interval to be 10 seconds.
   Then, if peer A sends outbound IPSec traffic, but fails to receive
   any inbound traffic for 10 seconds, it can initiate a DPD exchange.

Huang, et al.                Informational                      [Page 6]
RFC 3706                Detecting Dead IKE Peers           February 2004

   Peer B, on the other hand, defines its less urgent DPD interval to be
   5 minutes.  If the IPSec session is idle for 5 minutes, peer B can
   initiate a DPD exchange the next time it sends IPSec packets to A.

   It is important to note that the decision about when to initiate a
   DPD exchange is implementation specific.  An implementation might
   even define the DPD messages to be at regular intervals following
   idle periods.  See section 5.5 for more implementation suggestions.

5.1.  DPD Vendor ID

   To demonstrate DPD capability, an entity must send the DPD vendor ID.
   Both peers of an IKE session MUST send the DPD vendor ID before DPD
   exchanges can begin.  The format of the DPD Vendor ID is:

                                     1
                0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                !                           !M!M!
                !      HASHED_VENDOR_ID     !J!N!
                !                           !R!R!
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   where HASHED_VENDOR_ID = {0xAF, 0xCA, 0xD7, 0x13, 0x68, 0xA1, 0xF1,
   0xC9, 0x6B, 0x86, 0x96, 0xFC, 0x77, 0x57}, and MJR and MNR correspond
   to the current major and minor version of this protocol (1 and 0
   respectively).  An IKE peer MUST send the Vendor ID if it wishes to
   take part in DPD exchanges.

5.2.  Message Exchanges

   The DPD exchange is a bidirectional (HELLO/ACK) Notify message.  The
   exchange is defined as:

            Sender                                      Responder
           --------                                    -----------
   HDR*, NOTIFY(R-U-THERE), HASH   ------>

                                 <------    HDR*, NOTIFY(R-U-THERE-
                                            ACK), HASH

Huang, et al.                Informational                      [Page 7]
RFC 3706                Detecting Dead IKE Peers           February 2004

   The R-U-THERE message corresponds to a "HELLO" and the R-U-THERE-ACK
   corresponds to an "ACK."  Both messages are simply ISAKMP Notify
   payloads, and as such, this document defines these two new ISAKMP
   Notify message types:

      Notify                      Message Value
      R-U-THERE                   36136
      R-U-THERE-ACK               36137

   An entity that has sent the DPD Vendor ID MUST respond to an R-U-
   THERE query.  Furthermore, an entity MUST reject unencrypted R-U-
   THERE and R-U-THERE-ACK messages.

5.3.  NOTIFY(R-U-THERE/R-U-THERE-ACK) Message Format

   When sent, the R-U-THERE message MUST take the following form:

                       1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ! Next Payload  !   RESERVED    !         Payload Length        !
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   !              Domain of Interpretation  (DOI)                  !
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   !  Protocol-ID  !    SPI Size   !      Notify Message Type      !
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   !                                                               !
   ~                Security Parameter Index (SPI)                 ~
   !                                                               !
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   !                    Notification Data                          !
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   As this message is an ISAKMP NOTIFY, the Next Payload, RESERVED, and
   Payload Length fields should be set accordingly.  The remaining
   fields are set as:

   -  Domain of Interpretation (4 octets) - SHOULD be set to IPSEC-DOI.

   -  Protocol ID (1 octet) - MUST be set to the protocol ID for ISAKMP.

   -  SPI Size (1 octet) - SHOULD be set to sixteen (16), the length of
      two octet-sized ISAKMP cookies.

   -  Notify Message Type (2 octets) - MUST be set to R-U-THERE

Huang, et al.                Informational                      [Page 8]
RFC 3706                Detecting Dead IKE Peers           February 2004

   -  Security Parameter Index (16 octets) - SHOULD be set to the
      cookies of the Initiator and Responder of the IKE SA (in that
      order)

   -  Notification Data (4 octets) - MUST be set to the sequence number
      corresponding to this message

   The format of the R-U-THERE-ACK message is the same, with the
   exception that the Notify Message Type MUST be set to R-U-THERE-ACK.
   Again, the Notification Data MUST be sent to the sequence number
   corresponding to the received R-U-THERE message.

5.4.  Impetus for DPD Exchange

   Again, rather than relying on some negotiated time interval to force
   the exchange of messages, DPD does not mandate the exchange of R-U-
   THERE messages at any time.  Instead, an IKE peer SHOULD send an R-
   U-THERE query to its peer only if it is interested in the liveliness
   of this peer.  To this end, if traffic is regularly exchanged between
   two peers, either peer SHOULD use this traffic as proof of
   liveliness, and both peers SHOULD NOT initiate a DPD exchange.

   A peer MUST keep track of the state of a given DPD exchange.  That
   is, once it has sent an R-U-THERE query, it expects an ACK in
   response within some implementation-defined period of time.  An
   implementation SHOULD retransmit R-U-THERE queries when it fails to
   receive an ACK.  After some number of retransmitted messages, an
   implementation SHOULD assume its peer to be unreachable and delete
   IPSec and IKE SAs to the peer.

5.5.  Implementation Suggestion

   Since the liveliness of a peer is only questionable when no traffic
   is exchanged, a viable implementation might begin by monitoring
   idleness.  Along these lines, a peer' defined in RELOAD base [RFC6940].

5.4.  Adjusting Finger Table Size

   The chord-reload algorithm defines how a peer can make sure that the
   finger table is appropriately sized to allow for efficient routing.
   Since the self-tuning mechanisms specified in this document produce a
   network size estimate, this estimate can be directly used to
   calculate the optimal size for the finger table.  This mechanism is
   used instead of the one specified by chord-reload.  A peer uses the
   network size estimate to determine whether it needs to adjust the
   size of its finger table each time when the stabilization timer
   fires.  The way this is done is explained in Section 6.2.

5.5.  Detecting Partitioning

   This document does not require any changes to the mechanism chord-
   reload uses to detect network partitioning.

5.6.  Leaving the Overlay

   As specified in RELOAD base [RFC6940], a leaving peer SHOULD send a
   Leave request to all members of its neighbor table prior to leaving
   the overlay.  The overlay_specific_data field MUST contain the
   ChordLeaveData structure.  The Leave requests that are sent to
   successors contain the predecessor list of the leaving peer.  The
   Leave requests that are sent to the predecessors contain the
   successor list of the leaving peer.  If a given successor can
   identify better predecessors (that is, predecessors that are closer
   to it on the Chord ring than its existing predecessors) than are
   already included in its predecessor lists by investigating the
   predecessor list it receives from the leaving peer, it sends Attach
   requests to them.  Similarly, if a given predecessor identifies
   better successors by investigating the successor list it receives
   from the leaving peer, it sends Attach requests to them.

6.  Self-tuning Chord Parameters

   This section specifies how to determine an appropriate stabilization
   rate and routing table size in an adaptive fashion.  The proposed
   mechanism is based on [mahajan2003], [liben-nowell2002], and
   [ghinita2006].  To calculate an appropriate stabilization rate, the
   values of three parameters must be estimated: overlay size N, failure
   rate U, and join rate L.  To calculate an appropriate routing table
   size, the estimated network size N can be used.  Peers in the overlay

Maenpaa & Camarillo     Expires December 28, 2014              [Page 12]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   MUST re-calculate the values of the parameters to self-tune the
   chord-reload algorithm at the end of each stabilization period before
   re-starting the stabilization timer.

6.1.  Estimating Overlay Size

   Techniques for estimating the size of an overlay network have been
   proposed for instance in [mahajan2003], [horowitz2003],
   [kostoulas2005], [binzenhofer2006], and [ghinita2006].  In Chord, the
   density of peer identifiers in the neighborhood set can be used to
   produce an estimate of the size of the overlay, N [mahajan2003].
   Since peer identifiers are picked randomly with uniform probability
   from the numBitsInNodeId-bit identifier space, the average distance
   between peer identifiers in the successor set is
   (2^numBitsInNodeId)/N.

   To estimate the overlay network size, a peer computes the average
   inter-peer distance d between the successive peers starting from the
   most distant predecessor and ending to the most distant successor in
   the successor list.  The estimated network size is calculated as:

                         2^numBitsInNodeId
                    N = -------------------
                                d

   This estimate has been found to be accurate within 15% of the real
   network size [ghinita2006].  Of course, the size of the neighborhood
   set affects the accuracy of the estimate.

   During the join process, a joining peer fills its routing table by
   sending a series of Ping and Attach requests, as specified in RELOAD
   base [RFC6940].  Thus, a joining peer immediately has enough
   information at its disposal to calculate an estimate of the network
   size.

6.2.  Determining Routing Table Size

   As specified in RELOAD base, the finger table must contain at least
   16 entries.  When the self-tuning mechanisms are used, the size of
   the finger table MUST be set to max(ceiling(log2(N)), 16) using the
   estimated network size N.

   The size of the successor list MUST be set to a maximum of
   ceiling(log2(N)).  An implementation can place a lower limit on the
   size of the successor list.  As an example, the implementation might
   require the size of the successor list to be always at least three.

   The size of the predecessor list MUST be set to ceiling(log2(N)).

Maenpaa & Camarillo     Expires December 28, 2014              [Page 13]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

6.3.  Estimating Failure Rate

   A typical approach is to assume that peers join the overlay according
   to a Poisson process with rate L and leave according to a Poisson
   process with rate parameter U [mahajan2003].  The value of U can be
   estimated using peer failures in the finger table and neighborhood
   set [mahajan2003].  If peers fail with rate U, a peer with M unique
   peer identifiers in its routing table should observe K failures in
   time K/(M*U).  Every peer in the overlay maintains a history of the
   last K failures.  The current time is inserted into the history when
   the peer joins the overlay.  The estimate of U is calculated as:

                             k
                     U = --------,
                          M * Tk

   where M is the number of unique peer identifiers in the routing
   table, Tk is the time between the first and the last failure in the
   history, and k is the number of failures in the history.  If k is
   smaller than K, the estimate is computed as if there was a failure at
   the current time.  It has been shown that an estimate calculated in a
   similar manner is accurate within 17% of the real value of U
   [ghinita2006].

   The size of the failure history K affects the accuracy of the
   estimate of U.  One can increase the accuracy by increasing K.
   However, this has the side effect of decreasing responsiveness to
   changes in the failure rate.  On the other hand, a small history size
   may cause a peer to overreact each time a new failure occurs.  In
   [ghinita2006], K is set to 25% of the routing table size.  Use of
   this value is RECOMMENDED.

6.3.1.  Detecting Failures

   A new failure is inserted to the failure history in the following
   cases:

   1.  A Leave request is received from a neigbhor.

   2.  A peer fails to reply to a Ping request sent in the situation
       explained below.  If no packets have been received on a
       connection during the past 2*Tr seconds (where Tr is the
       inactivity timer defined by ICE [RFC5245]), a RELOAD Ping request
       MUST be sent to the remote peer.  RELOAD mandates the use of STUN
       [RFC5389] for keepalives.  STUN keepalives take the form of STUN
       Binding Indication transactions.  As specified in ICE [RFC5245],
       a peer sends a STUN Binding Indication if there has been no
       packet sent on a connection for Tr seconds.  Tr is configurable

Maenpaa & Camarillo     Expires December 28, 2014              [Page 14]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

       and has a default of 15 seconds.  Although STUN Binding
       Indications do not generate a response, the fact that a peer has
       failed can be learned from the lack of packets (Binding
       Indications or application protocol packets) received from the
       peer.  If the remote peer fails to reply to the Ping request, the
       sender should consider the remote peer to have failed.

   As an alternative to relying on STUN keepalives to detect peer
   failure, a peer could send additional, frequent RELOAD messages to
   every peer in its Connection Table.  These messages could be Update
   requests, in which case they would serve two purposes: detecting peer
   failure and stabilization.  However, as the cost of this approach can
   be very high in terms of bandwidth consumption and traffic load,
   especially in large-scale overlays experiencing churn, its use is NOT
   RECOMMENDED.

6.4.  Estimating Join Rate

   Reference [ghinita2006] proposes that a peer can estimate the join
   rate based on the uptime of the peers in its routing table.  An
   increase in peer join rate will be reflected by a decrease in the
   average age of peers in the routing table.  Thus, each peer maintaind
   an array of the ages of the peers in its routing table sorted in
   increasing order.  Using this information, an estimate of the global
   peer join rate L is calculated as:

                                  N
                    L = ----------------------,
                         Ages[floor(rsize/2)]

   where Ages is an array containing the ages of the peers in the
   routing table sorted in increasing order and rsize is the size of the
   routing table.  It has been shown that the estimate obtained by using
   this method is accurate within 22% of the real join rate
   [ghinita2006].  Of course, the size of the routing table affects the
   accuracy.

   In order for this mechanism to work, peers need to exchange
   information about the time they have been present in the overlay.
   Peers receive the uptimes of their successors and predecessors during
   the stabilization operations since all Update requests carry uptime
   values.  A joining peer learns the uptime of the admitting peer since
   it receives an Update from the admitting peer during the join
   procedure.  Peers learn the uptimes of new fingers since they can
   fetch the uptime using a Probe request after having attached to the
   new finger.

Maenpaa & Camarillo     Expires December 28, 2014              [Page 15]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

6.5.  Estimate Sharing

   To improve the accuracy of network size, join rate, and leave rate
   estimates, peers share their estimates.  When the stabilization timer
   fires, a peer selects number-of-peers-to-probe random peers from its
   finger table and send each of them a Probe request.  The targets of
   Probe requests are selected from the finger table rather than from
   the neighbor table since neighbors are likely to make similar errors
   when calculating their estimates. number-of-peers-to-probe is a new
   element in the overlay configuration document.  It is defined in
   Section 7.  Both the Probe request and the answer returned by the
   target peer MUST contain a new message extension whose
   MessageExtensionType is 'self_tuning_data'.  This extension type is
   defined in Section 9.1.  The extension_contents field of the
   MessageExtension structure MUST contain a SelfTuningData structure:

               struct {
                 uint32                   network_size;
                 uint32                   join_rate;
                 uint32                   leave_rate;
               } SelfTuningData;

   The contents of the SelfTuningData structure are as follows:

   network_size

      The latest network size estimate calculated by the sender.

   join_rate

      The latest join rate estimate calculated by the sender.

   leave_rate

      The latest leave rate estimate calculated by the sender.

   The join and leave rates are expressed as joins or failures per 24
   hours.  As an example, if the global join rate estimate a peer has
   calculated is 0.123 peers/s, it would include in the join_rate field
   the ceiling of the value 10627.2 (24*60*60*0.123 = 10627.2), that is,
   the value 10628.

Maenpaa & Camarillo     Expires December 28, 2014              [Page 16]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   The 'type' field of the MessageExtension structure MUST be set to
   contain the value 'self_tuning_data'.  The 'critical' field of the
   structure MUST be set to False.

   A peer stores all estimates it receives in Probe requests and answers
   during a stabilization interval.  When the stabilization timer fires,
   the peer calculates the estimates to be used during the next
   stabilization interval by taking the 75th percentile (i.e., third
   quartile) of a data set containing its own estimate and the received
   estimates.

   The default value for number-of-peers-to-probe is 4.  This default
   value is recommended to allow a peer to receive a sufficiently large
   set of estimates from other peers.  With a value of 4, a peer
   receives four estimates in Probe answers.  On the average, each peer
   also receives four Probe requests each carrying an estimate.  Thus,
   on the average, each peer has nine estimates (including its own) that
   it can use at the end of the stablization interval.  A value smaller
   than 4 is NOT RECOMMENDED to keep the number of received estimates
   high enough.  As an example, if the value were 2, there would be
   peers in the overlay that would only receive two estimates during a
   stabilization interval.  Such peers would only have three estimates
   available at the end of the interval, which may not be reliable
   enough since even a single exceptionally high or low estimate can
   have a large impact.

6.6.  Calculating the Stabilization Interval

   According to [liben-nowell2002], a Chord network in a ring-like state
   remains in a ring-like state as long as peers send
   Omega(square(log(N))) messages before N new peers join or N/2 peers
   fail.  We can use the estimate of peer failure rate, U, to calculate
   the time Tf in which N/2 peers fail:

                                  1
                           Tf = ------
                                 2*U

   Based on this estimate, a stabilization interval Tstab-1 is
   calculated as:

                                           Tf
                           Tstab-1 = -----------------
                                      square(log2(N))

   On the other hand, the estimated join rate L can be used to calculate
   the time in which N new peers join the overlay.  Based on the
   estimate of L, a stabilization interval Tstab-2 is calculated as:

Maenpaa & Camarillo     Expires December 28, 2014              [Page 17]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

                                               N
                            Tstab-2 = ---------------------
                                       L * square(log2(N))

   Finally, the actual stabilization interval Tstab that is used can be
   obtained by taking the minimum of Tstab-1 and Tstab-2.

   The results obtained in [maenpaa2009] indicate that making the
   stabilization interval too small has the effect of making the overlay
   less stable (e.g., in terms of detected loops and path failures).
   Thus, a lower limit should be used for the stabilization period.
   Based on the results in [maenpaa2009], a lower limit of 15s is
   RECOMMENDED, since using a stabilization period smaller than this
   will with a high probability cause too much traffic in the overlay.

7.  Overlay Configuration Document Extension

   This document extends the RELOAD overlay configuration document by
   adding one new element, "number-of-peers-to-probe", inside each
   "configuration" element.

   self-tuning:number-of-peers-to-probe:  The number of fingers to which
      Probe requests are sent to obtain their network size, join rate,
      and leave rate estimates.  The default value is 4.

   The Relax NG Grammar for this element is:

   namespace self-tuning = "urn:ietf:params:xml:ns:p2p:self-tuning"

   parameter &= element self-tuning:number-of-peers-to-probe {
   xsd:unsignedInt }?

   This namespace is added into the <mandatory-extension> element in the
   overlay configuration file.

8.  Security Considerations

   In the same way as malicious or compromised peers implementing the
   RELOAD base protocol [RFC6940] can advertise false network metrics or
   distribute false routing table information for instance in RELOAD
   Update messages, malicious peers implementing this specification may
   share false join rate, leave rate, and network size estimates.  For
   such attacks, the same security concerns apply as in the RELOAD base
   specification.  In addition, as long as the amount of malicious peers
   in the overlay remains modest, the statistical mechanisms applied in
   Section 6.5 (i.e., the use of 75th percentiles) to process the shared
   estimates a peer obtains help ensure that estimates that are clearly
   different from (i.e., larger or smaller than) other received

Maenpaa & Camarillo     Expires December 28, 2014              [Page 18]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   estimates will not significantly influence the process of adapting
   the stabilization interval and routing table size.  However, it
   should be noted that if an attacker is able to impersonate a high
   number of other peers in the overlay in strategic locations, it may
   be able to send a high enough number of false estimates to a victim
   and therefore influence the victim's choice of a stabilization
   interval.

9.  IANA Considerations

9.1.  Message Extensions

   This document introduces one additional extension to the "RELOAD
   Extensions" Registry:

                  +------------------+-------+---------------+
                  | Extension Name   |  Code | Specification |
                  +------------------+-------+---------------+
                  | self_tuning_data |   0x3 |      RFC-AAAA |
                  +------------------+-------+---------------+

   The contents of the extension are defined in Section 6.5.

   Note to RFC Editor: please replace AAAA with the RFC number for this
   specification.

9.2.  New Overlay Algorithm Type

   This document introduces one additional overlay algorithm type to the
   "RELOAD Overlay Algorithm Types" Registry:

                  +-------------------+-----------+
                  | Algorithm Name    | Reference |
                  +-------------------+-----------+
                  | CHORD-SELF-TUNING | RFC-AAAA  |
                  +-------------------+-----------+

   Note to RFC Editor: please replace AAAA with the RFC number for this
   specification.

9.3.  A New IETF XML Registry

   This document registers one new URI for the self-tuning namespace in
   the "ns" subregistry of the IETF XML registry defined in [RFC3688].

   URI: urn:ietf:params:xml:ns:p2p:self-tuning

Maenpaa & Camarillo     Expires December 28, 2014              [Page 19]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   Registrant Contact: The IESG

   XML: N/A, the requested URI is an XML namespace

10.  Acknowledgments

   The authors would like to thank Jani Hautakorpi for his contributions
   to the document.  The authors would also like to thank Carlos
   Bernardos, Martin Durst, Alissa Cooper, Tobias Gondrom, and Barry
   Leiba for their comments on the document.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols", RFC 5245, April
              2010.

   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
              October 2008.

   [RFC6940]  Jennings, C., Lowekamp, B., Rescorla, E., Baset, S., and
              H. Schulzrinne, "REsource LOcation And Discovery (RELOAD)
              Base Protocol", RFC 6940, January 2014.

11.2.  Informative References

   [CAN]      Ratnasamy, S., Francis, P., Handley, M., Karp, R., and S.
              Schenker, "A Scalable Content-Addressable Network", In
              Proceedings of the 2001 Conference on Applications,
              Technologies, Architectures and Protocols for Computer
              Communications pp. 161-172, August 2001.

   [Chord]    Stoica, I., Morris, R., Liben-Nowell, D., Karger, D.,
              Kaashoek, M., Dabek, F., and H. Balakrishnan, "Chord: A
              Scalable Peer-to-peer Lookup Service for Internet
              Applications", IEEE/ACM Transactions on Networking Volume
              11, Issue 1, pp. 17-32, February 2003.

Maenpaa & Camarillo     Expires December 28, 2014              [Page 20]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   [Pastry]   Rowstron, A. and P. Druschel, "Pastry: Scalable,
              Decentralized Object Location and Routing for Large-Scale
              Peer-to-Peer Systems", In Proceedings of the IFIP/ACM
              International Conference on Distribued Systems Platforms
              pp. 329-350, November 2001.

   [RFC3688]  Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
              January 2004.

   [binzenhofer2006]
              Binzenhofer, A., Kunzmann, G., and R. Henjes, "A Scalable
              Algorithm to Monitor Chord-Based P2P Systems at Runtime",
              In Proceedings of the 20th IEEE International Parallel and
              Distributed Processing Symposium (IPDPS) pp. 1-8, April
              2006.

   [ghinita2006]
              Ghinita, G. and Y. Teo, "An Adaptive Stabilization
              Framework for Distributed Hash Tables", In Proceedings of
              the 20th IEEE International Parallel and Distributed
              Processing Symposium (IPDPS) pp. 29-38, April 2006.

   [horowitz2003]
              Horowitz, K. and D. Malkhi, "Estimating Network Size from
              Local Information", Information Processing Letters Volume
              88, Issue 5, pp. 237-243, December 2003.

   [kostoulas2005]
              Kostoulas, D., Psaltoulis, D., Gupta, I., Birman, K., and
              A. Demers, "Decentralized Schemes for Size Estimation in
              Large and Dynamic Groups", In Proceedings of the 4th IEEE
              International Symposium on Network Computing and
              Applications pp. 41-48, July 2005.

   [krishnamurthy2008]
              Krishnamurthy, S., El-Ansary, S., Aurell, E., and S.
              Haridi, "Comparing Maintenance Strategies for Overlays",
              In Proceedings of the 16th Euromicro Conference on
              Parallel, Distributed and Network-Based Processing pp.
              473-482, February 2008.

   [li2004]   Li, J., Strinbling, J., Gil, T., Morris, R., and M.
              Kaashoek, "Comparing the Performance of Distributed Hash
              Tables Under Churn", Peer-to-Peer Systems III, volume 3279
              of Lecture Notes in Computer Science Springer, pp. 87-99,
              February 2005.

Maenpaa & Camarillo     Expires December 28, 2014              [Page 21]
Internet-Draft         Self-tuning DHT for RELOAD              June 2014

   [liben-nowell2002]
              Liben-Nowell, D., Balakrishnan, H., and D. Karger,
              "Observations on the Dynamic Evolution of Peer-to-Peer
              Networks", In Proceedings of the 1st International
              Workshop on Peer-to-Peer Systems (IPTPS) pp. 22-33, March
              2002.

   [maenpaa2009]
              Maenpaa, J. and G. Camarillo, "A Study on Maintenance
              Operations in a Chord-Based Peer-to-Peer Session
              Initiation Protocol Overlay Network", In Proceedings of
              the 23rd IEEE International Parallel and Distributed
              Processing Symposium (IPDPS) pp. 1-9, May 2009.

   [mahajan2003]
              Mahajan, R., Castro, M., and A. Rowstron, "Controlling the
              Cost of Reliability in Peer-to-Peer Overlays", In
              Proceedings of the 2nd International Workshop on Peer-to-
              Peer Systems (IPTPS) pp. 21-32, February 2003.

   [rhea2004]
              Rhea, S., Geels, D., Roscoe, T., and J. Kubiatowicz,
              "Handling Churn in a DHT", In Proceedings of the USENIX
              Annual Technical Conference pp. 127-140, June 2004.

   [weiss1998]
              Weiss, M., "Data Structures and Algorithm Analysis in
              C++", Addison-Wesley Longman Publishin Co., Inc. 2nd
              Edition, ISBN:0201361221, 1998.

Authors's liveliness is only important
   when there is outbound traffic to be sent.  To this end, an
   implementation can initiate a DPD exchange (i.e., send an R-U-THERE
   message) when there has been some period of idleness, followed by the
   desire to send outbound traffic.  Likewise, an entity can initiate a
   DPD exchange if it has sent outbound IPSec traffic, but not received
   any inbound IPSec packets in response.  A complete DPD exchange
   (i.e., transmission of R-U-THERE and receipt of corresponding R-U-
   THERE-ACK) will serve as proof of liveliness until the next idle
   period.

   Again, since DPD does not mandate any interval, this "idle period"
   (or "worry metric") is left as an implementation decision.  It is not
   a negotiated value.

Huang, et al.                Informational                      [Page 9]
RFC 3706                Detecting Dead IKE Peers           February 2004

5.6.  Comparisons

   The performance benefit that DPD offers over traditional keepalives-
   and heartbeats-schemes comes from the fact that regular messages do
   not need to be sent.  Returning to the examples presented in section
   4.1, a keepalive implementation such as the one presented would
   require one timer to signal when to send a HELLO message and another
   timer to "timeout" the ACK from the peer (this could also be the
   retransmit timer).  Similarly, a heartbeats scheme such as the one
   presented in section 4.2 would need to keep one timer to signal when
   to send a HELLO, as well as another timer to signal the expectation
   of a HELLO from the peer.  By contrast a DPD scheme needs to keep a
   timestamp to keep track of the last received traffic from the peer
   (thus marking beginning of the "idle period").  Once a DPD R-U-THERE
   message has been sent, an implementation need only maintain a timer
   to signal retransmission.  Thus, the need to maintain active timer
   state is reduced, resulting in a scalability improvement (assuming
   maintaining a timestamp is less costly than an active timer).
   Furthermore, since a DPD exchange only occurs if an entity has not
   received traffic recently from its peer, the number of IKE messages
   to be sent and processed is also reduced.  As a consequence, the
   scalability of DPD is much better than keepalives and heartbeats.

   DPD maintains the HELLO/ACK model presented by keepalives, as it
   follows that an exchange is initiated only by an entity interested in
   the liveliness of its peer.

6.  Resistance to Replay Attack and False Proof of Liveliness

6.1.  Sequence Number in DPD Messages

   To guard against message replay attacks and false proof of
   liveliness, a 32-bit sequence number MUST be presented with each R-
   U-THERE message.  A responder to an R-U-THERE message MUST send an
   R-U-THERE-ACK with the same sequence number.  Upon receipt of the R-
   U-THERE-ACK message, the initial sender SHOULD check the validity of
   the sequence number.  The initial sender SHOULD reject the R-U-
   THERE-ACK if the sequence number fails to match the one sent with the
   R-U-THERE message.

   Additionally, both the receiver of the R-U-THERE and the R-U-THERE-
   ACK message SHOULD check the validity of the Initiator and Responder
   cookies presented in the SPI field of the payload.

Huang, et al.                Informational                     [Page 10]
RFC 3706                Detecting Dead IKE Peers           February 2004

6.2.  Selection and Maintenance of Sequence Numbers

   As both DPD peers can initiate a DPD exchange (i.e., both peers can
   send R-U-THERE messages), each peer MUST maintain its own sequence
   number for R-U-THERE messages.  The first R-U-THERE message sent in a
   session MUST be a randomly chosen number.  To prevent rolling past
   overflowing the 32-bit boundary, the high-bit of the sequence number
   initially SHOULD be set to zero.  Subsequent R-U-THERE messages MUST
   increment the sequence number by one.  Sequence numbers MAY reset at
   the expiry of the IKE SA, moving to a newly chosen random number.
   Each entity SHOULD also maintain its peer's R-U-THERE sequence
   number, and an entity SHOULD reject the R-U-THERE message if it fails
   to match the expected sequence number.

   Implementations MAY maintain a window of acceptable sequence numbers,
   but this specification makes no assumptions about how this is done.
   Again, it is an implementation specific detail.

7.  Security Considerations

   As the previous section highlighted, DPD uses sequence numbers to
   ensure liveliness.  This section describes the advantages of using
   sequence numbers over random nonces to ensure liveliness.

   While sequence numbers do require entities to keep per-peer state,
   they also provide an added method of protection in certain replay
   attacks.  Consider a case where peer A sends peer B a valid DPD R-U-
   THERE message.  An attacker C can intercept this message and flood B
   with multiple copies of the messages.  B will have to decrypt and
   process each packet (regardless of whether sequence numbers or nonces
   are in use).  With sequence numbers B can detect that the packets are
   replayed: the sequence numbers in these replayed packets will not
   match the incremented sequence number that B expects to receive from
   A.  This prevents B from needing to build, encrypt, and send ACKs.
   By contrast, if the DPD protocol used nonces, it would provide no way
   for B to detect that the messages are replayed (unless B maintained a
   list of recently received nonces).

   Another benefit of sequence numbers is that it adds an extra
   assurance of the peer's liveliness.  As long as a receiver verifies
   the validity of a DPD R-U-THERE message (by verifying its incremented
   sequence number), then the receiver can be assured of the peer's
   liveliness by the very fact that the sender initiated the query.
   Nonces, by contrast, cannot provide this assurance.

Huang, et al.                Informational                     [Page 11]
RFC 3706                Detecting Dead IKE Peers           February 2004

8.  IANA Considerations

   There is no IANA action required for this document.  DPD uses notify
   numbers from the private range.

9.  References

9.1.  Normative Reference

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

9.2.  Informative References

   [2]  Harkins, D. and D. Carrel, "The Internet Key Exchange (IKE)",
        RFC 2409, November 1998.

   [3]  Kent, S. and R. Atkinson, "Security Architecture for the
        Internet Protocol", RFC 2401, November 1998.

10.  Editors' Addresses

   Geoffrey Huang
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134

   Phone: (408) 525-5354
   EMail: ghuang@cisco.com

   Stephane Beaulieu
   Cisco Systems, Inc.
   2000 Innovation Drive
   Kanata, ON
   Canada, K2K 3E8

   Phone: (613) 254-3678
   EMail: stephane@cisco.com

   Dany Rochefort
   Cisco Systems, Inc.
   124 Grove Street, Suite 205
   Franklin, MA 02038

   Phone: (508) 553-8644
   EMail: danyr@cisco.com

Huang, et al.                Informational                     [Page 12]
RFC 3706                Detecting Dead IKE Peers           February 2004

11.  Full Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78 and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology
   described in this document or the extent to which any license
   under such rights might or might not be available; nor does it
   represent that it has made any independent effort to identify any
   such rights.  Information on the procedures with respect to
   rights in RFC documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention
   any copyrights, patents or patent applications, or other
   proprietary rights that may cover technology that may be required
   to implement this standard.  Please address the information to the
   IETF at ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

Huang, et al.                Informational                     [Page 13]
#x27; Addresses