Congestion Exposure (ConEx) Concepts and Abstract Mechanism
draft-ietf-conex-abstract-mech-04

The information below is for an old version of the document.
Document	Type	This is an older version of an Internet-Draft that was ultimately published as RFC 7713.
	Authors	Matt Mathis , Bob Briscoe
	Last updated	2012-03-12
	RFC stream	Internet Engineering Task Force (IETF)
	Reviews	GENART Telechat review (of -13) by Robert Sparks Ready SECDIR Last Call review (of -12) by Donald Eastlake Has nits GENART Last Call review (of -12) by Robert Sparks Ready
Stream	WG state	WG Document
	Document shepherd	(None)
IESG	IESG state	Became RFC 7713 (Informational)
	Consensus boilerplate	Unknown
	Telechat date	(None)
	Responsible AD	Wesley Eddy
	Send notices to	conex-chairs@tools.ietf.org, draft-ietf-conex-abstract-mech@tools.ietf.org
Congestion Exposure (ConEx) Working                            M. Mathis
Group                                                        Google, Inc
Internet-Draft                                                B. Briscoe
Intended status: Informational                                        BT
Expires: September 13, 2012                               March 12, 2012

      Congestion Exposure (ConEx) Concepts and Abstract Mechanism
                   draft-ietf-conex-abstract-mech-04

Abstract

   This document describes an abstract mechanism by which senders inform
   the network about the congestion encountered by packets earlier in
   the same flow.  Today, network elements at any layer may signal
   congestion to the receiver by dropping packets or by ECN markings,
   and the receiver passes this information back to the sender in
   transport-layer feedback.  The mechanism described here enables the
   sender to also relay this congestion information back into the
   network in-band at the IP layer, such that the total amount of
   congestion from all elements on the path is revealed to all IP
   elements along the path, where it could, for example, be used to
   provide input to traffic management.  This mechanism is called
   congestion exposure or ConEx.  The companion document "ConEx Concepts
   and Use Cases" provides the entry-point to the set of ConEx
   documentation.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 13, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

Mathis & Briscoe       Expires September 13, 2012               [Page 1]
Internet-Draft    ConEx Concepts and Abstract Mechanism       March 2012

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Requirements for the ConEx Abstract Mechanism
     2.1.  Requirements for ConEx Signals
     2.2.  Requirements for the Audit Function
     2.3.  Requirements for non-abstract ConEx specifications
   3.  Encoding Congestion Exposure
     3.1.  Naive Encoding
     3.2.  Null Encoding
     3.3.  ECN Based Encoding
     3.4.  Independent Bits
     3.5.  Codepoint Encoding
     3.6.  Units Implied by an Encoding
   4.  Congestion Exposure Components
     4.1.  Network Devices (Not modified)
     4.2.  Modified Senders
     4.3.  Receivers (Optionally Modified)
     4.4.  Policy Devices
       4.4.1.  Congestion Monitoring Devices
       4.4.2.  Rest-of-Path Congestion Monitoring
       4.4.3.  Congestion Policers
     4.5.  Audit
       4.5.1.  Using Credit to Simplify Audit
   5.  Support for Incremental Deployment
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  Comments Solicited
   10. References
     10.1. Normative References
     10.2. Informative References

1.  Introduction

   This document describes an abstract mechanism by which, to a first
   approximation, senders inform the network about the congestion
   encountered by packets earlier in the same flow.  It is not a
   complete protocol specification, because it is known that designing
   an encoding (e.g. packet formats, codepoint allocations, etc.) is
   likely to entail compromises that preclude some uses of the protocol.
   The goal of this document is to provide a framework for developing
   and testing algorithms to evaluate the benefits of the ConEx protocol
   and to evaluate the consequences of the compromises in various
   different encoding designs.

   A companion document [I-D.ietf-conex-concepts-uses] provides the
   entry point to the set of ConEx documentation.  It outlines concepts
   that are pre-requisites to understanding why ConEx is useful, and it
   outlines various ways that ConEx might be used.

   As transport protocols continually seek out more network capacity,
   network elements signal whenever congestion results, and the
   transports are responsible for controlling this network
   congestion.  The more a transport tries to use capacity that others
   want to use, the more congestion signals will be attributable to that
   transport.  Likewise, the more transport sessions sustained by a user
   and the longer the user sustains them, the more congestion signals
   will be attributable to that user.  ConEx ensures that the resulting
   congestion signals are sufficiently visible and robust, because they
   are an ideal metric for networks to use as the basis of traffic
   management or other related functions.

   Networks indicate congestion by three possible signals: packet loss,
   ECN marking or queueing delay.  ECN marking and some packet loss may
   be the outcome of Active Queue Management (AQM), which the network
   uses to warn senders to reduce their rates.  Packet loss is also the
   natural consequence of complete exhaustion of a buffer or other
   network resource.  Some experimental transport protocols and TCP
   variants infer impending congestion from increasing queuing delay.
   However, delay is too amorphous to use as a congestion metric.
   Therefore ConEx is only concerned with ECN markings and packet
   losses, because they are unambiguous signals of congestion.

   In both cases the congestion signals follow the route indicated in
   Figure 1.  A congested network device sends a signal in the data
   stream on the forward path to the transport receiver, the receiver
   passes it back to the sender through transport level feedback, and
   the sender makes some congestion control adjustment.

   This document extends the capabilities of the Internet protocol suite
   with the addition of a new Congestion Exposure signal.  To a first
   approximation this signal, also shown in Figure 1, relays the
   congestion information from the transport sender back through the
   internetwork layer where it is visible to all internetwork layer
   devices along the forward path.  This document frames the engineering
   problem of designing the ConEx signal.  The requirements are
   described in Section 2 and some example encodings are presented in
   Section 3.

   This new signal is expressly designed to support a variety of new
   policy mechanisms that might be used to instrument, monitor or manage
   traffic.  The policy devices are not shown in Figure 1 but might be
   placed anywhere along the forward data path.  They are described in
   Section 4.4.

   ,---------.                                               ,---------.
   |Transport|                                               |Transport|
   | Sender  |   .                                           |Receiver |
   |         |  /|___________________________________________|         |
   |     ,-<---------------Congestion-Feedback-Signals--<--------.     |
   |     |   |/                                              |   |     |
   |     |   |\           Transport Layer Feedback Flow      |   |     |
   |     |   | \  ___________________________________________|   |     |
   |     |   |  \|                                           |   |     |
   |     |   |   '         ,-----------.               .     |   |     |
   |     |   |_____________|           |_______________|\    |   |     |
   |     |   |    IP Layer |           |  Data Flow      \   |   |     |
   |     |   |             |(Congested)|                  \  |   |     |
   |     |   |             |  Network  |--Congestion-Signals--->-'     |
   |     |   |             |  Device   |                    \|         |
   |     |   |             |           |                    /|         |
   |     `----------->--(new)-IP-Layer-ConEx-Signals-------->|         |
   |         |             |           |                  /  |         |
   |         |_____________|           |_______________  /   |         |
   |         |             |           |               |/    |         |
   `---------'             `-----------'               '     `---------'

   Not shown are policy devices that use the ConEx Signal to monitor or
   manage traffic and audit devices to monitor the accuracy of ConEx
   signals.  These devices might be anywhere along the forward path.
   They are discussed in detail in Section 4.4 and Section 4.5,
   respectively.

                                 Figure 1

   Since the policy devices can affect how traffic is treated it is
   assumed that there is an intrinsic motivation for users, applications
   or operating systems to understate the congestion that they are
   causing.  It is important to be able to audit ConEx signals, and to
   be able to apply sufficient sanction to discourage cheating of
   congestion policies.  The general approach to auditing is to count
   and compare congestion signals and ConEx signals on the forward path.
   Many ConEx design constraints come from the need to assure that the
   audit function is sufficiently robust.  The audit function is
   described in Section 4.5; however, significant portions of this
   document (and prior research [Refb-dis]) are motivated by issues
   relating to the audit function and making it robust.

   The congestion and ConEx signals shown in Figure 1 represent a series
   of discrete events: ECN marks or lost packets, carried by the forward
   data stream and fed back into the Internetwork layer.  The policy and
   audit functions are most likely to act on the accumulated values of
   these signals, for which we use the term "volume".  For example,
   traffic volume is the total number of bytes delivered, optionally
   over a specified time interval and over some aggregate of traffic
   (e.g. all traffic from a site), while loss-volume is the total
   number of bytes discarded from some aggregate over an interval.  The
   term congestion-volume is defined precisely in
   [I-D.ietf-conex-concepts-uses].  Note that volume per unit time is a
   rate.
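
   As an informal illustration of these terms (not part of any ConEx
   specification), the following Python sketch accumulates traffic
   volume and loss-volume for one aggregate and derives rates over an
   interval.  The class and method names are hypothetical, and
   congestion-volume is deliberately left out because its precise
   definition lives in [I-D.ietf-conex-concepts-uses].

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class VolumeCounter:
    """Accumulated byte volumes for one traffic aggregate (hypothetical)."""
    traffic_bytes: int = 0   # total bytes delivered
    loss_bytes: int = 0      # total bytes discarded

    def delivered(self, size: int) -> None:
        self.traffic_bytes += size

    def discarded(self, size: int) -> None:
        self.loss_bytes += size

    def rates(self, interval_s: float) -> Tuple[float, float]:
        """Volume per unit time is a rate (bytes per second here)."""
        return (self.traffic_bytes / interval_s,
                self.loss_bytes / interval_s)
```

   A monitoring point would call delivered() or discarded() once per
   packet and read the counters at the end of each interval.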

   One of the design goals of the ConEx protocol is that none of the
   important policy mechanisms requires per flow state, and that policy
   mechanisms can be implemented for heavily aggregated traffic in the
   core of the Internet with complexity akin to accumulating marking
   volumes per logical link.  Ideally it would also be possible to audit
   ConEx signals without per flow state; however, this is not always
   possible.  Since auditing can be done near the edges of the network
   where traffic is less aggregated, per flow state is more easily
   tolerated.  Also, the flow-state required for audit creates itself as
   it detects new flows.  Therefore a flow will not fail if it is
   re-routed away from the audit box currently holding its flow-state.
   Flow-state for auditing is discussed further in Section 4.5.  In
   summary: i) flow state for auditing does not require route pinning;
   ii) auditing at the edges, with limited per flow state, enables
   policy in the core, without any per flow state.

   There is a long standing argument over units of congestion: bytes vs
   packets (see [I-D.ietf-tsvwg-byte-pkt-congest] and its references).
   This document does not take a strong position on this issue.
   However, we make the following observations: the most expensive links
   in the Internet, in terms of cost per bit, are all at lower data
   rates, where transmission times are large and packet sizes are
   important.  In order for a policy to consider wire time, it needs to
   know the number of congested bytes.  However, high speed networking
   equipment and the transport protocols themselves typically gauge
   resource consumption and congestion in terms of packets.  This may
   prove to be problematic for application protocols that have irregular
   packet sizes, such as BGP, SPDY and some variable rate video encoding
   schemes.  The units of congestion must be an explicitly stated
   property of any proposed encoding, and the consequences of that
   design decision must be evaluated along with other aspects of the
   design.
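
   A small worked example may help show why the choice of unit matters
   for irregular packet sizes.  The sketch below is only an
   illustration: the function name and the packet list are made up,
   and it simply compares the fraction of congested traffic when the
   same flow is counted in packets and in bytes.

```python
def congestion_fraction(packets, unit="bytes"):
    """Fraction of traffic that was congested, counted in the given unit.

    packets: list of (size_in_bytes, congested) tuples.
    unit:    "bytes" weights each packet by its size,
             "packets" weights every packet equally.
    """
    if unit == "bytes":
        weight = lambda size: size
    else:
        weight = lambda size: 1
    total = sum(weight(size) for size, _ in packets)
    marked = sum(weight(size) for size, congested in packets if congested)
    return marked / total


# One full-sized congested packet among three small uncongested ones.
flow = [(1500, True), (40, False), (40, False), (40, False)]
```

   For this flow the packet-based fraction is 0.25, whereas the
   byte-based fraction is 1500/1620, or roughly 0.93.  An encoding
   therefore has to say which of the two it means.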

   To be successful the ConEx protocol must have the property that the
   relevant stakeholders each have the incentive to unilaterally start
   on each stage of partial deployment, which in turn creates incentives
   for further deployment.  Furthermore, legacy systems that will never
   be upgraded do not become a barrier to deploying ConEx.  Issues
   relating to partial deployment are described in Section 5.

   Note that ConEx signals are not intended to be used for fine-grained
   congestion control.  They are anticipated to be most useful at longer
   time scales; for example, the total congestion caused by a user might
   serve as an input to higher level policy or accountability
   functions, designed to create incentives for improving user behavior,
   such as choosing to send large quantities of data at off peak times,
   at lower data rates or with less aggressive protocols such as
   LEDBAT [I-D.ietf-ledbat-congestion] (see
   [I-D.ietf-conex-concepts-uses]).

   Ultimately ConEx signals have the potential to provide a mechanism to
   regulate global Internet congestion.  From the earliest days of
   congestion control research there has been a concern that there is no
   mechanism to prevent transport designers from incrementally making
   protocols more aggressive without bound and spiraling to a "tragedy
   of the commons" Internet congestion collapse.  The "TCP friendly"
   paradigm was created in part to forestall this failure.  However, it
   no longer commands any authority because it has little to say about
   the Internet of today, which has moved beyond the scaling range of
   standard TCP.  Therefore most transports and applications are opening
   arbitrarily large numbers of connections or using arbitrary levels of
   aggressiveness.  ConEx represents a recognition that the IETF cannot
   regulate this space directly because it concerns the behaviour of
   users and applications, not individual transport protocols.  Instead
   the IETF can give network operators the protocol tools to arbitrate
   the space themselves, with better bulk traffic management.  This in
   turn should create incentives for users, and for designers of
   applications and of transport protocols, to be more mindful about
   contributing to congestion.

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   ConEx signals in IP packet headers from the sender to the network:
   Not-ConEx:  The transport is not ConEx-capable.
   ConEx-Capable:  The transport is ConEx-Capable.  This is the opposite
      of Not-ConEx.
   ConEx Signal:  A packet sent by a ConEx Capable transport.  It
      carries at least one of the following signals:
      Re-Echo-Loss:  The transport has experienced a loss.
      Re-Echo-ECN:  The transport has experienced an ECN mark.
      Credit:  The transport is building up credit to allow for any
         future delay in expected ConEx signals (see Section 4.5.1).
      ConEx-Not-Marked:  The transport is ConEx-capable but is signaling
         none of Re-Echo-Loss, Re-Echo-ECN or Credit.
   ConEx-Marked:  At least one of Re-Echo-Loss, Re-Echo-ECN or Credit.

2.  Requirements for the ConEx Abstract Mechanism

   First time readers may wish to skim this section, since it is easier
   to understand after reading the rest of the document.

2.1.  Requirements for ConEx Signals

   Ideally, all the following requirements would be met by a Congestion
   Exposure Signal.  However it is already known that some compromises
   will be necessary, and therefore all the requirements are expressed
   with the keyword 'SHOULD' rather than 'MUST'.  The only mandatory
   requirement is that a concrete protocol description MUST give sound
   reasoning if it chooses not to meet a requirement:
   a.  The ConEx Signal SHOULD be visible to internetwork layer devices
       along the entire path from the transport sender to the transport
       receiver.  Equivalently, it SHOULD be present in the IPv4 or IPv6
       header, and in the outermost IP header if using IP in IP
       tunneling.  The ConEx Signal SHOULD be immutable once set by the
       transport sender.  A corollary of these requirements is that the
       chosen ConEx encoding SHOULD pass silently without modification
       through pre-existing networking gear.
   b.  The ConEx Signal SHOULD be useful under only partial deployment.
       A minimal deployment SHOULD only require changes to transport
       senders.  Furthermore, partial deployment SHOULD create
       incentives for additional deployment, both in terms of enabling
       ConEx on more devices and adding richer features to existing
       devices.  Nonetheless, ConEx deployment need never be universal,
       and it is anticipated that some hosts and some transports may
       never support the ConEx Protocol and some networks may never use
       the ConEx Signals.
   c.  The ConEx signal SHOULD be timely.  There will be a minimum delay
       of one RTT, and often longer if the transport protocol sends
       infrequent feedback (consider RTCP [RFC3550] for example).  This
       delay complicates auditing, and SHOULD be minimized.
   d.  The ConEx signal SHOULD be accurate and auditable.  The general
       approach is to observe the volume of congestion signals and ConEx
       signals on the forward data path and verify that the ConEx
       signals do not under-represent the congestion signals (see
       Section 4.5).  The simplest mechanism to compensate for the round
       trip delay between the signals is to include a "credit" signal to
       cover the yet to be observed congestion that might occur during
       this delay (see Section 4.5.1 for details).  Furthermore, the
       ConEx signals for packet loss and ECN marking SHOULD have
       distinct encodings because they are likely to require different
       auditing techniques or vantage points.
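
   The comparison in requirement (d) can be pictured with a minimal
   Python sketch.  It is an assumption-laden illustration, not the
   audit algorithm of Section 4.5: the class name, the single running
   balance and the treatment of Credit as ordinary ConEx-marked volume
   are all simplifications chosen here, and the unit (bytes or packets)
   is left to the caller.

```python
class AuditBalance:
    """Running difference between ConEx signals and congestion signals."""

    def __init__(self):
        # ConEx-marked volume seen so far minus congestion volume seen so far.
        self.balance = 0

    def conex_marked(self, size):
        """A packet carrying Re-Echo-Loss, Re-Echo-ECN or Credit passed by."""
        self.balance += size

    def congestion_signal(self, size):
        """An ECN mark or an inferred loss was observed on the forward path."""
        self.balance -= size

    def under_declared(self):
        """True if ConEx signals currently under-represent congestion."""
        return self.balance < 0
```

   In this sketch a sender that sends Credit before congestion occurs
   keeps the balance non-negative during the round trip it takes for
   feedback to arrive, whereas a sender that sends no Credit goes
   negative on the first congestion signal.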

2.2.  Requirements for the Audit Function

   The role and constraints on the audit function are described in
   Section 4.5.  There is no intention to standardise the audit
   function.  However, it is necessary to lay down the following
   normative constraints on audit behaviour so that transport designers
   will know what to design against and implementers of audit devices
   will know what pitfalls to avoid:
   Minimal False Hits:  Audit SHOULD introduce minimal false hits for
      honest flows;
   Minimal False Misses:  Audit SHOULD quickly detect and sanction
      dishonest flows, ideally on the first dishonest packet;
   Transport Oblivious:  Audit SHOULD NOT be designed around one
      particular rate response, such as any particular TCP congestion
      control algorithm or one particular resource sharing regime such
      as TCP-friendliness [RFC3448].  An important goal is to give
      ingress networks the freedom to unilaterally allow different rate
      responses to congestion and different resource sharing regimes
      [Evol_cc], without having to coordinate with other networks over
      details of individual flow behaviour;
   Sufficient Sanction:  Audit SHOULD introduce sufficient sanction
      (e.g. loss in goodput) so that senders cannot gain from
      understating congestion.  Audit sanctions SHOULD remove any gain
      from playing off losses at the audit function against higher
      allowed throughput at a congestion policer;
   Proportionate Sanction:  To the extent that the audit might be
      subject to false hits, the sanction SHOULD be proportionate to the
      degree to which congestion is understated.  If audit over-
      punishes, attackers will find ways to harness it into amplifying
      attacks on others.  Ideally audit should, in the long-run, cause
      the user to get no better performance than they would get by being
      accurate.
   Manage Memory Exhaustion:  Audit SHOULD be able to counter state
      exhaustion attacks.  For instance, if the audit function uses
      flow-state, it should not be possible for senders to exhaust its
      memory capacity by gratuitously sending numerous packets, each
      with a different flow ID.
   Identifier Accountability:  Audit SHOULD NOT be vulnerable to
      `identity whitewashing', where a transport can label a flow with a
      new ID more cheaply than paying the cost of continuing to use its
      current ID [CheapPseud].
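
   To make the "Proportionate Sanction" constraint concrete, the
   following sketch maps the degree of understatement to a drop
   probability.  The linear mapping and the function name are
   assumptions made purely for illustration; this document does not
   prescribe any particular sanction function.

```python
def sanction_drop_probability(congestion_volume, conex_volume):
    """Drop probability proportional to the understated share of congestion.

    Returns 0.0 for flows that declared at least as much as was observed,
    and rises linearly to 1.0 for flows that declared nothing.
    """
    if congestion_volume <= 0:
        return 0.0  # no congestion observed, nothing to sanction
    understated = max(0, congestion_volume - conex_volume)
    return min(1.0, understated / congestion_volume)
```

   With this mapping a flow that declares three quarters of the
   observed congestion is sanctioned a quarter as heavily as one that
   declares none, and over-declaring earns no sanction at all.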

2.3.  Requirements for non-abstract ConEx specifications

   An experimental ConEx specification SHOULD describe the following
   protocol details:
   Network Layer:
      A.  The specific ConEx signal encodings with packet formats, bit
          fields and/or code points;
      B.  An inventory of any conflated signals or any other effects
          that are known to compromise signal integrity;
      C.  A specification for signal units (bytes vs packets, etc), any
          approximations allowed and algorithms to do any implied
          conversions or accounting;
      D.  If the units are bytes a definition of which headers are
          included in the size of the packet;
      E.  How tunnels should propagate the ConEx encoding;
      F.  Whether the encoding fields are mutable or not, to ensure that
          header authentication, checksum calculation, etc. process them
          correctly;
      G.  Definition of any extensibility;
      H.  Backward and forward compatibility and potential migration
          strategies;
      I.  Any (hopefully optional) modification to data-plane forwarding
          dependent on the encoding (e.g. preferential discard,
          interaction with Diffserv, ECN etc.);
      J.  Any warning or error messages relevant to the encoding.
   Transport Layer:
      A.  A specification of any required changes to congestion feedback
          in particular transport protocols.
      B.  A specification (or minimally a recommendation) for how a
          transport should estimate credits at the beginning of a new
          connection.
      C.  @@@More TBA, incl ops & management@@@
   Security:
      A.  An example of a strong audit algorithm suitable for detecting
          if a single flow is misstating congestion.  This algorithm
          should present minimal false results, but need not have
          optimal scaling properties (e.g. may need per flow state).
      B.  An example of an audit algorithm suitable for detecting
          misstated congestion in a large aggregate (e.g. no per-flow
          state).

   The possibility exists that these specifications over-constrain the
   ConEx design, and cannot be fully satisfied.  An important part of
   the evaluation of any particular design will be a thorough inventory
   of all ways in which it might fail to satisfy these specifications.

3.  Encoding Congestion Exposure

   Most protocol specifications start with a description of packet
   formats and codepoints with their associated meanings.  This document
   does not: It is already known that choosing the encoding for ConEx is
   likely to entail some engineering compromises that have the potential
   to reduce the protocol's usefulness in some settings.  For instance
   the experimental ConEx encoding chosen for IPv6
   [I-D.ietf-conex-destopt] had to make compromises on tunnelling.
   Rather than making these engineering choices prematurely, this
   document sidesteps the encoding problem by making it abstract.  It
   describes several different representations of ConEx Signals, none of
   which are specified to the level of specific bits or code points.

   The goal of this approach is to be as complete as possible for
   discovering the potential usage and capabilities of the ConEx
   protocol, so we have some hope of making optimal design decisions
   when choosing the encoding.  Even if experiments reveal particular
   problems due to the encoding, this document will still serve as
   a reference model.

3.1.  Naive Encoding

   For tutorial purposes, it is helpful to describe a naive encoding of
   the ConEx protocol for TCP and similar protocols: set a bit (not
   specified here) in the IP header on each retransmission and on each
   ECN signaled window reduction.  Network devices along the forward
   path can see this bit and act on it.  For example any device along
   the path might limit the rate of all traffic if the rate of marked
   (congested) packets exceeds a threshold.
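
   The naive encoding can be sketched in a few lines of Python.  The
   names below (Packet, send, NaivePolicer) and the per-interval
   counting are hypothetical; in particular, no IP header bit is
   assigned by this document, so conex_bit merely stands in for it.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size: int
    conex_bit: bool = False  # stand-in for the unspecified IP header bit


def send(size, is_retransmission=False, ecn_window_reduction=False):
    """Sender side: set the bit on each retransmission and on each
    ECN-signaled window reduction."""
    return Packet(size, conex_bit=is_retransmission or ecn_window_reduction)


class NaivePolicer:
    """Device on the forward path that watches the rate of marked packets."""

    def __init__(self, max_marked_per_interval):
        self.max_marked = max_marked_per_interval
        self.marked = 0

    def observe(self, pkt):
        if pkt.conex_bit:
            self.marked += 1

    def should_limit(self):
        # Limit traffic once marked packets in this interval exceed the threshold.
        return self.marked > self.max_marked

    def new_interval(self):
        self.marked = 0
```

   Nothing in this sketch stops a sender from simply never setting the
   bit, so the device has to trust the sender.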

   This simple encoding is sufficient to illustrate many of the benefits
   envisioned for ConEx.  At first glance it looks like it might
   motivate people to deploy and use it.  It is a one line code change
   that a small number of OS developers and content providers could
   unilaterally deploy across a significant fraction of all Internet
   traffic.  However, this encoding does not support auditing so it
   would also motivate users and/or applications to misrepresent the
   congestion that they are causing [RFC3514].  As a consequence the
   naive encoding is not likely to be trusted and thus creates its own
   disincentives for deployment.

   Nonetheless, this naive encoding does present a clear mental model of
   how the ConEx protocol might function under various uses.  It is
   useful for thought experiments where it can be stipulated that all
   participants are honest and it does illustrate some of the incentives
   that might be introduced by ConEx.

3.2.  Null Encoding

   In limited contexts it is possible to implement ConEx-like functions
   without any signals at all by measuring rest-of-path congestion
   directly from TCP headers.  The algorithm is to keep at least one RTT
   of past TCP headers and match each new header against the history
   to count duplicate data.
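
   A rough Python sketch of this algorithm for a single TCP flow is
   given below.  It rests on simplifying assumptions that are not in
   the text above: retransmissions reuse the original segment
   boundaries, sequence numbers do not wrap within the history window,
   and the caller supplies arrival times.  The class name is
   hypothetical.

```python
from collections import deque


class DuplicateDataCounter:
    """Counts retransmitted bytes seen in the headers of one TCP flow."""

    def __init__(self, history_s):
        self.history_s = history_s   # should cover at least one RTT
        self.seen = deque()          # (arrival_time, seq, length), oldest first
        self.duplicate_bytes = 0

    def observe(self, now, seq, length):
        # Forget headers that have aged out of the history window.
        while self.seen and now - self.seen[0][0] > self.history_s:
            self.seen.popleft()
        # A header matching one already in the history carries duplicate data.
        if any(s == seq and l == length for _, s, l in self.seen):
            self.duplicate_bytes += length
        self.seen.append((now, seq, length))
```

   The linear scan keeps the sketch short; a device handling many
   flows would more likely index the history by sequence number.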

   This could implement many ConEx policies, without any explicit
   protocol.  It is fairly easy to implement, at least at low rate (e.g.
   in a software based edge router).  However, it would only be useful
   in cases where the network operator can see the TCP headers.  This is
   currently (2012) the vast majority of traffic because UDP, IPsec and
   VPN tunnels are used far less than SSL or TLS over TCP/IP, which do
   not hide TCP sequence numbers from network devices.  However, anyone
   specifically intending to avoid the attention of a congestion policy
   device would only have to hide their TCP headers from the network
   operator (e.g. by using a VPN tunnel).

3.3.  ECN Based Encoding

   The re-ECN specification [I-D.briscoe-tsvwg-re-ecn-tcp] presents an
   IPv4 implementation of ConEx that was tightly integrated with ECN
   encoding in order to fit into the IPv4 header.  ConEx and ECN are
   orthogonal signals in the sense that any individual packet may need
   to represent any one of the 4 possible combinations of signal values.
   Ideally their encoding should be entirely independent.  However,
   given the limited number of header bits and/or code points, these
   signals had to partially share code points.

   The central theme of the re-ECN work was an audit mechanism that
   provides sufficient disincentives against misrepresenting congestion
   [I-D.briscoe-tsvwg-re-ecn-motiv].  It is analyzed extensively in
   Briscoe's PhD dissertation [Refb-dis].

Mathis & Briscoe       Expires September 13, 2012              [Page 11]
Internet-Draft    ConEx Concepts and Abstract Mechanism       March 2012

   Re-ECN is an example of one chosen set of compromises attempting to
   meet the requirements of Section 2.  The present document takes a
   step back, aiming to state the ideal requirements in order to allow
   the Internet community to assess whether different compromises might
   be better.

   The problem with Re-ECN is that it requires receivers to be ECN-
   enabled, in addition to requiring sender changes.  Newer encodings
   overcome this problem by being able to represent both loss and ECN
   based congestion, and assuming that both signals must be supported
   indefinitely.

   For a tutorial background on re-ECN motivation and techniques, see
   [Re-fb, FairerFaster].

3.4.  Independent Bits

   This encoding involves flag bits, each of which the sender can set
   independently to indicate to the network one of the following four
   signals:
   ConEx (Not-ConEx)  The transport is (or is not) using ConEx with this
      packet (the protocol MUST be arranged so that legacy transport
      senders implicitly send Not-ConEx)
   Re-Echo-Loss (Not-Re-Echo-Loss)  The transport has (or has not)
      experienced a loss
   Re-Echo-ECN (Not-Re-Echo-ECN)  The transport has (or has not)
      experienced ECN-signaled congestion
   Credit (Not-Credit)  The transport is (or is not) building up
      congestion credit (see Section 4.5 on the audit function)

   This encoding does not imply any exclusion property among the
   signals.  Multiple types of congestion (ECN, loss) can be signalled
   on the same ACK.  As long as the packets in a flow have uniform
   sizes, it does not matter whether the units of congestion are packets
   or bytes.  However, if an application sends very irregular packet
   sizes, it may be necessary for the sender to mark multiple packets to
   avoid being in technical violation of the audit function.
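
   One way to picture this encoding is as a small set of flag bits.  The
   Python sketch below is illustrative only: the bit positions, the
   names ConExFlags and describe(), and the choice to ignore the other
   bits on Not-ConEx packets are all assumptions, not part of any ConEx
   wire format.

```python
from enum import IntFlag


class ConExFlags(IntFlag):
    """Four independent ConEx flag bits (bit positions are illustrative)."""
    CONEX = 0b0001         # clear means Not-ConEx, as a legacy sender would send
    RE_ECHO_LOSS = 0b0010
    RE_ECHO_ECN = 0b0100
    CREDIT = 0b1000


def describe(flags):
    """List the signals carried by one packet's flag bits."""
    if not flags & ConExFlags.CONEX:
        # Assumption for this sketch: the other bits are ignored on
        # packets that are not marked as ConEx.
        return ["Not-ConEx"]
    names = ["ConEx"]
    for bit, name in (
        (ConExFlags.RE_ECHO_LOSS, "Re-Echo-Loss"),
        (ConExFlags.RE_ECHO_ECN, "Re-Echo-ECN"),
        (ConExFlags.CREDIT, "Credit"),
    ):
        if flags & bit:
            names.append(name)
    return names
```

   Because the bits are independent, a single packet can carry, for
   example, both Re-Echo-Loss and Re-Echo-ECN.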

3.5.  Codepoint Encoding

   This encoding involves signaling one of the following five
   codepoints:

   ENUM {Not-ConEx, ConEx-Not-Marked, Re-Echo-Loss, Re-Echo-ECN, Credit}

   Each named codepoint has the same meaning as in the encoding using
   independent bits in the previous section.  The use of any one
   codepoint implies the negative of all the others.

   Inherently, the semantics of most of the enumerated codepoints are
   mutually exclusive.  'Credit' is the only one that might need to be
   used in combination with either Re-Echo-Loss or Re-Echo-ECN, but even
   that requirement is questionable.  It must not be forgotten that the
   enumerated encoding loses the flexibility to signal these two
   combinations, whereas the encoding with four independent bits is not
   so limited.  Alternatively two extra codepoints could be assigned to
   these two combinations of semantics.  The comment in the previous
   section about units also applies.

3.6.  Units Implied by an Encoding

   The following comments apply generally to all the other encodings.

   Congestion can be due to exhaustion of bit-carrying capacity, or
   exhaustion of packet processing power.  When a packet is discarded or
   marked to indicate congestion, there is no easy way to know whether
   the lost or marked packet signifies bit-congestion or packet-
   congestion.  The above ConEx encodings that rely on marking packets
   suffer from the same ambiguity.

   This problem is most acute when audit needs to check that one count
   of markings matches another.  For example if there are ConEx markings
   on three large (1500B) packets, is that sufficient to match the loss
   of 5 small (60B) packets?  If a packet-marking is defined to mean all
   the bytes in the packet are marked, then we have 4500B of ConEx
   marked data against 300B of lost data, which is easily sufficient.
   If instead we are counting packets, then we have 3 ConEx packets
   against 5 lost packets, which is not sufficient.  This problem will
   not arise when all the packets in a flow are the same size, but a
   choice needs to be made for flows in which packet sizes vary.
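
   The arithmetic of this example can be checked with a short Python
   sketch.  The function name covers() and its "units" parameter are
   hypothetical and exist only to contrast the two interpretations.

```python
def covers(conex_sizes, lost_sizes, units):
    """Return True if the ConEx-marked packets cover the lost packets.

    Both arguments are lists of packet sizes in bytes; 'units' selects
    whether markings are compared in bytes or in packets.
    """
    if units == "bytes":
        return sum(conex_sizes) >= sum(lost_sizes)
    if units == "packets":
        return len(conex_sizes) >= len(lost_sizes)
    raise ValueError("units must be 'bytes' or 'packets'")


marked = [1500, 1500, 1500]      # three large ConEx-marked packets
lost = [60, 60, 60, 60, 60]      # five small lost packets
in_bytes = covers(marked, lost, "bytes")      # 4500B against 300B
in_packets = covers(marked, lost, "packets")  # 3 packets against 5
```

   The same traffic passes the check in bytes and fails it in packets,
   which is the ambiguity an encoding has to resolve.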

   We could require that a ConEx encoding specifies whether ConEx
   markings are in units of bytes or packets.  But the problem is deeper
   than that: we do not even know whether congestion signals themselves
   (loss & ECN) are in units of bytes or packets.

   Therefore a ConEx encoding SHOULD specify whether it assumes units of
   bytes or packets for both ConEx markings and for congestion
   indications.

   [I-D.ietf-tsvwg-byte-pkt-congest] advises that congestion indications
   SHOULD be interpreted in units of bytes when responding to
   congestion, at least on today's Internet.  In any TCP implementation
   this is simple to achieve for varying size packets, given TCP SACK
   tracks losses in bytes.

   For example, to implement ConEx in bytes, the sender maintains a
   counter of outstanding bytes to be ConEx-marked.  When the SACK
   options report the size of a loss, this is added to the counter, and
   whenever the counter is positive the next data packet is ConEx-marked
   and its size subtracted from the counter.  Then, if one 1500B packet
   is lost, even if subsequent packets to be sent are all 600B, the
   sender will compensate by ConEx-marking enough small packets.  In
   this case, the sender will ConEx-mark the next three 600B packets
   before the counter goes negative (1500 - 3*600 = -300), which
   indicates that it has sent sufficient ConEx marked small packets to
   compensate for the lost large packet.  It will hold over the negative
   remainder towards the next loss.  As long as the remainder is kept
   negative, the ConEx markings will be on the safe side for audit
   purposes.
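
   The counter described above can be sketched in a few lines of Python.
   The class and method names are assumptions for illustration; a real
   sender would drive on_loss() from its SACK processing.

```python
class ByteConExMarker:
    """Tracks outstanding bytes to be ConEx-marked (hypothetical helper)."""

    def __init__(self):
        # Bytes still owed as ConEx markings.  This may go negative,
        # in which case the surplus is held over towards the next loss.
        self.outstanding = 0

    def on_loss(self, lost_bytes):
        """Add the size of a reported loss to the counter."""
        self.outstanding += lost_bytes

    def on_send(self, packet_bytes):
        """Return True if the outgoing packet should be ConEx-marked."""
        if self.outstanding > 0:
            self.outstanding -= packet_bytes
            return True
        return False


marker = ByteConExMarker()
marker.on_loss(1500)                              # one 1500B packet lost
marks = [marker.on_send(600) for _ in range(5)]   # five 600B packets sent
```

   With these inputs the first three 600B packets are marked and the
   counter ends at -300, matching the worked figures in the text.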

   With TCP-ECN the sender knows the size in bytes of packets going out,
   but ECN feedback is in units of packets not bytes.  In some TCP
   implementations, ECN markings are easy to convert to marked bytes,
   while in others it requires significant work.  Therefore even if a
   ConEx encoding specifies that markings should be interpreted in
   bytes, it SHOULD allow implementers some leeway to approximate.
   Experiments with these approximations will determine whether they are
   sufficient for different patterns of packet size variations.

   If an encoding is specified in units of bytes, the encoding SHOULD
   also specify which headers to include in the size of a packet.  Bit-
   congestion is caused by all the bits transmitted with packets,
   including lower layer frame headers, trailers etc.  However, a
   transport endpoint cannot know the size of the frame header on a
   packet when it caused congestion at some other link in the Internet,
   or what size frame header will be used at the audit function.
   Therefore, it will be practical to define the size of a packet as
   including the layer 3 header that encapsulates the transport header
   associated with the ConEx transport sender, but not any more lower
   layer headers, nor any tunnel headers (which a transport is unlikely
   to be aware of anyway, because they will already have been stripped
   before the transport sees the segment).

   It is appropriate to defer the definition of units to the (non-
   abstract) encoding specification, because this choice will need to be
   made in normative language, and the present document is only
   informative.  It may seem that this could lead to interoperability
   problems if more than one encoding is specified.  However, one
   encoding is unlikely to have to interact with another: the
   interactions between ConEx implementations in senders, policy devices
   and audit devices can only happen in the context of one encoding on
   the wire.

4.  Congestion Exposure Components

   The components shown in Figure 1 are described in more detail below.

4.1.  Network Devices (Not modified)

   Congestion signals originate from network devices as they do today.
   A congested router, switch or other network device can discard or ECN
   mark packets when it is congested.

4.2.  Modified Senders

   The sending transport needs to be modified to send Congestion
   Exposure Signals in response to congestion feedback signals (for
   example, see [I-D.conex-tcp-mods]).  We want to permit ConEx without
   ECN (e.g. if the receiver does not support ECN).  However, we want to
   encourage a ConEx sender to at least attempt to negotiate ECN,
   because it is believed that ConEx without ECN is harder to audit, and
   thus potentially exposed to fraud.  Since honest users have the
   potential to benefit from stronger mechanisms to manage traffic they
   have an incentive to deploy ConEx and ECN together.  This incentive
   is not sufficient to prevent a dishonest user from constructing (or
   configuring) a sender that enables ConEx after choosing not to
   negotiate ECN, but it should be sufficient to prevent this from being
   the sustained default case for any significant pool of users.

   Permitting ConEx without ECN is necessary to facilitate bootstrapping
   other parts of ConEx deployment.

4.3.  Receivers (Optionally Modified)

   Any receiving transport may already feed back sufficiently useful
   signals to the sender so that it does not need to be altered.

   If the transport receiver does not support ECN, then its native loss
   signaling mechanism (required for compliance with existing congestion
   control standards) will be sufficient for the sender to generate
   ConEx signals.

   A traditional ECN implementation (RFC 3168 for TCP) signals
   congestion no more than once per round trip.  The sender may require
   more precise feedback from the receiver; otherwise it is at risk of
   appearing to understate its ConEx Signals.

   Ideally, ConEx should be added to a transport like TCP without
   mandatory modifications to the receiver.  But an optional
   modification to the receiver could be recommended for precision (see
   [I-D.conex-accurate-ecn]).  This was the approach taken when adding
   re-ECN to TCP [I-D.briscoe-tsvwg-re-ecn-tcp].

4.4.  Policy Devices

   Policy devices are characterised by a need to be configured with a
   policy related to the users or neighboring networks being served.  In
   contrast, the auditing devices described in Section 4.5 primarily
   enforce compliance with the ConEx protocol and do not need to be
   configured with any client-specific policy.

4.4.1.  Congestion Monitoring Devices

   Policy devices can typically be decomposed into two functions: i)
   monitoring the ConEx signal to compare it with a policy, then ii)
   acting in some way on the result.  Various actions might be invoked
   against 'out of contract' traffic, such as policing (see
   Section 4.4.3), re-routing, or downgrading the class of service.

   Alternatively a policy device might not act directly on the traffic,
   but instead report to management systems that are designed to control
   congestion indirectly.  For instance the reports might trigger
   capacity upgrades, invoke penalty clauses in contracts, levy charges
   between networks based on congestion, or merely send warnings to
   clients who are causing excessive congestion.

   Nonetheless, whatever action is invoked, the congestion monitoring
   function will always be a necessary part of any policy device.

4.4.2.  Rest-of-Path Congestion Monitoring

   ConEx signals indicate the level of congestion along a whole path
   from source to destination.  In contrast when ECN signals are
   monitored in the middle of a network, they indicate the level of
   congestion experienced so far on the path.

   If a monitor in the middle of a network (e.g. at a network border)
   measures both of these signals, it can subtract the level of ECN
   (path so far) from the level of ConEx (whole path) to derive a
   measure of the congestion that packets are likely to experience
   between the monitoring point and their destination (rest-of-path
   congestion).
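
   The subtraction can be written out as a minimal Python sketch.  The
   function name, the choice of bytes as the unit and the clamping of
   negative results to zero are assumptions made for the example.

```python
def rest_of_path_congestion(conex_bytes, ecn_bytes, total_bytes):
    """Estimate rest-of-path congestion as a fraction of observed bytes.

    conex_bytes: bytes in packets carrying ConEx signals (whole path)
    ecn_bytes:   bytes in packets already ECN-marked (path so far)
    total_bytes: all bytes observed at the monitoring point
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Assumption: a negative difference (e.g. from measurement noise or
    # feedback delay) is reported as zero rather than as negative
    # congestion.
    return max(0, conex_bytes - ecn_bytes) / total_bytes
```

   For instance, 30 ConEx-marked bytes and 10 ECN-marked bytes in every
   1000 observed would suggest roughly 2% congestion still to come
   between the monitor and the destination.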

   It will often be preferable for policy devices to monitor rest-of-
   path congestion if they can, because it is a measure of the
   downstream congestion that the policy device can directly influence
   by controlling the traffic passing through it.

4.4.3.  Congestion Policers

   A congestion policer can be implemented in a very similar way to a
   bit-rate policer, but its effect can be focused solely on traffic
   causing congestion downstream, which ConEx signals make visible.
   Without ConEx signals, the only way to mitigate congestion is to
   blindly limit traffic bit-rate, on the assumption that high bit-rate
   is more likely to cause congestion.

   A congestion policer monitors all ConEx traffic entering a network,
   or some identifiable subset.  Using ConEx signals (and preferably
   subtracting ECN signals to yield rest-of-path congestion), it
   measures the amount of congestion that this traffic is contributing
   somewhere downstream.  If this exceeds a policy-configured
   'congestion-bit-rate' the congestion policer can limit all the
   monitored ConEx traffic.

   A congestion policer can be implemented by a simple token bucket.
   But unlike a bit-rate policer, it removes a token only when it
   forwards a packet that is ConEx-Marked, effectively treating Not-
   ConEx-Marked packets as invisible.  Consequently, because tokens give
   the right to send congested bits, the fill-rate of the token bucket
   will represent the allowed congestion-bit-rate.  This should provide
   sufficient traffic management without having to additionally
   constrain the straight bit-rate at all.  See [CongPol] for details.
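
   A token bucket of this kind might look like the following Python
   sketch.  It is not taken from [CongPol]; the class name, the use of
   bytes per second for the fill rate, and the decision to let the
   bucket go negative and merely report the result are assumptions.

```python
class CongestionPolicer:
    """Token bucket drained only by ConEx-marked packets (illustrative)."""

    def __init__(self, fill_rate, depth):
        self.fill_rate = fill_rate  # allowed congestion rate, in bytes/s here
        self.depth = depth          # maximum burst of congestion, in bytes
        self.tokens = depth
        self.last = 0.0

    def forward(self, now, size, conex_marked):
        """Account for one forwarded packet.

        Returns True while the monitored traffic is within its
        congestion allowance.  What to do when it is not (throttle,
        re-route, downgrade) is left to the caller.
        """
        # Refill for the time elapsed since the previous packet.
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.fill_rate)
        self.last = now
        # Not-ConEx-Marked packets are invisible to the bucket.
        if conex_marked:
            self.tokens -= size
        return self.tokens >= 0
```

   Unmarked packets never drain the bucket, so a flow that causes no
   congestion is not constrained however high its bit-rate.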

   Note that the policing action is to introduce a throttle (delaying
   through traffic) immediately upstream of the congestion policer.
   This throttle is likely to include a queue with its own AQM, which
   potentially increases the whole-path congestion in order to reduce
   the rest-of-path congestion.

4.5.  Audit

   The most critical aspect of ConEx is the capability to support robust
   auditing.  It can be assumed that there will be an intrinsic
   motivation for users to understate the congestion that they are
   causing.  Without strong audit functions the ConEx signal is likely
   to become inaccurate to the point of being useless.  The most
   important feature of an encoding design is likely to be robustness of
   the auditing it supports.

   The general approach is to compare the volume of ConEx signals to
   direct measures of actual congestion volume.  The technique described
   in Section 4.5.1 can be used to guarantee that this is a strict
   bound: if the actual congestion exceeds the ConEx signal, then some
   congestion was understated and some sanction should be applied to the
   traffic.  Although sanctions are beyond the scope of this document,
   an example sanction might be to throttle the traffic immediately
   upstream of the auditor to prevent the user from getting any
   advantage by understating congestion.  Such a throttle would likely
   include some combination of delaying, ECN marking or dropping
   traffic.

   This document does not preclude "statistical auditing", where the
   audit function indicates some sort of probability that a particular
   flow is underreporting congestion.  However, this design choice
   greatly complicates designing an appropriate sanction, because of the
   possibility of a false hit.

   To facilitate ConEx deployment, not-ConEx traffic might be treated as
   a special case of understating congestion, but with a different
   sanction.  For example an ISP might apply a data rate cap to not-
   ConEx traffic, while applying a congestion volume cap to ConEx marked
   traffic.  With suitable parameters this is likely to give ConEx
   marked traffic a much larger share of the network during off peak
   hours.  (Note that in this example the ConEx auditor is also acting
   as a ConEx policy device.)  Another option to facilitate deployment
   is for the auditor to act as a ConEx proxy, and insert ConEx signals
   in packets on behalf of the sender.  Such a device is outside of the
   scope of this document, but nonetheless potentially useful for
   supporting ConEx for legacy systems.

   Auditing can be distributed and redundant.  One flow may be audited
   in multiple places, using multiple techniques.  Some audit techniques
   do not require any per flow state and can be applied to aggregate
   traffic.  These might be able to detect the presence of understated
   congestion at large scale and support recursively hunting for
   individual flows that are understating their congestion.  Even at
   large scales, flows can be randomly selected for individual auditing.

   The auditing function should be able to trigger sufficient sanction
   to discourage understating congestion [Salvatori05].  This
   potentially requires designing the sanction in concert with the
   policy functions, even though they might be implemented in different
   parts of the network.  Note that in the future it might prove to be
   desirable to provide advice on uniformly implementing sanctions,
   because insufficient sanctions impair the ability to implement policy
   elsewhere in the network.

   Some of the audit algorithms require per flow state.  This cost is
   expected to be tolerable, because these techniques are most apropos
   near the edges of the network, where traffic is generally much less
   aggregated, so the state need not overwhelm any one device.  Sampling
   techniques can also be used to bound the total auditing memory
   footprint, although the implementer must be wary of "identifier white
   washing" attacks that hide cheating connections among chaff.

   At some point in the future, when ConEx is built into all transport
   protocol implementations, it may not be necessary to audit all
   traffic all the time.  Auditing might be needed only to identify
   rogue actors and prevent them from gaining any long term advantage by
   cheating.

   A ConEx auditor might use one of the following techniques:
   ECN Auditing:  Since the volume of ECN marks rises monotonically
      along a path, ECN auditing is most accurate when located near the
      transport receiver.  For this reason ECN should be monitored
      downstream of the predominant bottleneck.

      Note that this technique requires no per flow state.
   TCP-specific loss auditing:  For non-encrypted standard TCP traffic
      on a single path, an auditor could measure losses by detecting
      retransmissions, which appear as duplicate sequence numbers
      upstream of the loss and out-of-order data downstream of the
      loss.  Since some reordering is present in the Internet, such a
      loss estimator would be most accurate near the sender.
   Predominant bottleneck loss auditing:  For networks designed so that
      losses predominantly occur due to Active Queue Management under
      the control of one IP-aware node on the path, the auditor could be
      located at this bottleneck.  It could simply compare ConEx Signals
      with actual local packet discards.  This is a good model for most
      consumer access networks where audit accuracy could well be
      sufficient even if losses occasionally occur at other nodes in the
      network, such as border gateways.

      Although the auditor at the predominant bottleneck would not be
      able to count losses at other nodes, transports would not know
      where losses were occurring either.  Therefore a transport would
      not know which losses it could cheat and which ones it couldn't
      without getting caught.

      Note that this technique requires no per flow state.
   Generic loss auditing:  For congestion signaled by loss, totally
      accurate auditing is not believed to be possible in the general
      case, because it involves a network node detecting the absence of
      some packets, when it cannot necessarily identify retransmissions
      or missing packets.  Furthermore the missing packet might simply
      be taking a different route.

      It is for this reason that it is desirable to motivate the
      deployment of ECN, even though ECN is not strictly required for
      ConEx.
   In addition, other audit techniques may be identified in the future.

4.5.1.  Using Credit to Simplify Audit

   At the audit function, there will be an inherent delay of at least
   one round trip between a congestion signal and the subsequent ConEx
   signal it triggers, as shown in Figure 1.  However, the audit
   function cannot be expected to wait for a round trip to check that
   one signal balances the other, because that requires excessive state
   and the auditor can't easily determine the RTT of each transport.

   The simplest mechanism to compensate for the round trip delay between
   the signals is to include a "credit" signal to cover the yet to be
   observed congestion that might occur during this delay.  The
   transport signals sufficient credit in advance to cover congestion
   expected during its feedback delay.  Then, the audit function does
   not need to make allowance for round trip delays that it cannot
   quantify.  This design choice correctly makes the transport
   responsible for both minimizing feedback delay and for the risk that
   packets in flight will cause congestion to others before the source
   can react.

   For example, imagine the audit function keeps a running account of
   the balance between actual congestion signals (loss or ECN), which it
   counts as negative, and ConEx signals, which it counts as positive.
   Having made the transport responsible for round trip delays, it will
   be expected to have pre-loaded the audit function with some credit at
   the start.  Therefore, if ever the balance does go negative, the
   audit function can immediately start punishing a flow, without any
   grace period.
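
   The running account in this example can be sketched in Python as
   follows.  The class and method names and the use of bytes as the
   unit are assumptions; the sanction itself is out of scope and is
   represented only by a boolean result.

```python
class AuditBalance:
    """Running balance of ConEx signals against actual congestion."""

    def __init__(self):
        self.balance = 0  # in bytes for this sketch

    def on_conex_signal(self, nbytes):
        """Re-Echo-Loss, Re-Echo-ECN and Credit all count as positive."""
        self.balance += nbytes

    def on_congestion(self, nbytes):
        """Count an observed loss or ECN mark as negative.

        Returns True if the balance has gone negative, meaning the
        flow can be sanctioned immediately, without a grace period.
        """
        self.balance -= nbytes
        return self.balance < 0


audit = AuditBalance()
audit.on_conex_signal(3000)           # credit pre-loaded by the transport
first = audit.on_congestion(1500)     # covered by credit
second = audit.on_congestion(1500)    # balance reaches zero, still covered
third = audit.on_congestion(1500)     # balance goes negative
```

   Only the third congestion event drives the balance negative, because
   the transport had pre-loaded enough credit to cover the first two.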

   Note that although per flow state might be required to count losses,
   this balance requirement applies both to individual flows and to flow
   aggregates.  For example with the "predominant bottleneck" approach
   in the previous section (which does not require per flow state), an
   auditor can detect understated congestion merely by comparing the
   total volume of ConEx signals (Re-Echo-Loss, Re-Echo-ECN and Credit)
   to the sum of the total volumes of AQM drops and ECN marks.

   A specific encoding SHOULD describe the tradeoffs of three
   interrelated design decisions: whether the audit is strict or
   statistical; how to recommend estimating the initial credit per flow;
   and to what extent the sanction needs to avoid over-penalizing flows
   that suffer false audit failures.

5.  Support for Incremental Deployment

   The ConEx abstract protocol described so far is intended to support
   incremental deployment in every possible respect.  For convenience,
   the following list collects together all the features of ConEx that
   support incremental deployment, and points to further information on
   each:
   Packets:  The wire protocol encoding allows each packet to indicate
      whether it is using ConEx or not (see Section 3 on Encoding
      Congestion Exposure).
   Senders:  ConEx requires a modification to the source in order to
      send ConEx packet markings (see Section 4.2).  Although ConEx
      support can be indicated on a packet-by-packet basis, it is likely
      that all the packets in a flow will either consistently support
      ConEx or consistently not.  It is also likely that, if the
      implementation of a transport protocol supports ConEx, all the
      packets sent from that host using that protocol will be ConEx
      packets.

      The implementations of some of the transport protocols on a host
      might not support ConEx (e.g. the implementation of DNS over UDP
      might not support ConEx, while perhaps RTP over UDP and TCP will).
      Any non-upgraded transports and non-upgraded hosts will simply
      continue to send regular Not-ConEx packets as always.

      A network operator can create incentives for senders to
      voluntarily reveal ConEx information.  Without ConEx information,
      a network operator tends to have to limit the bit-rate or volume
      from a site more than is necessary, just in case it might congest
      others.  With ConEx information, the operator can solely limit
      congestion-causing traffic, and otherwise allow complete freedom.
      This greater freedom acts as an inducement for the source to
      volunteer ConEx information.
   Receivers:  A ConEx source should be able to work without a modified
      receiver.  However, without sufficiently precise congestion
      feedback from the receiver, the source may have to conservatively
      send extra Re-Echo markings in order to avoid understating
      congestion.  The need for more precise receiver feedback is not
      exclusive to ConEx, for instance Data Centre TCP (DCTCP [DCTCP])
      uses precise feedback to good effect.  Nonetheless, if a receiver
      offers precise feedback, it will be best if ConEx uses it (see
      Section 4.3).
   Proxies:  Although it was stated above that ConEx requires a
      modification to the source, ConEx signals could theoretically be
      introduced by a proxy for the source, as long as it can intercept
      feedback from the receiver.  Similarly, more precise feedback
      could theoretically be provided by a proxy for the receiver rather
      than modifying the receiver itself.
   Queues:  No modification to queues is needed for ConEx.

      However, once ConEx is deployed, it is possible that a queue
      implementation could take advantage of the ConEx information in
      packets.  For instance, it has been suggested
      [I-D.briscoe-tsvwg-re-ecn-tcp] that a queue would be more robust
      against flooding if it preferentially discarded Not-ConEx packets,
      then Not-Marked ConEx packets.

      A ConEx sender re-echoes congestion whether the queues signaling
      congestion are ECN-enabled or not.  Nonetheless, auditing works
      best if most congestion is indicated by ECN rather than loss (see
      Section 2).  Also, monitoring rest-of-path congestion is not
      accurate if there are congested non-ECN queues upstream of the
      monitoring point (Section 4.4.2).
   Networks:  If a subset of traffic sources (or proxies) use ConEx
      signals to reveal congestion in the internetwork layer, a network
      operator can choose (or not) to use this information for traffic
      management.  As long as the end-to-end ConEx signals are present,
      each network can unilaterally choose to use them--independently of
      whether other networks do.

      ConEx packets may safely traverse a network that ignores them.
      Networks MUST NOT change ConEx packets to Not-ConEx.  If
      necessary, endpoints SHOULD be able to detect if a network is
      removing ConEx signals.

      An operator can deploy policy devices (Section 4.4) wherever
      traffic enters its network, in order to monitor the downstream
      congestion that incoming traffic contributes to, and control it if
      necessary.  See [I-D.ietf-conex-concepts-uses] for further
      discussion of deployment incentives for networks and scenarios
      where some networks use ConEx-based policy devices and others
      don't.

      An operator can deploy audit devices (Section 4.5) unilaterally
      within its own network to verify that traffic sources are not
      understating ConEx information.  From the viewpoint of one network
      operator (say N_a), it only cares that the level of ConEx
      signaling is sufficient to cover congestion in its own network.
      If traffic continues into a congested downstream network (say
      N_b), it is of no concern to the first network (N_a) if the end-
      to-end ConEx signaling is insufficient to cover the congestion in
      N_b as well.  This is N_b's concern, and N_b can both detect such
      anomalous traffic and deal with it using ConEx-based policy
      devices (Section 4.4).

6.  IANA Considerations

   This memo includes no request to IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

7.  Security Considerations

   Significant parts of this whole document are about auditability of
   ConEx Signals, in particular Section 4.5.

8.  Acknowledgements

   This document was improved by review comments from Toby Moncaster,
   Nandita Dukkipati, Mirja Kuehlewind, Caitlin Bestler and John Leslie.

9.  Comments Solicited

   Comments and questions are encouraged and very welcome.  They can be
   addressed to the IETF Congestion Exposure (ConEx) working group
   mailing list <conex@ietf.org>, and/or to the authors.

10.  References

10.1.  Normative References

   [RFC2119]                          Bradner, S., "Key words for use in
                                      RFCs to Indicate Requirement
                                      Levels", BCP 14, RFC 2119,
                                      March 1997.

10.2.  Informative References

   [CheapPseud]                       Friedman, E. and P. Resnick, "The
                                      Social Cost of Cheap Pseudonyms",
                                      Journal of Economics and
                                      Management Strategy 10(2)173--199,
                                      1998.

   [CongPol]                          Jacquet, A., Briscoe, B., and T.
                                      Moncaster, "Policing Freedom to
                                      Use the Internet Resource Pool",
                                      Proc ACM Workshop on Re-
                                      Architecting the Internet
                                      (ReArch'08), December 2008,
                                      <http://bobbriscoe.net/projects/
                                      refb/#polfree>.

   [DCTCP]                            Alizadeh, M., Greenberg, A.,
                                      Maltz, D., Padhye, J., Patel, P.,
                                      Prabhakar, B., Sengupta, S., and
                                      M. Sridharan, "Data Center TCP
                                      (DCTCP)", ACM SIGCOMM
                                      CCR 40(4)63--74, October 2010,
                                      <http://portal.acm.org/citation.cfm?id=1851192>.

   [Evol_cc]                          Gibbens, R. and F. Kelly,
                                      "Resource pricing and the
                                      evolution of congestion control",
                                      Automatica 35(12)1969--1985,
                                      December 1999,
                                      <http://www.statslab.cam.ac.uk/~frank/evol.html>.

   [FairerFaster]                     Briscoe, B., "A Fairer, Faster
                                      Internet Protocol", IEEE
                                      Spectrum Dec 2008:38--43,
                                      December 2008,
                                      <http://bobbriscoe.net/projects/refb/#fairfastip>.

   [I-D.briscoe-tsvwg-re-ecn-motiv]   Briscoe, B., Jacquet, A.,
                                      Moncaster, T., and A. Smith, "Re-
                                      ECN: A Framework for adding
                                      Congestion Accountability to
                                      TCP/IP",
                                      draft-briscoe-tsvwg-re-ecn-tcp-motivation-02
                                      (work in progress), October 2010.

   [I-D.briscoe-tsvwg-re-ecn-tcp]     Briscoe, B., Jacquet, A.,
                                      Moncaster, T., and A. Smith, "Re-
                                      ECN: Adding Accountability for
                                      Causing Congestion to TCP/IP",
                                      draft-briscoe-tsvwg-re-ecn-tcp-09
                                      (work in progress), October 2010.

   [I-D.conex-accurate-ecn]           Kuehlewind, M. and R.
                                      Scheffenegger, "Accurate ECN
                                      Feedback in TCP",
                                      draft-kuehlewind-conex-accurate-ecn-01
                                      (work in progress), October 2011.

   [I-D.conex-tcp-mods]               Kuehlewind, M. and R.
                                      Scheffenegger, "TCP modifications
                                      for Congestion Exposure",
                                      draft-kuehlewind-conex-tcp-modifications-00
                                      (work in progress), July 2011.

   [I-D.ietf-conex-concepts-uses]     Briscoe, B., Woundy, R., and A.
                                      Cooper, "ConEx Concepts and Use
                                      Cases",
                                      draft-ietf-conex-concepts-uses-03
                                      (work in progress), October 2011.

   [I-D.ietf-conex-destopt]           Krishnan, S., Kuehlewind, M., and
                                      C. Ucendo, "IPv6 Destination
                                      Option for Conex",
                                      draft-ietf-conex-destopt-01 (work
                                      in progress), October 2011.

   [I-D.ietf-ledbat-congestion]       Shalunov, S., Hazel, G., and J.
                                      Iyengar, "Low Extra Delay
                                      Background Transport (LEDBAT)",
                                      draft-ietf-ledbat-congestion-03
                                      (work in progress), October 2010.

   [I-D.ietf-tsvwg-byte-pkt-congest]  Briscoe, B. and J. Manner, "Byte
                                      and Packet Congestion
                                      Notification",
                                      draft-ietf-tsvwg-byte-pkt-congest-03
                                      (work in progress), October 2010.

   [I-D.sridharan-tcpm-ctcp]          Sridharan, M., Tan, K., Bansal,
                                      D., and D. Thaler, "Compound TCP:
                                      A New TCP Congestion Control for
                                      High-Speed and Long Distance
                                      Networks",
                                      draft-sridharan-tcpm-ctcp-02 (work
                                      in progress), November 2008.

   [IntDesPrinciples]                 Clark, D., "The Design Philosophy
                                      of the DARPA Internet Protocols",
                                      ACM SIGCOMM CCR 18(4)106--114,
                                      August 1988,
                                      <http://www.acm.org/sigcomm/ccr/archive/1995/jan95/ccr-9501-clark.pdf>.

   [RFC0791]                          Postel, J., "Internet Protocol",
                                      STD 5, RFC 791, September 1981.

   [RFC2309]                          Braden, B., Clark, D., Crowcroft,
                                      J., Davie, B., Deering, S.,
                                      Estrin, D., Floyd, S., Jacobson,
                                      V., Minshall, G., Partridge, C.,
                                      Peterson, L., Ramakrishnan, K.,
                                      Shenker, S., Wroclawski, J., and
                                      L. Zhang, "Recommendations on
                                      Queue Management and Congestion
                                      Avoidance in the Internet",
                                      RFC 2309, April 1998.

   [RFC3168]                          Ramakrishnan, K., Floyd, S., and
                                      D. Black, "The Addition of
                                      Explicit Congestion Notification
                                      (ECN) to IP", RFC 3168,
                                      September 2001.

   [RFC3448]                          Handley, M., Floyd, S., Padhye,
                                      J., and J. Widmer, "TCP Friendly
                                      Rate Control (TFRC): Protocol
                                      Specification", RFC 3448,
                                      January 2003.

   [RFC3514]                          Bellovin, S., "The Security Flag
                                      in the IPv4 Header", RFC 3514,
                                      April 1 2003.

   [RFC3540]                          Spring, N., Wetherall, D., and D.
                                      Ely, "Robust Explicit Congestion
                                      Notification (ECN) Signaling with
                                      Nonces", RFC 3540, June 2003.

   [RFC3550]                          Schulzrinne, H., Casner, S.,
                                      Frederick, R., and V. Jacobson,
                                      "RTP: A Transport Protocol for
                                      Real-Time Applications", STD 64,
                                      RFC 3550, July 2003.

   [RFC5670]                          Eardley, P., "Metering and Marking
                                      Behaviour of PCN-Nodes", RFC 5670,
                                      November 2009.

   [RFC5681]                          Allman, M., Paxson, V., and E.
                                      Blanton, "TCP Congestion Control",
                                      RFC 5681, September 2009.

   [Re-fb]                            Briscoe, B., Jacquet, A., Di
                                      Cairano-Gilfedder, C., Salvatori,
                                      A., Soppera, A., and M. Koyabe,
                                      "Policing Congestion Response in
                                      an Internetwork Using Re-
                                      Feedback", ACM SIGCOMM
                                      CCR 35(4)277--288, August 2005,
                                      <http://www.acm.org/sigs/sigcomm/sigcomm2005/techprog.html#session8>.

   [Refb-dis]                         Briscoe, B., "Re-feedback: Freedom
                                      with Accountability for Causing
                                      Congestion in a Connectionless
                                      Internetwork", UCL PhD
                                      Dissertation, 2009,
                                      <http://bobbriscoe.net/projects/refb/#refb-dis>.

   [Salvatori05]                      Salvatori, A., "Closed Loop
                                      Traffic Policing", Politecnico
                                      Torino and Institut Eurecom
                                      Masters Thesis, September 2005.

   [Vegas]                            Brakmo, L. and L. Peterson, "TCP
                                      Vegas: End-to-End Congestion
                                      Avoidance on a Global Internet",
                                      IEEE Journal on Selected Areas in
                                      Communications 13(8)1465--80,
                                      October 1995,
                                      <http://ieeexplore.ieee.org/iel1/49/9740/00464716.pdf?arnumber=464716>.

Authors' Addresses

   Matt Mathis
   Google, Inc
   1600 Amphitheatre Parkway
   Mountain View, California  94043
   USA

   EMail: mattmathis at google.com

   Bob Briscoe
   BT
   B54/77, Adastral Park
   Martlesham Heath
   Ipswich  IP5 3RE
   UK

   Phone: +44 1473 645196
   EMail: bob.briscoe@bt.com
   URI:   http://bobbriscoe.net/
