Data Center TCP (DCTCP): TCP Congestion Control for Data Centers
draft-ietf-tcpm-dctcp-10

Document history

Date Rev. By Action
2017-10-16
10 (System) RFC Editor state changed to AUTH48-DONE from AUTH48
2017-10-05
10 (System) RFC Editor state changed to AUTH48 from RFC-EDITOR
2017-10-02
10 (System) RFC Editor state changed to RFC-EDITOR from EDIT
2017-09-15
10 (System) IANA Action state changed to No IC from In Progress
2017-09-15
10 (System) RFC Editor state changed to EDIT
2017-09-15
10 (System) IESG state changed to RFC Ed Queue from Approved-announcement sent
2017-09-15
10 (System) Announcement was received by RFC Editor
2017-09-15
10 (System) IANA Action state changed to In Progress
2017-09-15
10 Amy Vezza IESG state changed to Approved-announcement sent from Approved-announcement to be sent::Point Raised - writeup needed
2017-09-15
10 Amy Vezza IESG has approved the document
2017-09-15
10 Amy Vezza Closed "Approve" ballot
2017-09-15
10 Amy Vezza Ballot approval text was generated
2017-08-28
10 Lars Eggert New version available: draft-ietf-tcpm-dctcp-10.txt
2017-08-28
10 (System) New version approved
2017-08-28
10 (System) Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian
2017-08-28
10 Lars Eggert Uploaded new revision
2017-07-16
09 Lars Eggert New version available: draft-ietf-tcpm-dctcp-09.txt
2017-07-16
09 (System) New version approved
2017-07-16
09 (System) Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian
2017-07-16
09 Lars Eggert Uploaded new revision
2017-06-29
08 Jean Mahoney Request for Last Call review by GENART Completed: Ready with Nits. Reviewer: Orit Levin.
2017-06-27
08 (System) IANA Review state changed to Version Changed - Review Needed from IANA OK - No Actions Needed
2017-06-27
08 Lars Eggert New version available: draft-ietf-tcpm-dctcp-08.txt
2017-06-27
08 (System) New version approved
2017-06-27
08 (System) Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian
2017-06-27
08 Lars Eggert Uploaded new revision
2017-06-22
07 Cindy Morgan IESG state changed to Approved-announcement to be sent::Point Raised - writeup needed from Waiting for Writeup
2017-06-22
07 Benoît Claise
[Ballot comment]
I have not seen any reply to Joe Clarke's OPS DIR review:

Hello, WG and authors.  I have reviewed rev -07 of the draft-ietf-tcpm-dctcp as
requested by the OPS-DIR.  This review focuses on improving operational aspects
as well as any nits found in the text.

This document is an informational draft that describes Data Center TCP (DCTCP),
a congestion control mechanism for TCP in Data Center environments.

Overall, I believe this document to be ready, with some nits and perhaps small
areas for improved clarity and readability.  First, I'd like to say that I
appreciate the fact that this has been implemented on a number of kernels, and
the authors included real-world implementation results and thoughts.  From an
operational perspective, that is very helpful.  I also appreciated the fact
that there are interoperability challenges, and those were called out in the
document.  My specific comments are below.

There are a lot of abbreviations, variables and other terminology used
throughout this document.  It might be helpful for the reader to have an
expanded terminology section at the top that one can refer to for all of these
things.  Some of the abbreviations are called out in the description of the
algorithm, but not all (e.g., DCTCP.Alpha, CWR, RTT, etc.).

===

Section 3.2:

You refer to DCTCP.Alpha before defining it.  While you refer to Section 3.3
here, the impact of an incorrect Alpha value is not fully appreciated in this
text.  Perhaps this could be changed to reflect the impact the incorrect Alpha
value would have?

===

Section 3.2:

By abbreviating DCTCP.CE as CE in your state machine diagram, it is a bit
confusing as to the difference between CE and DCTCP.CE.  The description of the
state machine above requires the CE codepoint to have a certain value in order
for DCTCP.CE to change.  Perhaps you can use D.CE as an abbreviation to be a
bit clearer here.

===

Section 3.3:

It is not clear if 'g' can be inclusive of 0 and 1.

===

Section 3.3:

You define DCTCP.WindowEnd as the threshold for beginning a new observation
window, but maybe to complement the state variable name, you should define it
as the following:

The TCP sequence number threshold when one observation window ends and another
is to begin; initialized to SND.UNA.

===

Section 3.3:

You state:

Thus, when no bytes sent experienced congestion, DCTCP.Alpha equals
zero, and cwnd is left unchanged

But if I use a value of 1/16 for g, with DCTCP.Alpha initialized to 1 as you
say, I get a value of DCTCP.Alpha == 15/16 when there is no congestion (i.e., M
== 0).
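
The arithmetic in this comment can be checked with a short sketch. It assumes the estimator update has the form DCTCP.Alpha = DCTCP.Alpha * (1 - g) + g * M, which is the form implied by the numbers quoted above. The function names are hypothetical and exist only for this illustration.

```python
from fractions import Fraction


def update_alpha(alpha, g, m):
    """One observation-window update (assumed form): alpha*(1-g) + g*m."""
    return alpha * (1 - g) + g * m


def windows_until_below(threshold, alpha, g):
    """Count congestion-free windows (M == 0) until alpha drops below threshold."""
    n = 0
    while alpha >= threshold:
        alpha = update_alpha(alpha, g, Fraction(0))
        n += 1
    return n


g = Fraction(1, 16)
initial_alpha = Fraction(1)  # initial value cited in the review

# One congestion-free window starting from the initial value.
after_one = update_alpha(initial_alpha, g, Fraction(0))
print(after_one)  # 15/16

# Starting from zero, a congestion-free window leaves the estimate at zero.
print(update_alpha(Fraction(0), g, Fraction(0)))  # 0
```

Under this assumed form, the estimate only approaches zero after many congestion-free windows when it starts at 1, which is the gap the comment points out; `windows_until_below` can be used to see how slowly it decays for a given g.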

===

Section 3.5:

You have an extra space here before the comma:

If SYN , SYN-ACK and RST packets for DCTCP connections have ECT set

This should be:

If SYN, SYN-ACK and RST packets for DCTCP connections have ECT set

===

Section 3.5:

You do not define ECT before using it.

===

Section 4.1:

Can you provide a reference for NewReno?

===

Section 5:

Can you reference or define AQM and RED?
2017-06-22
07 Benoît Claise [Ballot Position Update] New position, No Objection, has been recorded for Benoit Claise
2017-06-21
07 Alia Atlas [Ballot Position Update] New position, No Objection, has been recorded for Alia Atlas
2017-06-21
07 Kathleen Moriarty [Ballot Position Update] New position, No Objection, has been recorded for Kathleen Moriarty
2017-06-21
07 Adam Roach
[Ballot comment]
Given the nature of this mechanism, I would have expected some qualitative analysis of its performance under typical data center conditions, rather than the somewhat vague descriptions of it being an "improvement." If the cited literature contains such numbers, I would suggest (a) specifically citing where such data can be found; and (b) copying a very high-level summary into this document (e.g., something like: "Under typical data center load conditions, intra-center transfers of large (multi-gigabyte) files were improved by approximately 12% over Standard TCP using commodity switches in their default configuration. See [REFERENCE] for details.")

Please expand the following acronyms upon first use;
see https://www.rfc-editor.org/materials/abbrev.expansion.txt for guidance.

- L3 - Layer 3
- ECT - ECN-Capable Transport
- DSCP - Differentiated Services Code Point
- AQM - Active Queue Management
- RED - Random Early Detection
2017-06-21
07 Adam Roach [Ballot Position Update] New position, No Objection, has been recorded for Adam Roach
2017-06-21
07 Ben Campbell
[Ballot comment]
Substantive Comments:

- General: The purpose of this draft is not clear to me. Is the point to document the Microsoft implementation just for people's information? Do you have hopes other people will implement this? As written, this seems like a case of an informational draft defining protocol. That's not necessarily a problem, but it's helpful to put a paragraph near the beginning to describe why this is being published and what expectations people have of the outcome. (If the answer is along the lines of "We'd like people to implement this so we can get more operational experience", then I will wonder why the status was not "experimental".)

-1, last paragraph: I assume this means that all participants need to live in the datacenter, right? That is, no flows where only one end lives in the datacenter? (I think you clarify that later, but it would be helpful to state it here.)

- 3.3: first paragraph: Why not MUST?
-- "The congestion estimator on the sender SHOULD process acceptable ACKs
  as follows:" Why not MUST?


Nits:

- 1: Can you offer a citation for MapReduce?

- 2: The additional text implies that the usage of 2119 keywords here does not quite map to the 2119 definitions.
-- "but even compliant implementations without the measures in sections 4-6 would still only be safe to deploy in controlled environments.":  That seems too important of a statement to be buried in the terminology section.

- 4.1: Citation for NewReno?
2017-06-21
07 Ben Campbell [Ballot Position Update] New position, No Objection, has been recorded for Ben Campbell
2017-06-21
07 Alissa Cooper [Ballot Position Update] New position, No Objection, has been recorded for Alissa Cooper
2017-06-21
07 Alexey Melnikov [Ballot Position Update] New position, No Objection, has been recorded for Alexey Melnikov
2017-06-20
07 Terry Manderson
[Ballot comment]
Thank you for a well constructed document, and (IMHO) a nice approach to the issue. I noticed a few nits (some caught by Alvaro) and others that are either me reading late at night or typographical concerns (such as "If SYN , SYN-ACK" [comma placement] first line of section 3.5 - so please give it a thorough read through)
2017-06-20
07 Terry Manderson [Ballot Position Update] New position, Yes, has been recorded for Terry Manderson
2017-06-20
07 Spencer Dawkins
[Ballot comment]
I think Alvaro's nits in his ballot are worth a look, but I'm really glad to see this work moving forward, and wanted to thank the authors for a clear explanation of a TCP mechanism that I think I could implement myself.
2017-06-20
07 Spencer Dawkins [Ballot Position Update] New position, Yes, has been recorded for Spencer Dawkins
2017-06-20
07 Suresh Krishnan [Ballot Position Update] New position, No Objection, has been recorded for Suresh Krishnan
2017-06-20
07 Alvaro Retana
[Ballot comment]
Several nits:

- The Abstract says that "This memo documents existing DCTCP implementations ([WINDOWS], [LINUX], [FREEBSD])..."  But in reality it doesn't, it just points to those references that presumably contain implementation information.  The [WINDOWS] reference is only used in the Abstract -- last I looked, there shouldn't be references there [rfc7322].

- "...and deployment experience ([MORGANSTANLEY])."  Again, this draft doesn't document deployment experience, just points at it.

- rfc7942 recommends that the Implementation Status section be removed.  If the intent is to keep it, then consider putting a note so that the RFC Editor doesn't remove it.

- The fact that this document describes the Microsoft Windows Server 2012 implementation should be made clear from the start (in the Introduction).  You could then also get rid of the extra text in Section 2.

- The reference to [RFC3168-ERRATA3639] seems strange to me...not because it is pointing to the report, but because it is Informative, when the reference to RFC3168 is Normative.  I would assume that because Errata3639 has been Verified, then it means it is now "part of" RFC3168, so I would think that there's no need to mention it separately...
2017-06-20
07 Alvaro Retana [Ballot Position Update] New position, No Objection, has been recorded for Alvaro Retana
2017-06-20
07 Deborah Brungard [Ballot Position Update] New position, No Objection, has been recorded for Deborah Brungard
2017-06-19
07 Eric Rescorla [Ballot Position Update] New position, No Objection, has been recorded for Eric Rescorla
2017-06-15
07 Tero Kivinen Request for Last Call review by SECDIR Completed: Has Issues. Reviewer: Catherine Meadows.
2017-06-15
07 Mirja Kühlewind Ballot has been issued
2017-06-15
07 Mirja Kühlewind [Ballot Position Update] New position, Yes, has been recorded for Mirja Kühlewind
2017-06-15
07 Mirja Kühlewind Created "Approve" ballot
2017-06-15
07 Mirja Kühlewind Ballot writeup was changed
2017-06-15
07 Mirja Kühlewind Changed consensus to Yes from Unknown
2017-06-15
07 (System) IESG state changed to Waiting for Writeup from In Last Call
2017-06-09
07 (System) IANA Review state changed to IANA OK - No Actions Needed from IANA - Review Needed
2017-06-09
07 Sabrina Tanamal
(Via drafts-lastcall@iana.org): IESG/Authors/WG Chairs:

The IANA Services Operator has reviewed draft-ietf-tcpm-dctcp-07.txt, which is currently in Last Call, and has the following comments:

We understand that this document doesn't require any registry actions.

While it's often helpful for a document's IANA Considerations section to remain in place upon publication even if there are no actions, if the authors strongly prefer to remove it, we do not object.

If this assessment is not accurate, please respond as soon as possible.

Thank you,

Sabrina Tanamal
IANA Services Specialist
PTI
2017-06-08
07 Joe Clarke Request for Last Call review by OPSDIR Completed: Has Nits. Reviewer: Joe Clarke. Sent review to list.
2017-06-06
07 Gunter Van de Velde Request for Last Call review by OPSDIR is assigned to Joe Clarke
2017-06-06
07 Gunter Van de Velde Request for Last Call review by OPSDIR is assigned to Joe Clarke
2017-06-02
07 Tero Kivinen Request for Last Call review by SECDIR is assigned to Catherine Meadows
2017-06-02
07 Tero Kivinen Request for Last Call review by SECDIR is assigned to Catherine Meadows
2017-06-01
07 Jean Mahoney Request for Last Call review by GENART is assigned to Orit Levin
2017-06-01
07 Jean Mahoney Request for Last Call review by GENART is assigned to Orit Levin
2017-06-01
07 Amy Vezza IANA Review state changed to IANA - Review Needed
2017-06-01
07 Amy Vezza
The following Last Call announcement was sent out:

From: The IESG
To: IETF-Announce
CC: tcpm@ietf.org, Michael Scharf , michael.scharf@nokia.com, draft-ietf-tcpm-dctcp@ietf.org, ietf@kuehlewind.net, tcpm-chairs@ietf.org
Reply-To: ietf@ietf.org
Sender:
Subject: Last Call:  (Datacenter TCP (DCTCP): TCP Congestion Control for Datacenters) to Informational RFC


The IESG has received a request from the TCP Maintenance and Minor
Extensions WG (tcpm) to consider the following document:
- 'Datacenter TCP (DCTCP): TCP Congestion Control for Datacenters'
  as Informational RFC

The IESG plans to make a decision in the next few weeks, and solicits
final comments on this action. Please send substantive comments to the
ietf@ietf.org mailing lists by 2017-06-15. Exceptionally, comments may be
sent to iesg@ietf.org instead. In either case, please retain the
beginning of the Subject line to allow automated sorting.

Abstract


  This informational memo describes Datacenter TCP (DCTCP), a TCP
  congestion control scheme for datacenter traffic.  DCTCP extends the
  Explicit Congestion Notification (ECN) processing to estimate the
  fraction of bytes that encounter congestion, rather than simply
  detecting that some congestion has occurred.  DCTCP then scales the
  TCP congestion window based on this estimate.  This method achieves
  high burst tolerance, low latency, and high throughput with shallow-
  buffered switches.  This memo also discusses deployment issues
  related to the coexistence of DCTCP and conventional TCP, the lack of
  a negotiating mechanism between sender and receiver, and presents
  some possible mitigations.  This memo documents existing DCTCP
  implementations ([WINDOWS], [LINUX], [FREEBSD]) and deployment
  experience ([MORGANSTANLEY]).  DCTCP as described in this draft is
  applicable to deployments in controlled environments like datacenters
  but it must not be deployed over the public Internet without
  additional measures, as detailed in Section 5.




The file can be obtained via
https://datatracker.ietf.org/doc/draft-ietf-tcpm-dctcp/

IESG discussion can be tracked via
https://datatracker.ietf.org/doc/draft-ietf-tcpm-dctcp/ballot/

The following IPR Declarations may be related to this I-D:

  https://datatracker.ietf.org/ipr/2319/





2017-06-01
07 Amy Vezza IESG state changed to In Last Call from Last Call Requested
2017-06-01
07 Mirja Kühlewind Placed on agenda for telechat - 2017-06-22
2017-06-01
07 Mirja Kühlewind Last call was requested
2017-06-01
07 Mirja Kühlewind Ballot approval text was generated
2017-06-01
07 Mirja Kühlewind Ballot writeup was generated
2017-06-01
07 Mirja Kühlewind IESG state changed to Last Call Requested from Publication Requested
2017-06-01
07 Mirja Kühlewind Last call announcement was generated
2017-06-01
07 Lars Eggert New version available: draft-ietf-tcpm-dctcp-07.txt
2017-06-01
07 (System) New version approved
2017-06-01
07 (System) Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian
2017-06-01
07 Lars Eggert Uploaded new revision
2017-05-09
06 Lars Eggert New version available: draft-ietf-tcpm-dctcp-06.txt
2017-05-09
06 (System) New version approved
2017-05-09
06 (System) Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian
2017-05-09
06 Lars Eggert Uploaded new revision
2017-04-24
05 Michael Scharf
1. Summary

The document shepherd is Michael Scharf.

The responsible Area Director is Mirja Kuehlewind.

This informational memo describes Datacenter TCP (DCTCP). DCTCP is an improvement to TCP congestion control for datacenter traffic that uses Explicit Congestion Notification (ECN). DCTCP as described in this draft is applicable to deployments in controlled environments like datacenters, but it must not be deployed over the public Internet without additional measures. The document is published to document an implementation in the Microsoft Windows Server 2012 operating system. The Linux and FreeBSD operating systems have also implemented support for DCTCP.

Given the limitations of the existing DCTCP specification, which are discussed in the document, the TCPM working group requests publication as an informational document.

2. Review and Consensus

The objective of this informational memo is to document an alternative TCP congestion control algorithm that is known to be widely deployed. It is the consensus of the TCPM working group that a DCTCP standard would require further work. Precise documentation of running code enables follow-up experimental or standards track RFCs.

The document describes DCTCP as implemented in Microsoft Windows Server 2012. Since the publication of the first versions of the document, the Linux and FreeBSD operating systems have also implemented support for DCTCP. The specification should also enable implementation in other TCP stacks.

The TCPM working group has reviewed the document regarding clarity and comprehensiveness of the protocol specification, e.g. in corner cases. The document has been discussed multiple times in the working group without any major controversy. During the working group last call there have been several detailed reviews, and those comments have been addressed in the most recent version. All in all, there is very strong consensus in the TCPM working group that this document should be published.

3. Intellectual Property

Each author has stated that their direct, personal knowledge of any IPR related to this document has already been disclosed, in conformance with BCPs 78 and 79.

There is an IPR disclosure for the DCTCP protocol specification (https://datatracker.ietf.org/ipr/2319/), which declares "Royalty-Free, Reasonable and Non-Discriminatory License to All Implementers". The TCPM working group is aware of this IPR but there have never been concerns.

4. Other Points

None
2017-04-24
05 Michael Scharf Responsible AD changed to Mirja Kühlewind
2017-04-24
05 Michael Scharf IETF WG state changed to Submitted to IESG for Publication from WG Consensus: Waiting for Write-Up
2017-04-24
05 Michael Scharf IESG state changed to Publication Requested
2017-04-24
05 Michael Scharf IESG process started in state Publication Requested
2017-04-20
05 Michael Scharf Changed document writeup
2017-04-20
05 Michael Scharf IETF WG state changed to WG Consensus: Waiting for Write-Up from In WG Last Call
2017-03-27
05 Lars Eggert New version available: draft-ietf-tcpm-dctcp-05.txt
2017-03-27
05 (System) New version approved
2017-03-27
05 (System) Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian
2017-03-27
05 Lars Eggert Uploaded new revision
2017-02-15
04 Michael Scharf IETF WG state changed to In WG Last Call from WG Document
2017-02-15
04 Michael Scharf Notification list changed to "Michael Scharf" <michael.scharf@nokia.com>
2017-02-15
04 Michael Scharf Document shepherd changed to Michael Scharf
2017-02-07
04 Lars Eggert New version available: draft-ietf-tcpm-dctcp-04.txt
2017-02-07
04 (System) New version approved
2017-02-07
04 (System) Request for posting confirmation emailed to previous authors: "Stephen Bensley", "Praveen Balasubramanian", "Dave Thaler", "Glenn Judd", "Lars Eggert"
2017-02-07
04 Lars Eggert Uploaded new revision
2016-11-13
03 Lars Eggert New version available: draft-ietf-tcpm-dctcp-03.txt
2016-11-13
03 (System) New version approved
2016-11-13
03 (System) Request for posting confirmation emailed to previous authors: "Stephen Bensley", "Praveen Balasubramanian", "Dave Thaler", "Glenn Judd", "Lars Eggert"
2016-11-13
03 Lars Eggert Uploaded new revision
2016-09-02
02 Michael Scharf Intended Status changed to Informational from None
2016-07-17
02 Lars Eggert New version available: draft-ietf-tcpm-dctcp-02.txt
2015-11-01
01 Lars Eggert New version available: draft-ietf-tcpm-dctcp-01.txt
2015-09-22
00 Michael Scharf This document now replaces draft-bensley-tcpm-dctcp instead of None
2015-09-22
00 Lars Eggert New version available: draft-ietf-tcpm-dctcp-00.txt