Shepherd writeup
rfc7765-10

TCP and SCTP RTO Restart - Essay Style Document Writeup

1. Summary

The document shepherd is Michael Scharf <michael.scharf@alcatel-lucent.com>. The responsible Area Director is Martin Stiemerling <mls.ietf@gmail.com>.

This document describes a modified sender-side algorithm for managing the TCP and SCTP retransmission timers that provides faster loss recovery when there is a small amount of outstanding data for a connection.  The modification, RTO Restart (RTOR), allows the transport to restart its retransmission timer so that the effective RTO becomes more aggressive in situations where fast retransmit cannot be used.  This enables faster loss detection and recovery for connections that are short-lived or application-limited.
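
The timer restart described above can be illustrated with a minimal sketch. It is not the normative algorithm text. The names (SenderState, rearm_timeout, RRTHRESH), the threshold of four segments, and the clamp to zero are assumptions made for this illustration. The idea is that when an ACK for new data arrives and only a few segments are in flight, the timer is re-armed with the time already elapsed since the earliest outstanding segment was sent subtracted from the RTO.

```python
from dataclasses import dataclass

# Assumed threshold: with fewer than four segments in flight, three
# duplicate ACKs cannot be generated, so fast retransmit cannot trigger.
RRTHRESH = 4


@dataclass
class SenderState:
    rto: float               # current RTO value in seconds
    outstanding: int         # segments sent but not yet acknowledged
    prev_unsent: int         # previously unsent segments ready for transmission
    earliest_sent_at: float  # send time of the earliest outstanding segment


def rearm_timeout(state: SenderState, now: float, rrthresh: int = RRTHRESH) -> float:
    """Seconds until the retransmission timer should expire when it is
    re-armed on an ACK that acknowledges new data."""
    t_earliest = 0.0
    if state.outstanding + state.prev_unsent < rrthresh:
        # Small flight: account for the time the earliest segment has
        # already been waiting, so the effective timeout stays near one RTO.
        t_earliest = now - state.earliest_sent_at
    # Guard added for this sketch only: never return a negative timeout.
    return max(state.rto - t_earliest, 0.0)
```

For example, with an RTO of 1.0 s, two outstanding segments, and an earliest segment sent 0.25 s ago, the sketch re-arms the timer for 0.75 s instead of a full 1.0 s. With ten outstanding segments it re-arms for the full RTO, as a standard restart would.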

The TCPM working group requests publication of this document as an Experimental RFC to enable and encourage further experimentation. Experiments are needed, e.g., to evaluate the tradeoff between performance improvements and the risk of spurious timeouts, as discussed in Section 5 of the document.


2. Review and Consensus

It is the consensus of the TCPM working group to document this alternative algorithm, given the potential performance benefit. The work has mostly been driven by the authors, but the document has been reviewed in detail by several experts and the content has been modified accordingly. The authors have performed and published performance experiments in simulations and testbeds, and the experimental results have been reviewed in several TCPM meetings. At the time of writing, there is only limited deployment experience.

Two issues have been discussed extensively in the working group. First, any reduction of the retransmission timeout duration inherently carries a risk of negative impact on TCP performance, e.g., in mobile networks with highly variable RTT. The current understanding is that this risk is low and that the algorithm is conservative and relatively robust, but further experimentation has to confirm this. Second, the Linux operating system uses the "Tail Loss Probe" method discussed in Section 6, which is similar but more complex. This method was not adopted in TCPM since it depends on the FACK error recovery method, which has not been standardized so far.

This document was also last called in TSVWG, since it specifies an algorithm that can be applied to both TCP and SCTP. As a result of WGLC comments, the applicability to SCTP has been better explained, including the SCTP API. One issue is that TCP and SCTP use slightly different terminology for comparable concepts. In order to keep the document simple, it was decided not to add another, duplicated description of the algorithm using SCTP terminology.


3. Intellectual Property

Each author has stated that their direct, personal knowledge of any IPR related to this document has already been disclosed, in conformance with BCPs 78 and 79. There are no IPR disclosures regarding this document.


4. Other Points

None