DNS Stateful Operations
draft-ietf-dnsop-session-signal-17
The information below is for an old version of the document.
Document | Type |
This is an older version of an Internet-Draft that was ultimately published as RFC 8490.
|
|
---|---|---|---|
Authors | Ray Bellis , Stuart Cheshire , John Dickinson , Sara Dickinson , Ted Lemon , Tom Pusateri | ||
Last updated | 2018-10-23 (Latest revision 2018-09-26) | ||
Replaces | draft-bellis-dnsop-session-signal | ||
RFC stream | Internet Engineering Task Force (IETF) | ||
Formats | |||
Reviews |
GENART Telechat review
(of
-11)
by Joel Halpern
Ready w/nits
GENART Last Call review
(of
-10)
by Joel Halpern
Ready w/issues
|
||
Additional resources | Mailing list discussion | ||
Stream | WG state | Submitted to IESG for Publication | |
Document shepherd | Tim Wicinski | ||
Shepherd write-up | Show Last changed 2018-05-15 | ||
IESG | IESG state | Became RFC 8490 (Proposed Standard) | |
Consensus boilerplate | Yes | ||
Telechat date |
(None)
Needs 9 more YES or NO OBJECTION positions to pass. |
||
Responsible AD | Warren "Ace" Kumari | ||
Send notices to | "Tim Wicinski" <tjw.ietf@gmail.com> | ||
IANA | IANA review state | Version Changed - Review Needed |
draft-ietf-dnsop-session-signal-17
9.3. Connection Sharing As previously specified for DNS over TCP [RFC7766]: To mitigate the risk of unintentional server overload, DNS clients MUST take care to minimize the number of concurrent TCP connections made to any individual server. It is RECOMMENDED that for any given client/server interaction there SHOULD be no more than one connection for regular queries, one for zone transfers, and one for each protocol that is being used on top of TCP (for example, if the resolver was using TLS). However, it is noted that certain primary/secondary configurations with many busy zones might need to use more than one TCP connection for zone transfers for operational reasons (for example, to support concurrent transfers of multiple zones). A single server may support multiple services, including DNS Updates [RFC2136], DNS Push Notifications [I-D.ietf-dnssd-push], and other services, for one or more DNS zones. When a client discovers that the target server for several different operations is the same service instance (see Section 9.1), the client SHOULD use a single shared DSO Session for all those operations. This requirement has two benefits. First, it reduces unnecessary connection load on the DNS server. Second, it avoids paying the TCP slow start penalty when making subsequent connections to the same server. However, server implementers and operators should be aware that connection sharing may not be possible in all cases. A single host device may be home to multiple independent client software instances that don't coordinate with each other. Similarly, multiple independent client devices behind the same NAT gateway will also typically appear to the DNS server as different source ports on the same client IP address. Because of these constraints, a DNS server MUST be prepared to accept multiple connections from different source ports on the same client IP address. Bellis, et al. Expires April 26, 2019 [Page 54] Internet-Draft DNS Stateful Operations October 2018 9.4. Operational Considerations for Middlebox Where an application-layer middlebox (e.g., a DNS proxy, forwarder, or session multiplexer) is in the path, care must be taken to avoid a configuration in which DSO traffic is mis-handled. The simplest way to avoid such problems is to avoid using middleboxes. When this is not possible, middleboxes should be evaluated to make sure that they behave correctly. Correct behavior for middleboxes consists of one of: o The middlebox does not forward DSO messages, and responds to DSO messages with a response code other than NOERROR or DSOTYPENI. o The middlebox acts as a DSO server and follows this specification in establishing connections. o There is a 1:1 correspondence between incoming and outgoing connections, such that when a connection is established to the middlebox, it is guaranteed that exactly one corresponding connection will be established from the middlebox to some DNS resolver, and all incoming messages will be forwarded without modification or reordering. An example of this would be a NAT forwarder or TCP connection optimizer (e.g. for a high-latency connection such as a geosynchronous satellite link). Middleboxes that do not meet one of the above criteria are very likely to fail in unexpected and difficult-to-diagnose ways. For example, a DNS load balancer might unbundle DNS messages from the incoming TCP stream and forward each message from the stream to a different DNS server. If such a load balancer is in use, and the DNS servers it points implement DSO and are configured to enable DSO, DSO session establishment will succeed, but no coherent session will exist between the client and the server. If such a load balancer is pointed at a DNS server that does not implement DSO or is configured not to allow DSO, no such problem will exist, but such a configuration risks unexpected failure if new server software is installed which does implement DSO. It is of course possible to implement a middlebox that properly supports DSO. It is even possible to implement one that implements DSO with long-lived operations. This can be done either by maintaining a 1:1 correspondence between incoming and outgoing connections, as mentioned above, or by terminating incoming sessions at the middlebox, but maintaining state in the middlebox about any long-lived that are requested. Specifying this in detail is beyond the scope of this document. Bellis, et al. Expires April 26, 2019 [Page 55] Internet-Draft DNS Stateful Operations October 2018 9.5. TCP Delayed Acknowledgement Considerations Most modern implementations of the Transmission Control Protocol (TCP) include a feature called "Delayed Acknowledgement" [RFC1122]. Without this feature, TCP can be very wasteful on the network. For illustration, consider a simple example like remote login, using a very simple TCP implementation that lacks delayed acks. When the user types a keystroke, a data packet is sent. When the data packet arrives at the server, the simple TCP implementation sends an immediate acknowledgement. Mere milliseconds later, the server process reads the one byte of keystroke data, and consequently the simple TCP implementation sends an immediate window update. Mere milliseconds later, the server process generates the character echo, and sends this data back in reply. The simple TCP implementation then sends this data packet immediately too. In this case, this simple TCP implementation sends a burst of three packets almost instantaneously (ack, window update, data). Clearly it would be more efficient if the TCP implementation were to combine the three separate packets into one, and this is what the delayed ack feature enables. With delayed ack, the TCP implementation waits after receiving a data packet, typically for 200 ms, and then send its ack if (a) more data packet(s) arrive (b) the receiving process generates some reply data, or (c) 200 ms elapses without either of the above occurring. With delayed ack, remote login becomes much more efficient, generating just one packet instead of three for each character echo. The logic of delayed ack is that the 200 ms delay cannot do any significant harm. If something at the other end were waiting for something, then the receiving process should generate the reply that the thing at the end is waiting for, and TCP will then immediately send that reply (and the ack and window update). And if the receiving process does not in fact generate any reply for this particular message, then by definition the thing at the other end cannot be waiting for anything, so the 200 ms delay is harmless. This assumption may be true, unless the sender is using Nagle's algorithm, a similar efficiency feature, created to protect the network from poorly written client software that performs many rapid small writes in succession. Nagle's algorithm allows these small writes to be combined into larger, less wasteful packets. Bellis, et al. Expires April 26, 2019 [Page 56] Internet-Draft DNS Stateful Operations October 2018 Unfortunately, Nagle's algorithm and delayed ack, two valuable efficiency features, can interact badly with each other when used together [NagleDA]. DSO request messages elicit responses; DSO unidirectional messages and DSO response messages do not. For DSO request messages, which do elicit responses, Nagle's algorithm and delayed ack work as intended. For DSO messages that do not elicit responses, the delayed ack mechanism causes the ack to be delayed by 200 ms. The 200 ms delay on the ack can in turn cause Nagle's algorithm to prevent the sender from sending any more data for 200 ms until the awaited ack arrives. On an enterprise GigE backbone with sub-millisecond round-trip times, a 200 ms delay is enormous in comparison. When this issues is raised, there are two solutions that are often offered, neither of them ideal: 1. Disable delayed ack. For DSO messages that elicit no response, removing delayed ack avoids the needless 200 ms delay, and sends back an immediate ack, which tells Nagle's algorithm that it should immediately grant the sender permission to send its next packet. Unfortunately, for DSO messages that *do* elicit a response, removing delayed ack removes the efficiency gains of combining acks with data, and the responder will now send two or three packets instead of one. 2. Disable Nagle's algorithm. When acks are delayed by the delayed ack algorithm, removing Nagle's algorithm prevents the sender from being blocked from sending its next small packet immediately. Unfortunately, on a network with a higher round- trip time, removing Nagle's algorithm removes the efficiency gains of combining multiple small packets into fewer larger ones, with the goal of limiting the number of small packets in flight at any one time. For DSO messages that elicit a response, delayed ack and Nagle's algorithm do the right thing. The problem here is that with DSO messages that elicit no response, the TCP implementation is stuck waiting, unsure if a response is about to be generated, or whether the TCP implementation should go ahead and send an ack and window update. The solution is networking APIs that allow the receiver to inform the TCP implementation that a received message has been read, processed, Bellis, et al. Expires April 26, 2019 [Page 57] Internet-Draft DNS Stateful Operations October 2018 and no response for this message will be generated. TCP can then stop waiting for a response that will never come, and immediately go ahead and send an ack and window update. For implementations of DSO, disabling delayed ack is NOT RECOMMENDED, because of the harm this can do to the network. For implementations of DSO, disabling Nagle's algorithm is NOT RECOMMENDED, because of the harm this can do to the network. At the time that this document is being prepared for publication, it is known that at least one TCP implementation provides the ability for the recipient of a TCP message to signal that it is not going to send a response, and hence the delayed ack mechanism can stop waiting. Implementations on operating systems where this feature is available SHOULD make use of it. Bellis, et al. Expires April 26, 2019 [Page 58] Internet-Draft DNS Stateful Operations October 2018 10. IANA Considerations 10.1. DSO OPCODE Registration The IANA is requested to record the value [TBA1] (tentatively 6) for the DSO OPCODE in the DNS OPCODE Registry. DSO stands for DNS Stateful Operations. 10.2. DSO RCODE Registration The IANA is requested to record the value [TBA2] (tentatively 11) for the DSOTYPENI error code in the DNS RCODE Registry. The DSOTYPENI error code ("DSO-TYPE Not Implemented") indicates that the receiver does implement DNS Stateful Operations, but does not implement the specific DSO-TYPE of the primary TLV in the DSO request message. 10.3. DSO Type Code Registry The IANA is requested to create the 16-bit DSO Type Code Registry, with initial (hexadecimal) values as shown below: +-----------+------------------------+-------+----------+-----------+ | Type | Name | Early | Status | Reference | | | | Data | | | +-----------+------------------------+-------+----------+-----------+ | 0000 | Reserved | NO | Standard | RFC-TBD | | | | | | | | 0001 | KeepAlive | OK | Standard | RFC-TBD | | | | | | | | 0002 | RetryDelay | NO | Standard | RFC-TBD | | | | | | | | 0003 | EncryptionPadding | NA | Standard | RFC-TBD | | | | | | | | 0004-003F | Unassigned, reserved | NO | | | | | for DSO session- | | | | | | management TLVs | | | | | | | | | | | 0040-F7FF | Unassigned | NO | | | | | | | | | | F800-FBFF | Experimental/local use | NO | | | | | | | | | | FC00-FFFF | Reserved for future | NO | | | | | expansion | | | | +-----------+------------------------+-------+----------+-----------+ The meanings of the fields are as follows: Type: the 16-bit DSO type code Bellis, et al. Expires April 26, 2019 [Page 59] Internet-Draft DNS Stateful Operations October 2018 Name: the human-readable name of the TLV Early Data: If OK, this TLV may be sent as early data in a TLS 0-RTT ([RFC8446] Section 2.3) initial handshake. If NA, the TLV may appear as a secondary TLV in a DSO message that is send as early data. Status: IETF Document status (or "External" if not documented in an IETF document. Reference: A stable reference to the document in which this TLV is defined. DSO Type Code zero is reserved and is not currently intended for allocation. Registrations of new DSO Type Codes in the "Reserved for DSO session- management" range 0004-003F and the "Reserved for future expansion" range FC00-FFFF require publication of an IETF Standards Action document [RFC8126]. Any document defining a new TLV which lists a value of "OK" in the 0-RTT column must include a threat analysis for the use of the TLV in the case of TLS 0-RTT. See Section 11.1 for details. Requests to register additional new DSO Type Codes in the "Unassigned" range 0040-F7FF are to be recorded by IANA after Expert Review [RFC8126]. The expert review should validate that the requested type code is specified in a way that conforms to this specification, and that the intended use for the code would not be addressed with an experimental/local assignment. DSO Type Codes in the "experimental/local" range F800-FBFF may be used as Experimental Use or Private Use values [RFC8126] and may be used freely for development purposes, or for other purposes within a single site. No attempt is made to prevent multiple sites from using the same value in different (and incompatible) ways. There is no need for IANA to review such assignments (since IANA does not record them) and assignments are not generally useful for broad interoperability. It is the responsibility of the sites making use of "experimental/local" values to ensure that no conflicts occur within the intended scope of use. 11. Security Considerations If this mechanism is to be used with DNS over TLS, then these messages are subject to the same constraints as any other DNS-over- Bellis, et al. Expires April 26, 2019 [Page 60] Internet-Draft DNS Stateful Operations October 2018 TLS messages and MUST NOT be sent in the clear before the TLS session is established. The data field of the "Encryption Padding" TLV could be used as a covert channel. When designing new DSO TLVs, the potential for data in the TLV to be used as a tracking identifier should be taken into consideration, and should be avoided when not required. When used without TLS or similar cryptographic protection, a malicious entity maybe able to inject a malicious unidirectional DSO Retry Delay Message into the data stream, specifying an unreasonably large RETRY DELAY, causing a denial-of-service attack against the client. The establishment of DSO sessions has an impact on the number of open TCP connections on a DNS server. Additional resources may be used on the server as a result. However, because the server can limit the number of DSO sessions established and can also close existing DSO sessions as needed, denial of service or resource exhaustion should not be a concern. 11.1. TLS 0-RTT Considerations DSO permits zero round-trip operation using TCP Fast Open [RFC7413] with TLS 1.3 [RFC8446] 0-RTT to reduce or eliminate round trips in session establishment. TCP Fast Open is only permitted in combination with TLS 0-RTT. In the rest of this section we refer to TLS 1.3 early data in a TLS 0-RTT initial handshake message that is included in a TCP Fast Open packet as "early data." A DSO message may or may not be permitted to be sent as early data. The definition for each TLV that can be used as a primary TLV is required to state whether or not that TLV is permitted as early data. Only response-requiring messages are ever permitted as early data, and only clients are permitted to send any DSO message as early data, unless there is an implicit session (see Section 5.1). For DSO messages that are permitted as early data, a client MAY include one or more such messages as early data without having to wait for a DSO response to the first DSO request message to confirm successful establishment of a DSO session. However, unless there is an implicit session, a client MUST NOT send DSO unidirectional messages until after a DSO Session has been mutually established. Bellis, et al. Expires April 26, 2019 [Page 61] Internet-Draft DNS Stateful Operations October 2018 Similarly, unless there is an implicit session, a server MUST NOT send DSO request messages until it has received a response-requiring DSO request message from a client and transmitted a successful NOERROR response for that request. Caution must be taken to ensure that DSO messages sent as early data are idempotent, or are otherwise immune to any problems that could be result from the inadvertent replay that can occur with zero round- trip operation. It would be possible to add a TLV that requires the server to do some significant work, and send that to the server as initial data in a TCP SYN packet. A flood of such packets could be used as a DoS attack on the server. None of the TLVs defined here have this property. If a new TLV is specified that does have this property, that TLV must be specified as not permitted in 0-RTT messages. This prevents work from being done until a round-trip has occurred from the server to the client to verify that the source address of the packet is reachable. Documents that define new TLVs must state whether each new TLV may be sent as early data. Such documents must include a threat analysis in the security considerations section for each TLV defined in the document that may be sent as early data. This threat analysis should be done based on the advice given in [RFC8446] Section 2.3, 8 and Appendix E.5. 12. Acknowledgements Thanks to Stephane Bortzmeyer, Tim Chown, Ralph Droms, Paul Hoffman, Jan Komissar, Edward Lewis, Allison Mankin, Rui Paulo, David Schinazi, Manju Shankar Rao, and Bernie Volz for their helpful contributions to this document. 13. References 13.1. Normative References [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, <https://www.rfc-editor.org/info/rfc1034>. [RFC1035] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, November 1987, <https://www.rfc-editor.org/info/rfc1035>. Bellis, et al. Expires April 26, 2019 [Page 62] Internet-Draft DNS Stateful Operations October 2018 [RFC1918] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G., and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996, <https://www.rfc-editor.org/info/rfc1918>. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>. [RFC2136] Vixie, P., Ed., Thomson, S., Rekhter, Y., and J. Bound, "Dynamic Updates in the Domain Name System (DNS UPDATE)", RFC 2136, DOI 10.17487/RFC2136, April 1997, <https://www.rfc-editor.org/info/rfc2136>. [RFC6891] Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms for DNS (EDNS(0))", STD 75, RFC 6891, DOI 10.17487/RFC6891, April 2013, <https://www.rfc-editor.org/info/rfc6891>. [RFC7766] Dickinson, J., Dickinson, S., Bellis, R., Mankin, A., and D. Wessels, "DNS Transport over TCP - Implementation Requirements", RFC 7766, DOI 10.17487/RFC7766, March 2016, <https://www.rfc-editor.org/info/rfc7766>. [RFC7830] Mayrhofer, A., "The EDNS(0) Padding Option", RFC 7830, DOI 10.17487/RFC7830, May 2016, <https://www.rfc-editor.org/info/rfc7830>. [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 8126, DOI 10.17487/RFC8126, June 2017, <https://www.rfc-editor.org/info/rfc8126>. [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>. 13.2. Informative References [I-D.ietf-dnsop-no-response-issue] Andrews, M. and R. Bellis, "A Common Operational Problem in DNS Servers - Failure To Respond.", draft-ietf-dnsop- no-response-issue-11 (work in progress), July 2018. Bellis, et al. Expires April 26, 2019 [Page 63] Internet-Draft DNS Stateful Operations October 2018 [I-D.ietf-dnssd-mdns-relay] Lemon, T. and S. Cheshire, "Multicast DNS Discovery Relay", draft-ietf-dnssd-mdns-relay-01 (work in progress), July 2018. [I-D.ietf-dnssd-push] Pusateri, T. and S. Cheshire, "DNS Push Notifications", draft-ietf-dnssd-push-15 (work in progress), September 2018. [I-D.ietf-doh-dns-over-https] Hoffman, P. and P. McManus, "DNS Queries over HTTPS (DoH)", draft-ietf-doh-dns-over-https-14 (work in progress), August 2018. [I-D.ietf-dprive-padding-policy] Mayrhofer, A., "Padding Policy for EDNS(0)", draft-ietf- dprive-padding-policy-06 (work in progress), July 2018. [NagleDA] Cheshire, S., "TCP Performance problems caused by interaction between Nagle's Algorithm and Delayed ACK", May 2005, <http://www.stuartcheshire.org/papers/nagledelayedack/>. [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - Communication Layers", STD 3, RFC 1122, DOI 10.17487/RFC1122, October 1989, <https://www.rfc-editor.org/info/rfc1122>. [RFC2132] Alexander, S. and R. Droms, "DHCP Options and BOOTP Vendor Extensions", RFC 2132, DOI 10.17487/RFC2132, March 1997, <https://www.rfc-editor.org/info/rfc2132>. [RFC5382] Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P. Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142, RFC 5382, DOI 10.17487/RFC5382, October 2008, <https://www.rfc-editor.org/info/rfc5382>. [RFC6762] Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762, DOI 10.17487/RFC6762, February 2013, <https://www.rfc-editor.org/info/rfc6762>. [RFC6763] Cheshire, S. and M. Krochmal, "DNS-Based Service Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013, <https://www.rfc-editor.org/info/rfc6763>. Bellis, et al. Expires April 26, 2019 [Page 64] Internet-Draft DNS Stateful Operations October 2018 [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, <https://www.rfc-editor.org/info/rfc7413>. [RFC7828] Wouters, P., Abley, J., Dickinson, S., and R. Bellis, "The edns-tcp-keepalive EDNS0 Option", RFC 7828, DOI 10.17487/RFC7828, April 2016, <https://www.rfc-editor.org/info/rfc7828>. [RFC7857] Penno, R., Perreault, S., Boucadair, M., Ed., Sivakumar, S., and K. Naito, "Updates to Network Address Translation (NAT) Behavioral Requirements", BCP 127, RFC 7857, DOI 10.17487/RFC7857, April 2016, <https://www.rfc-editor.org/info/rfc7857>. [RFC7858] Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., and P. Hoffman, "Specification for DNS over Transport Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May 2016, <https://www.rfc-editor.org/info/rfc7858>. [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, <https://www.rfc-editor.org/info/rfc8446>. Authors' Addresses Ray Bellis Internet Systems Consortium, Inc. 950 Charter Street Redwood City CA 94063 USA Phone: +1 (650) 423-1200 Email: ray@isc.org Stuart Cheshire Apple Inc. One Apple Park Way Cupertino CA 95014 USA Phone: +1 (408) 996-1010 Email: cheshire@apple.com Bellis, et al. Expires April 26, 2019 [Page 65] Internet-Draft DNS Stateful Operations October 2018 John Dickinson Sinodun Internet Technologies Magadalen Centre Oxford Science Park Oxford OX4 4GA United Kingdom Email: jad@sinodun.com Sara Dickinson Sinodun Internet Technologies Magadalen Centre Oxford Science Park Oxford OX4 4GA United Kingdom Email: sara@sinodun.com Ted Lemon Nibbhaya Consulting P.O. Box 958 Brattleboro VT 05302-0958 USA Email: mellon@fugue.com Tom Pusateri Unaffiliated Raleigh NC 27608 USA Phone: +1 (919) 867-1330 Email: pusateri@bangj.com Bellis, et al. Expires April 26, 2019 [Page 66]