DNS Operations                                                 M. Larson
Internet-Draft                                                 P. Barber
Expires: January 17, 2005                                       VeriSign
                                                           July 19, 2004



                  Observed DNS Resolution Misbehavior
                    draft-ietf-dnsop-bad-dns-res-02


Status of this Memo


   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.


   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.


   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."


   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.


   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


   This Internet-Draft will expire on January 17, 2005.


Copyright Notice


   Copyright (C) The Internet Society (2004).  All Rights Reserved.


Abstract


   This memo describes DNS name server and resolver behavior that
   results in a significant query volume sent to the root and top-level
   domain (TLD) name servers.  In some cases we recommend minor
   additions to the DNS protocol specification and corresponding changes
   in iterative resolver implementations to alleviate these unnecessary
   queries.  The recommendations made in this document are a direct
   byproduct of observation and analysis of abnormal query traffic
   patterns seen at two of the thirteen root name servers and all
   thirteen com/net TLD name servers.


   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",




Larson & Barber         Expires January 17, 2005                [Page 1]


Internet-Draft    Observed DNS Resolution Misbehavior          July 2004



   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].


Table of Contents


   1.   Introduction . . . . . . . . . . . . . . . . . . . . . . . .   3
     1.1  A note about terminology in this memo  . . . . . . . . . .   3
   2.   Observed name server misbehavior . . . . . . . . . . . . . .   5
     2.1  Aggressive requerying for delegation information . . . . .   5
       2.1.1  Recommendation . . . . . . . . . . . . . . . . . . . .   6
     2.2  Repeated queries to lame servers . . . . . . . . . . . . .   6
       2.2.1  Recommendation . . . . . . . . . . . . . . . . . . . .   7
     2.3  Inability to follow multiple levels of out-of-zone glue  .   7
       2.3.1  Recommendation . . . . . . . . . . . . . . . . . . . .   8
     2.4  Aggressive retransmission when fetching glue . . . . . . .   8
       2.4.1  Recommendation . . . . . . . . . . . . . . . . . . . .   9
     2.5  Aggressive retransmission behind firewalls . . . . . . . .   9
       2.5.1  Recommendation . . . . . . . . . . . . . . . . . . . .   9
     2.6  Misconfigured NS records . . . . . . . . . . . . . . . . .  10
       2.6.1  Recommendation . . . . . . . . . . . . . . . . . . . .  11
     2.7  Name server records with zero TTL  . . . . . . . . . . . .  11
       2.7.1  Recommendation . . . . . . . . . . . . . . . . . . . .  12
     2.8  Unnecessary dynamic update messages  . . . . . . . . . . .  12
       2.8.1  Recommendation . . . . . . . . . . . . . . . . . . . .  13
     2.9  Queries for domain names resembling IP addresses . . . . .  13
       2.9.1  Recommendation . . . . . . . . . . . . . . . . . . . .  13
     2.10   Misdirected recursive queries  . . . . . . . . . . . . .  14
       2.10.1   Recommendation . . . . . . . . . . . . . . . . . . .  14
     2.11   Suboptimal name server selection algorithm . . . . . . .  14
       2.11.1   Recommendation . . . . . . . . . . . . . . . . . . .  15
   3.   IANA considerations  . . . . . . . . . . . . . . . . . . . .  16
   4.   Security considerations  . . . . . . . . . . . . . . . . . .  17
   5.   Internationalization considerations  . . . . . . . . . . . .  18
   6.   Normative References . . . . . . . . . . . . . . . . . . . .  18
        Authors' Addresses . . . . . . . . . . . . . . . . . . . . .  18
        Intellectual Property and Copyright Statements . . . . . . .  20


1.  Introduction


   Observation of query traffic received by two root name servers and
   the thirteen com/net TLD name servers has revealed that a large
   proportion of the total traffic often consists of "requeries".  A
   requery is the same question (<qname, qtype, qclass>) asked
   repeatedly at an unexpectedly high rate.  We have observed requeries
   from both a single IP address and multiple IP addresses (i.e., the
   same query received simultaneously from multiple IP addresses).


   By analyzing requery events we have found that the cause of the
   duplicate traffic is almost always a deficient iterative resolver,
   stub resolver and/or application implementation combined with an
   operational anomaly.  The implementation deficiencies we have
   identified to date include well-intentioned recovery attempts gone
   awry, insufficient caching of failures, early abort when multiple
   levels of glue records must be followed, and aggressive retry by stub
   resolvers and/or applications.  Anomalies that we have seen trigger
   requery events include lame delegations, unusual glue records, and
   anything that makes all authoritative name servers for a zone
   unreachable (DoS attacks, crashes, maintenance, routing failures,
   congestion, etc.).


   In the following sections, we provide a detailed explanation of the
   observed behavior and recommend changes that will reduce the requery
   rate.  Some of the changes recommended affect the core DNS protocol
   specification, described principally in RFC 1034 [2], RFC 1035 [3]
   and RFC 2181 [4].


1.1  A note about terminology in this memo


   To recast an old saying about standards, the nice thing about DNS
   terms is that there are so many of them to choose from.  Writing or
   talking about DNS can be difficult and can cause confusion because of
   a lack of agreed-upon terms for its various components.  Further
   complicating matters are implementations that combine multiple roles
   into one piece of software, which makes naming the result
   problematic.  An example is the entity that accepts recursive
   queries, issues iterative queries as necessary to resolve them,
   caches responses it receives, and is also able to answer questions
   about certain zones authoritatively.  Often called a "recursive name
   server" or a "caching name server", it is in fact an iterative
   resolver combined with an authoritative name server.


   This memo is concerned principally with the behavior of iterative
   resolvers, which are typically found as part of a recursive name
   server.  This memo uses the more precise term "iterative resolver",
   because the focus is usually on that component, rather than the more
   general term "recursive name server".


   The advent of IPv6 requires mentioning AAAA records as well as A
   records when discussing glue.  To avoid continuous repetition and
   qualification, this memo uses the general term "address records" to
   encompass both A and AAAA records when a particular situation is
   relevant to both types.


2.  Observed name server misbehavior


2.1  Aggressive requerying for delegation information


   There can be times when every name server in a zone's NS RRset is
   unreachable (e.g., during a network outage), unavailable (e.g., the
   name server process is not running on the server host) or
   misconfigured (e.g., the name server is not authoritative for the
   given zone, also known as "lame").  Consider an iterative resolver
   that attempts to resolve a query for a domain name in such a zone and
   discovers that none of the zone's name servers can provide an answer.
   We have observed an iterative resolver implementation that then
   verifies the zone's NS RRset in its cache by querying for the zone's
   delegation information: it sends a query for the zone's NS RRset to
   one of the parent zone's name servers.


   For example, suppose that "example.com" has the following NS RRset:


     example.com.   IN   NS   ns1.example.com.
     example.com.   IN   NS   ns2.example.com.


   Upon receipt of a query for "www.example.com" and assuming that
   neither "ns1.example.com" nor "ns2.example.com" can provide an
   answer, this iterative resolver implementation immediately queries a
   "com" zone name server for the "example.com" NS RRset to verify it
   has the proper delegation information.  This name server
   implementation performs this query to a zone's parent zone for each
   recursive query it receives that fails because of a completely
   unresponsive set of name servers for the target zone.  Consider the
   effect when a popular zone experiences a catastrophic failure of all
   its name servers: now every recursive query for domain names in that
   zone sent to this name server implementation results in a query to
   the failed zone's parent name servers.  On one occasion when several
   dozen popular zones became unreachable, the query load on the com/net
   name servers increased by 50%.


   We believe this verification query is not reasonable.  Consider the
   circumstances: When an iterative resolver is resolving a query for a
   domain name in a zone it has not previously searched, it uses the
   list of name servers in the referral from the target zone's parent.
   If on its first attempt to search the target zone, none of the name
   servers in the referral is reachable, a verification query to the
   parent is pointless: this query to the parent would come so quickly
   on the heels of the referral that it would be almost certain to
   contain the same list of name servers.  The chance of discovering any
   new information is slim.


   The other possibility is that the iterative resolver successfully
   contacts one of the target zone's name servers and then caches the NS
   RRset from the authority section of a response, the proper behavior
   according to section 5.4.1 of RFC 2181 [4], because the NS RRset from
   the target zone is more trustworthy than delegation information from
   the parent zone.  If, while processing a subsequent recursive query,
   the recursing name server discovers that none of the name servers
   specified in the cached NS RRset is available or authoritative,
   querying the parent would be wrong.  An NS RRset from the parent zone
   would now be less trustworthy than data already in the cache.


   For this query of the parent zone to be useful, the target zone's
   entire set of name servers would have to change AND the former set of
   name servers would have to be deconfigured and/or decommissioned AND
   the delegation information in the parent zone would have to be
   updated with the new set of name servers, all within the TTL of the
   target zone's NS RRset.  We believe this scenario is uncommon:
   administrative best practices dictate that changes to a zone's set of
   name servers happen gradually, with servers that are removed from the
   NS RRset left authoritative for the zone as long as possible.  The
   scenarios that we can envision that would benefit from the parent
   requery behavior do not outweigh its damaging effects.


2.1.1  Recommendation


   Name servers offering recursion MUST NOT send a query for the NS
   RRset of a non-responsive zone to any of the name servers for that
   zone's parent zone.  For the purposes of this injunction, a
   non-responsive zone is defined as a zone for which every name server
   listed in the zone's NS RRset:
   1.  is not authoritative for the zone (i.e., lame), or,
   2.  returns a server failure response (RCODE=2), or,
   3.  is dead or unreachable according to section 7.2 of RFC 2308 [5].
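

   As an illustration only, the three conditions listed above could be
   checked as in the following Python sketch.  The Outcome enumeration
   and both function names are hypothetical and do not come from any
   particular implementation; the sketch assumes the resolver already
   records the result of its most recent attempt against each server in
   the zone's NS RRset.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical result of the latest query attempt to one server."""
    ANSWERED = "answered"
    LAME = "lame"                # condition 1: not authoritative
    SERVFAIL = "servfail"        # condition 2: RCODE=2
    UNREACHABLE = "unreachable"  # condition 3: dead or unreachable


FAILED = {Outcome.LAME, Outcome.SERVFAIL, Outcome.UNREACHABLE}


def zone_is_non_responsive(outcomes):
    """outcomes maps each server name in the NS RRset to an Outcome.

    The zone is non-responsive only if every listed server failed in one
    of the three ways above.  An empty mapping means nothing has been
    tried yet, so it is not treated as non-responsive.
    """
    return bool(outcomes) and all(o in FAILED for o in outcomes.values())


def parent_ns_query_prohibited(outcomes):
    """True when the recommendation forbids asking the parent zone's
    servers for this zone's NS RRset."""
    return zone_is_non_responsive(outcomes)
```

   A resolver following this sketch would consult
   parent_ns_query_prohibited() before issuing any verification query to
   the parent and, when it returns True, would fail the recursive query
   without contacting the parent zone's name servers.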


2.2  Repeated queries to lame servers


   Section 2.1 describes a catastrophic failure: when every name server
   for a zone is unable to provide an answer for one reason or another.
   A more common occurrence is a subset of a zone's name servers being
   unavailable or misconfigured.  Different failure modes have different
   expected durations.  Some symptoms indicate problems that are
   potentially transient: various types of ICMP unreachable messages
   because a name server process is not running or a host or network is
   unreachable, or a complete lack of a response to a query.  Such
   responses could be the result of a host rebooting or temporary
   outages; these events don't necessarily require any human
   intervention and can be reasonably expected to be temporary.


   Other symptoms clearly indicate a condition requiring human
   intervention, such as a lame server: if a name server is
   misconfigured and not authoritative for a zone delegated to it, it is
   reasonable to assume that this condition has the potential to last
   longer than unreachability or unresponsiveness.  Consequently,
   repeated queries to known lame servers are not useful.  For a
   condition with the potential to persist for a long time, a better
   practice would be to maintain a list of known lame servers and avoid
   querying them repeatedly in a short interval.


2.2.1  Recommendation


   Iterative resolvers SHOULD cache name servers that they discover are
   not authoritative for zones delegated to them (i.e., lame servers).
   Lame servers MUST be cached against the specific query tuple <zone
   name, class, server IP address>.  Zone name can be derived from the
   owner name of the NS record that was referenced to query the name
   server that was discovered to be lame.  Implementations that perform
   lame server caching MUST refrain from sending queries to known lame
   servers based on a time interval from when the server is discovered
   to be lame.  A minimum interval of thirty minutes is RECOMMENDED.
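

   A minimal Python sketch of such a cache follows.  It is an
   assumption-laden illustration, not a description of any existing
   implementation: the class name, its methods and the injectable clock
   are hypothetical.  Entries are keyed on the tuple <zone name, class,
   server IP address> and lapse after the thirty-minute interval
   recommended above.

```python
import time

LAME_INTERVAL = 30 * 60  # seconds; the RECOMMENDED minimum interval


class LameServerCache:
    """Remembers lame servers per <zone name, class, server IP address>."""

    def __init__(self, interval=LAME_INTERVAL, clock=time.monotonic):
        self._interval = interval
        self._clock = clock   # injectable so the sketch can be exercised
        self._expiry = {}     # key tuple -> time at which the entry lapses

    @staticmethod
    def _key(zone, rdclass, server_ip):
        # Domain names compare case-insensitively; store one canonical form.
        return (zone.lower().rstrip(".") + ".", rdclass.upper(), server_ip)

    def mark_lame(self, zone, rdclass, server_ip):
        key = self._key(zone, rdclass, server_ip)
        self._expiry[key] = self._clock() + self._interval

    def is_lame(self, zone, rdclass, server_ip):
        key = self._key(zone, rdclass, server_ip)
        expires = self._expiry.get(key)
        if expires is None:
            return False
        if self._clock() >= expires:
            del self._expiry[key]  # interval elapsed; server may be retried
            return False
        return True
```

   Because the key includes the zone name, a server that is lame for one
   zone remains eligible for queries about other zones it serves
   correctly.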


2.3  Inability to follow multiple levels of out-of-zone glue


   Some iterative resolver implementations are unable to follow more
   than one level of out-of-zone glue.  For example, consider the
   following delegations:


     foo.example.        IN   NS   ns1.example.com.
     foo.example.        IN   NS   ns2.example.com.


     example.com.        IN   NS   ns1.test.example.net.
     example.com.        IN   NS   ns2.test.example.net.


     test.example.net.   IN   NS   ns1.test.example.net.
     test.example.net.   IN   NS   ns2.test.example.net.


   A name server processing a recursive query for "www.foo.example" must
   follow two levels of indirection, first obtaining address records for
   "ns1.test.example.net" and/or "ns2.test.example.net" in order to
   obtain address records for "ns1.example.com" and/or "ns2.example.com"
   in order to query those name servers for the address records of
   "www.foo.example".  While this situation may appear contrived, we
   have seen multiple similar occurrences and expect more as new generic
   top-level domains (gTLDs) become active.  We anticipate many zones in
   the new gTLDs will use name servers in other gTLDs, increasing the
   amount of inter-zone glue.


2.3.1  Recommendation


   Clearly, constructing a delegation that relies on multiple levels of
   out-of-zone glue is not a good administrative practice.  This issue
   could be mitigated with an operational injunction in an RFC to
   refrain from construction of such delegations.  In our opinion the
   practice is widespread enough to merit clarifications to the DNS
   protocol specification to permit it on a limited basis.


   Name servers offering recursion SHOULD be able to handle at least
   three levels of indirection resulting from out-of-zone glue.
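

   The following Python sketch models the delegations from the example
   in Section 2.3 and counts the levels of indirection a resolver must
   follow.  It is a toy model under stated assumptions: the tables stand
   in for real referrals, the addresses are made up from the 192.0.2.0/24
   documentation range, only the first server of each NS RRset is tried,
   and all names (DELEGATIONS, ADDRESSES, lookup, and so on) are
   hypothetical.

```python
MAX_LEVELS = 3  # the recommendation asks for at least three levels

# NS RRsets from the example delegations.
DELEGATIONS = {
    "foo.example.": ["ns1.example.com.", "ns2.example.com."],
    "example.com.": ["ns1.test.example.net.", "ns2.test.example.net."],
    "test.example.net.": ["ns1.test.example.net.", "ns2.test.example.net."],
}

# Made-up address records that the relevant servers would return.
ADDRESSES = {
    "ns1.test.example.net.": "192.0.2.1",
    "ns1.example.com.": "192.0.2.11",
    "www.foo.example.": "192.0.2.80",
}


def in_zone(name, zone):
    return name == zone or name.endswith("." + zone)


def enclosing_zone(name):
    """Return the longest delegated zone containing name."""
    labels = name.split(".")
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in DELEGATIONS:
            return candidate
    raise LookupError("no delegation covers " + name)


def lookup(name, levels_left=MAX_LEVELS):
    """Return (address, levels); levels counts the indirections followed."""
    zone = enclosing_zone(name)
    server = DELEGATIONS[zone][0]  # a real resolver would try every server
    levels = 0
    if not in_zone(server, zone):
        # Out-of-zone server name: the referral carries no usable glue,
        # so the server's own address must be resolved first.
        if levels_left == 0:
            raise RuntimeError("indirection limit reached resolving " + name)
        _, deeper = lookup(server, levels_left - 1)
        levels = deeper + 1
    return ADDRESSES[name], levels
```

   In this model, resolving "www.foo.example." requires two levels, so a
   resolver limited to a single level (levels_left=1) gives up, while one
   that permits three levels succeeds with room to spare.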


2.4  Aggressive retransmission when fetching glue


   When an authoritative name server responds with a referral, it
   includes NS records in the authority section of the response.
   According to the algorithm in section 4.3.2 of RFC 1034 [2], the name
   server should also "put whatever addresses are available into the
   additional section, using glue RRs if the addresses are not available
   from authoritative data or the cache."  Some name server
   implementations take this address inclusion a step further with a
   feature called "glue fetching".  A name server that implements glue
   fetching attempts to include address records for every NS record in
   the authority section.  If necessary, the name server issues multiple
   queries of its own to obtain any missing address records.


   Problems with glue fetching can arise in the context of
   "authoritative-only" name servers, which only serve authoritative
   data and ignore requests for recursion.  Such a server is normally
   not expected to generate any queries of its own.  Instead it answers
   non-recursive queries from resolvers looking for information in zones
   it serves.
   With glue fetching enabled, however, an authoritative server will
   generate queries whenever it needs to look up an unknown address
   record to complete the additional section of a response.


   We have observed situations where a glue-fetching name server can
   send queries that reach other name servers, but apparently is
   prevented from receiving the responses.  For example, perhaps the
   name server is authoritative-only and therefore its administrators
   expect it to receive only queries.  Perhaps unaware of glue fetching
   and presuming that the name server will generate no queries, its
   administrators place the name server behind a network device that
   prevents it from receiving responses.  If this is the case, all
   glue-fetching queries will go unanswered.


   We have observed name server implementations that retry excessively
   when glue-fetching queries are unanswered.  A single com/net name
   server has received hundreds of queries per second from a single name
   server.  Judging from the specific queries received and based on
   additional analysis, we believe these queries result from overly
   aggressive glue fetching.


2.4.1  Recommendation


   Implementers whose name servers support glue fetching SHOULD take
   care to avoid sending queries at excessive rates.  Implementations
   SHOULD support throttling logic to detect when queries are sent but
   no responses are received.
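

   One possible shape for such throttling logic is sketched below in
   Python.  This memo does not prescribe an algorithm; the exponential
   backoff, the delay values and the QueryThrottle class are all
   assumptions made for illustration.  The sketch tracks, per
   destination, how many queries have been sent without any response and
   doubles the wait before the next query each time.

```python
class QueryThrottle:
    """Backs off per destination while queries go unanswered."""

    def __init__(self, base_delay=1.0, max_delay=300.0):
        self.base_delay = base_delay  # seconds; illustrative value
        self.max_delay = max_delay    # cap on the backoff; illustrative
        self._unanswered = {}  # destination -> consecutive unanswered queries
        self._next_ok = {}     # destination -> earliest time of next query

    def may_send(self, dest, now):
        return now >= self._next_ok.get(dest, 0.0)

    def sent(self, dest, now):
        # Treat the query as unanswered until a response arrives, and
        # double the wait for each further unanswered query.
        count = self._unanswered.get(dest, 0) + 1
        self._unanswered[dest] = count
        delay = min(self.base_delay * 2 ** (count - 1), self.max_delay)
        self._next_ok[dest] = now + delay

    def response_received(self, dest):
        # Any response shows the path works; clear the backoff state.
        self._unanswered.pop(dest, None)
        self._next_ok.pop(dest, None)
```

   A name server whose responses are being filtered would, under this
   sketch, settle at one query per max_delay seconds to each destination
   rather than hundreds per second.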


2.5  Aggressive retransmission behind firewalls


   A common occurrence, and one of the largest sources of repeated
   queries at the com/net and root name servers, appears to be
   resolvers behind misconfigured firewalls.  In this situation, a
   recursive name server is apparently allowed to send queries through a
   firewall to other name servers, but not receive the responses.  The
   result is more queries than necessary because of retransmission, all
   of which are useless because the responses are never received.  Just
   as with the glue-fetching scenario described in Section 2.4, the
   queries are sometimes sent at excessive rates.  To make matters
   worse, sometimes the responses, sent in reply to legitimate queries,
   trigger an alarm on the originator's intrusion detection system.  We
   are frequently contacted by administrators responding to such alarms
   who believe our name servers are attacking their systems.


   Not only do some resolvers in this situation retransmit queries at an
   excessive rate, but they continue to do so for days or even weeks.
   This scenario could result from an organization with multiple
   iterative resolvers, only a subset of whose traffic is improperly
   filtered in this manner.  Stub resolvers in the organization could be
   configured to query multiple name servers.  Consider the case where a
   stub resolver queries a filtered name server first.  This name server
   sends one or more queries whose replies are filtered, so it can't
   respond to the stub resolver, which times out.  The stub resolver
   then retransmits to a name server that is able to provide an answer.
   Since resolution ultimately succeeds, the underlying problem might
   not be recognized or corrected.  A popular stub resolver has a very
   aggressive retransmission schedule, including simultaneous queries to
   multiple name servers, which could explain how such a situation could
   persist without being detected.


2.5.1  Recommendation


   The most obvious recommendation is that administrators SHOULD take
   care not to place iterative resolvers behind a firewall that allows
   queries to pass through but not the resulting replies.


   Name servers SHOULD take care to avoid sending queries at excessive
   rates.  Implementations SHOULD support throttling logic to detect
   when queries are sent but no responses are received.


2.6  Misconfigured NS records


   Sometimes a zone administrator forgets to add the trailing dot on the
   domain names in the RDATA of a zone's NS records.  Consider this
   fragment of the zone file for "example.com":


     $ORIGIN example.com.
     example.com.      3600   IN   NS   ns1.example.com  ; Note missing
     example.com.      3600   IN   NS   ns2.example.com  ; trailing dots


   The zone's authoritative servers will parse the NS RDATA as
   "ns1.example.com.example.com" and "ns2.example.com.example.com" and
   return NS records with this incorrect RDATA in responses, typically
   in the authority section of every response containing records from
   the "example.com" zone.


   Now consider a typical sequence of queries.  An iterative resolver
   attempting to resolve address records for "www.example.com" with no
   cached information for this zone will query a "com" authoritative
   server.  The "com" server responds with a referral to the
   "example.com" zone, consisting of NS records with valid RDATA and
   associated glue records.  (This example assumes that the
   "example.com" zone information is correct in the "com" zone.)  The
   iterative resolver caches the NS RRset from the "com" server and
   follows the referral by querying one of the "example.com"
   authoritative servers.  This server responds with the
   "www.example.com" address record in the answer section and,
   typically, the "example.com" NS records in the authority section and,
   if space in the message remains, glue address records in the
   additional section.  According to Section 5.4 of RFC 2181 [4], NS
   records in the authority section of an authoritative answer are more
   trustworthy than NS records from the authority section of a
   non-authoritative answer.  Thus the "example.com" NS RRset just
   received from the "example.com" authoritative server displaces the
   "example.com" NS RRset received moments ago from the "com"
   authoritative server.


   But the "example.com" zone contains the erroneous NS RRset as shown
   in the example above.  Subsequent queries for names in "example.com"
   will cause the server to attempt to use the incorrect NS records and
   so the server will try to resolve the nonexistent names
   "ns1.example.com.example.com" and "ns2.example.com.example.com".  In
   this example, since all of the zone's name servers are named in the
   zone itself (i.e., "ns1.example.com.example.com" and
   "ns2.example.com.example.com" both end in "example.com") and all are
   bogus, the recursive server cannot reach any "example.com" name
   servers.  Therefore attempts to resolve these names result in address
   record queries to the "com" authoritative servers.  Queries for such
   obviously bogus glue address records occur frequently at the com/net
   name servers.


2.6.1  Recommendation


   An authoritative server can detect this situation.  A trailing dot
   missing from an NS record's RDATA always results by definition in a
   name server name that exists somewhere under the SOA of the zone the
   NS record appears in.  (Note that further levels of delegation are
   possible, so a missing trailing dot could inadvertently create a name
   server name that actually exists in a subzone.)  But in any case, the
   address record must be present in this zone, either as authoritative
   data or glue.


   An authoritative name server SHOULD report an error when one of a
   zone's NS records references a name server below the zone's SOA when
   a corresponding address record does not exist in the zone.
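

   The check described above might look like the following Python
   sketch.  It is a simplified illustration: the function names are
   hypothetical, the origin is assumed to be written with its trailing
   dot, and the zone is reduced to a list of NS RDATA strings plus the
   owner names that have address records.

```python
def absolute(name, origin):
    """Expand a zone-file name; names lacking a trailing dot are
    relative to the origin."""
    if name.endswith("."):
        return name.lower()
    return (name + "." + origin).lower()


def is_below(name, apex):
    return name == apex or name.endswith("." + apex)


def check_ns_targets(origin, ns_rdata, address_owners):
    """Return one error string per NS target that lies at or below the
    zone apex but has no address record in the zone."""
    apex = origin.lower()
    have = {absolute(owner, origin) for owner in address_owners}
    errors = []
    for rdata in ns_rdata:
        target = absolute(rdata, origin)
        if is_below(target, apex) and target not in have:
            errors.append(
                "NS target %s has no address record in zone %s"
                % (target, origin)
            )
    return errors
```

   Applied to the zone fragment in Section 2.6, the sketch expands
   "ns1.example.com" to "ns1.example.com.example.com." and reports it,
   since no address record exists at that name.  Name server names
   outside the zone are not checked, because their address records are
   not expected to be present.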


2.7  Name server records with zero TTL


   Sometimes a popular com/net subdomain's zone is configured with a TTL
   of zero on the zone's NS records, which prohibits these records from
   being cached and will result in a higher query volume to the zone's
   authoritative servers.  The zone's administrator should understand
   the consequences of such a configuration and provision resources
   accordingly.  A zero TTL on the zone's NS RRset, however, carries
   additional consequences beyond the zone itself: if a recursive name
   server cannot cache a zone's NS records because of a zero TTL, it
   will be forced to query that zone's parent's name servers each time
   it resolves a name in the zone.  The com/net authoritative servers do
   see an increased query load when a popular com/net subdomain's zone
   is configured with a TTL of zero on the zone's NS records.


   A zero TTL on an RRset expected to change frequently is extreme but
   permissible.  A zone's NS RRset is a special case, however, because
   changes to it must be coordinated with the zone's parent.  In most
   zone parent/child relationships we are aware of, there is typically
   some delay involved in effecting changes.  Further, changes to the
   set of a zone's authoritative name servers (and therefore to the
   zone's NS RRset) are typically relatively rare: providing reliable
   authoritative service requires a reasonably stable set of servers.
   Therefore an extremely low or zero TTL on a zone's NS RRset rarely
   makes sense, except in anticipation of an upcoming change.  In this
   case, when the zone's administrator has planned a change and does not
   want recursive name servers throughout the Internet to cache the NS
   RRset for a long period of time, a low TTL is reasonable.


2.7.1  Recommendation


   Because of the additional load that a zero TTL on a zone's NS RRset
   imposes on the authoritative servers of the zone's parent,
   under such circumstances authoritative name servers SHOULD issue a
   warning when loading a zone or refuse to load the zone altogether.
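

   A load-time check along these lines could be as small as the Python
   sketch below.  The record layout (owner, TTL, type, RDATA) and the
   function names are assumptions made for illustration; a real name
   server would apply the check to its own parsed zone representation.

```python
import sys


def zero_ttl_ns_records(zone_name, records):
    """records is an iterable of (owner, ttl, rrtype, rdata) tuples.
    Return the apex NS records whose TTL is zero."""
    apex = zone_name.lower()
    return [
        r for r in records
        if r[0].lower() == apex and r[2].upper() == "NS" and r[1] == 0
    ]


def check_before_load(zone_name, records, refuse=False):
    """Warn about, or refuse, a zone whose NS RRset has a zero TTL.
    Returns True when the zone is clean."""
    offenders = zero_ttl_ns_records(zone_name, records)
    if offenders:
        message = "zone %s: %d NS record(s) at the apex have a TTL of zero" % (
            zone_name, len(offenders))
        if refuse:
            raise ValueError(message)
        print("warning: " + message, file=sys.stderr)
    return not offenders
```

   The refuse flag corresponds to the stricter of the two behaviors in
   the recommendation; whether it should be the default is left to the
   implementer.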


2.8  Unnecessary dynamic update messages


   The UPDATE message specified in RFC 2136 [6] allows an authorized
   agent to update a zone's data on an authoritative name server using a
   DNS message sent over the network.  Consider the case of an agent
   desiring to add a particular resource record.  Because of zone cuts,
   the agent does not necessarily know the proper zone to which the
   record should be added.  The dynamic update process requires that the
   agent determine the appropriate zone so the UPDATE message can be
   sent to one of the zone's authoritative servers (typically the
   primary master as specified in the zone's SOA MNAME field).


   The appropriate zone to update is the closest enclosing zone, which
   cannot be determined only by inspecting the domain name of the record
   to be updated, since zone cuts can occur anywhere.  One way to
   determine the closest enclosing zone entails walking up the name
   space tree by sending repeated UPDATE messages until success.  For
   example, consider an agent attempting to add an address record with
   the name "foo.bar.example.com".  The agent could first attempt to
   update the "foo.bar.example.com" zone.  If the attempt failed, the
   update could be directed to the "bar.example.com" zone, then the
   "example.com" zone, then the "com" zone, and finally the root zone.


   A popular dynamic update agent follows this algorithm.  The result is
   many UPDATE messages received by the root name servers, the com/net
   authoritative servers, and presumably other TLD authoritative
   servers.  A valid question is why the algorithm proceeds to send
   updates all the way to TLD and root name servers.  This behavior is
   not entirely unreasonable: in enterprise DNS architectures with an
   "internal root" design, there could conceivably be private,
   non-public TLD or root zones that would be the appropriate targets
   for a dynamic update.


   A significant deficiency with this algorithm is that knowledge of a
   given UPDATE message's failure is not helpful in directing future
   UPDATE messages to the appropriate servers.  A better algorithm would
   be to find the closest enclosing zone by walking up the name space
   with queries for SOA or NS rather than "probing" with UPDATE
   messages.  Once the appropriate zone is found, an UPDATE message can
   be sent.  In addition, the results of these queries can be cached to
   aid in determining closest enclosing zones for future updates.  Once
   the closest enclosing zone is determined with this method, the update
   will either succeed or fail and there is no need to send further
   updates to higher-level zones.  The important point is that walking
   up the tree with queries yields cacheable information, whereas
   walking up the tree by sending UPDATE messages does not.


2.8.1  Recommendation


   Dynamic update agents SHOULD send SOA or NS queries to progressively
   higher-level zones to find the closest enclosing zone for a given
   name to update.  Only after the appropriate zone is found should the
   client send an UPDATE message to one of the zone's authoritative
   servers.  Update clients SHOULD NOT "probe" using UPDATE messages by
   walking up the tree to progressively higher-level zones.
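

   The query-based walk recommended above can be sketched in Python as
   follows.  The soa_exists callable is a hypothetical stand-in for
   sending an SOA query and checking whether the name is a zone apex; no
   real DNS library is used.  The cache here never expires entries,
   which is a simplification: a real agent would honor the TTLs of the
   responses it caches.

```python
def candidate_zones(name):
    """Yield name and then each ancestor, ending with the root:
    "a.b." yields "a.b.", "b.", "."."""
    labels = name.rstrip(".").split(".") if name != "." else []
    for i in range(len(labels)):
        yield ".".join(labels[i:]) + "."
    yield "."


class ZoneFinder:
    def __init__(self, soa_exists):
        # soa_exists(name) -> bool is hypothetical; it represents an SOA
        # query answered positively only at a zone apex.
        self._soa_exists = soa_exists
        self._cache = {}  # candidate name -> bool, reused by later lookups

    def closest_enclosing_zone(self, name):
        for candidate in candidate_zones(name.lower()):
            if candidate not in self._cache:
                self._cache[candidate] = self._soa_exists(candidate)
            if self._cache[candidate]:
                return candidate
        return None
```

   After the first lookup for "foo.bar.example.com.", a later update for
   another name under "bar.example.com." needs only one new query,
   because the answers for the shared ancestors are already cached.  The
   UPDATE message itself is then sent once, to the zone that was found.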


2.9  Queries for domain names resembling IP addresses


   The root name servers receive a significant number of A record
   queries where the qname is an IP address.  The source of these
   queries is unknown.  It could be attributed to situations where a
   user believes an application will accept either a domain name or an
   IP address in a given configuration option.  The user enters an IP
   address, but the application assumes any input is a domain name and
   attempts to resolve it, resulting in an A record lookup.  There could
   also be applications that produce such queries in a misguided attempt
   to reverse map IP addresses.


   These queries result in Name Error (RCODE=3) responses.  A recursive
   name server can negatively cache such responses, but each response
   requires a separate cache entry, i.e., a negative cache entry for the
   domain name "192.0.2.1" does not prevent a subsequent query for the
   domain name "192.0.2.2".


2.9.1  Recommendation


   It would be desirable for the root name servers not to have to answer
   these queries: they unnecessarily consume CPU resources and network
   bandwidth.  One possibility is for iterative resolver implementations
   to produce the Name Error response directly.  We suggest that
   implementors consider the option of synthesizing Name Error responses
   at the iterative resolver.  The server could claim authority for
   synthesized TLD zones corresponding to the first octet of every
   possible IP address, e.g., 1., 2., through 255.  This behavior could
   be configurable in the (probably unlikely) event that numeric TLDs
   are ever put into use.
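

   The synthesis described above might look roughly like the following
   Python sketch.  The function name and the "enabled" switch are
   hypothetical, and the accepted range of 1 through 255 simply follows
   the text above; a real iterative resolver would build a complete
   Name Error response rather than return a bare RCODE.

```python
from typing import Optional

NAME_ERROR = 3  # RCODE for Name Error


def synthesized_rcode(qname: str, enabled: bool = True) -> Optional[int]:
    """Return NAME_ERROR if `qname` falls under a synthesized numeric TLD.

    Returns None when the query should be resolved normally.  `enabled`
    models the configuration switch for the (probably unlikely) case
    that numeric TLDs are ever put into use.
    """
    if not enabled:
        return None
    # The TLD is the rightmost label of the query name.
    tld = qname.rstrip(".").split(".")[-1]
    if not (tld.isascii() and tld.isdigit()):
        return None
    # Reject forms such as "01", which are not the labels "1." to "255.".
    if str(int(tld)) != tld:
        return None
    if 1 <= int(tld) <= 255:
        return NAME_ERROR
    return None
```

   Because the check looks only at the rightmost label, a single rule
   covers both "192.0.2.1" and "192.0.2.2", which a per-name negative
   cache entry cannot do.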


   Another option is to delegate these numeric TLDs from the root zone
   to a separate set of servers to absorb the traffic.  The "black hole
   servers" used by the AS112 project <http://www.as112.net>, which are
   currently delegated the in-addr.arpa zones corresponding to RFC 1918
   [7] private use address space, would be a possible choice to receive
   these delegations.


2.10  Misdirected recursive queries


   The root name servers receive a significant number of recursive
   queries (i.e., queries with the RD bit set in the header).  Since
   none of the root servers offers recursion, a root server's response
   to such a query ignores the request for recursion and probably does
   not contain the data the querier anticipated.  Some of
   these queries result from users configuring stub resolvers to query a
   root server.  (This situation is not hypothetical: we have received
   complaints from users when this configuration does not work as
   hoped.) Of course, users should not direct stub resolvers to use name
   servers that do not offer recursion, but we are not aware of any stub
   resolver implementation that offers any feedback to the user when so
   configured, aside from simply "not working".


2.10.1  Recommendation


   When the IP address of a (supposedly) iterative resolver is
   configured in a stub resolver using an interactive user interface,
   the stub resolver could send a test query to verify that the server
   supports recursion (i.e., the response has the RA bit set in the
   header).  The user could be immediately notified if the server is
   non-recursive.


   The stub resolver could also report an error, either through a user
   interface or in a log file, if the queried server does not support
   recursion.  Error reporting SHOULD be throttled to avoid a
   notification or log message for every response from a non-recursive
   server.
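

   A minimal Python sketch of both suggestions follows.  It checks the
   RA bit in the header of a raw response and limits how often a given
   server is reported.  The names offers_recursion and
   ThrottledReporter, the one-hour default interval and the use of the
   logging module are assumptions chosen for illustration.

```python
import logging
import struct
import time
from typing import Callable, Dict

RA_FLAG = 0x0080  # "recursion available" bit in the DNS header flags word


def offers_recursion(response: bytes) -> bool:
    """Return True if a raw DNS response has the RA bit set."""
    if len(response) < 12:
        raise ValueError("DNS message shorter than the 12-octet header")
    # Octets 2 and 3 of the header hold the flags, in network byte order.
    (flags,) = struct.unpack_from("!H", response, 2)
    return bool(flags & RA_FLAG)


class ThrottledReporter:
    """Report a non-recursive server at most once per `min_interval` seconds."""

    def __init__(self, min_interval: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._last_report: Dict[str, float] = {}

    def report(self, server: str) -> bool:
        """Log a warning for `server`; return False if it was suppressed."""
        now = self._clock()
        last = self._last_report.get(server)
        if last is not None and now - last < self._min_interval:
            return False
        self._last_report[server] = now
        logging.warning("name server %s does not offer recursion", server)
        return True
```

   A stub resolver could call offers_recursion on the response to its
   test query at configuration time, and call report whenever a later
   response arrives without the RA bit set.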


2.11  Suboptimal name server selection algorithm


   An entire document could be devoted to the topic of problems with
   different implementations of the recursive resolution algorithm.  The
   entire process of recursion is woefully underspecified, requiring
   each implementor to design an algorithm.  Sometimes implementors make
   poor design choices that could be avoided if a suggested algorithm
   and best practices were documented, but that is a topic for another
   document.


   Some deficiencies cause significant operational impact and are
   therefore worth mentioning here.  One of these is name server
   selection by an iterative resolver.  When an iterative resolver wants
   to contact one of a zone's authoritative name servers, how does it
   choose from the NS records listed in the zone's NS RRset?  If the
   selection mechanism is suboptimal, queries are not spread evenly
   among a zone's authoritative servers.  The details of the selection
   mechanism are up to the implementor, but we offer some suggestions.


2.11.1  Recommendation


   This list is not exhaustive, but it reflects the changes that would
   do the most to reduce disproportionate query load among a zone's
   authoritative servers, i.e., the changes that would help spread the
   query load evenly.
   o  Do not make assumptions based on NS RRset order: all NS RRs SHOULD
      be treated equally.  (In the case of the "com" zone, for example,
      most of the root servers return the NS record for
      "a.gtld-servers.net" first in the authority section of referrals.
      As a result, this server receives disproportionately more traffic
      than the other 12 authoritative servers for "com".)
   o  Use all NS records in an RRset.  (For example, we are aware of
      implementations that hard-coded information for a subset of the
      root servers.)
   o  Maintain state and favor the best-performing of a zone's
      authoritative servers.  A good definition of performance is
      response time.  Non-responsive servers can be penalized with an
      extremely high response time.
   o  Do not lock onto the best-performing of a zone's name servers.  An
      iterative resolver SHOULD periodically check the performance of
      all of a zone's name servers to adjust its determination of the
      best-performing one.
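

   The suggestions above can be combined in many ways.  The following
   Python sketch shows one possibility and is not a description of any
   existing implementation.  It keeps a smoothed response time for each
   server, tries every server before favoring any of them, penalizes
   timeouts, and occasionally re-checks servers other than the current
   best.  The class name and the alpha, penalty and explore parameters
   are assumptions made for this example.

```python
import random
from typing import Dict, Iterable, Optional


class ServerSelector:
    """Choose among a zone's authoritative servers by smoothed response time."""

    def __init__(self, servers: Iterable[str], alpha: float = 0.3,
                 penalty: float = 10.0, explore: float = 0.05,
                 rng: Optional[random.Random] = None) -> None:
        # Every NS record starts out equal, whatever its order in the RRset.
        self.srtt: Dict[str, Optional[float]] = {s: None for s in servers}
        if not self.srtt:
            raise ValueError("at least one server is required")
        self.alpha = alpha      # weight given to the newest measurement
        self.penalty = penalty  # response time charged for a timeout (seconds)
        self.explore = explore  # probability of re-checking a random server
        self.rng = rng if rng is not None else random.Random()

    def choose(self) -> str:
        # Use all NS records: measure each server once before favoring any.
        untried = [s for s, t in self.srtt.items() if t is None]
        if untried:
            return self.rng.choice(untried)
        # Do not lock onto the best server: occasionally sample any server.
        if self.rng.random() < self.explore:
            return self.rng.choice(list(self.srtt))
        # Otherwise favor the server with the lowest smoothed response time.
        return min(self.srtt, key=lambda s: self.srtt[s])

    def record_response(self, server: str, rtt: float) -> None:
        old = self.srtt[server]
        if old is None:
            self.srtt[server] = rtt
        else:
            self.srtt[server] = (1 - self.alpha) * old + self.alpha * rtt

    def record_timeout(self, server: str) -> None:
        # A non-responsive server is treated as an extremely slow one.
        self.record_response(server, self.penalty)
```

   Because a timeout is folded into the same smoothed value as any
   other measurement, a server that recovers is favored again once
   later samples bring its response time back down.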


3.  IANA considerations


   There are no new IANA considerations introduced by this memo.


4.  Security considerations


   Name server and resolver misbehaviors identical or similar to those
   discussed in this document expose the root and TLD name servers to
   increased risk of both intentional and unintentional denial of
   service.


   We believe that implementation of the recommendations offered in this
   document will reduce the amount of unnecessary traffic seen at root
   and TLD name servers, thus reducing the opportunity for an attacker
   to use such queries to his or her advantage.


5.  Internationalization considerations


   We do not believe this document introduces any new
   internationalization considerations to the DNS protocol
   specification.


6.  Normative References


   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.


   [2]  Mockapetris, P., "Domain names - concepts and facilities", STD
        13, RFC 1034, November 1987.


   [3]  Mockapetris, P., "Domain names - implementation and
        specification", STD 13, RFC 1035, November 1987.


   [4]  Elz, R. and R. Bush, "Clarifications to the DNS Specification",
        RFC 2181, July 1997.


   [5]  Andrews, M., "Negative Caching of DNS Queries (DNS NCACHE)", RFC
        2308, March 1998.


   [6]  Vixie, P., Thomson, S., Rekhter, Y. and J. Bound, "Dynamic
        Updates in the Domain Name System (DNS UPDATE)", RFC 2136, April
        1997.


   [7]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G. and E.
        Lear, "Address Allocation for Private Internets", BCP 5, RFC
        1918, February 1996.



Authors' Addresses


   Matt Larson
   VeriSign, Inc.
   21345 Ridgetop Circle
   Dulles, VA  20166-6503
   USA


   EMail: mlarson@verisign.com


   Piet Barber
   VeriSign, Inc.
   21345 Ridgetop Circle
   Dulles, VA  20166-6503
   USA


   EMail: pbarber@verisign.com


Intellectual Property Statement


   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights.  Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11.  Copies of
   claims of rights made available for publication and any assurances of
   licenses to be made available, or the result of an attempt made to
   obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification can
   be obtained from the IETF Secretariat.


   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard.  Please address the information to the IETF Executive
   Director.



Full Copyright Statement


   Copyright (C) The Internet Society (2004).  All Rights Reserved.


   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.


   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assignees.


   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.



Acknowledgment


   Funding for the RFC Editor function is currently provided by the
   Internet Society.