Observed DNS Resolution Misbehavior
RFC 4697

Note: This ballot was opened for revision 06 and is now closed.

Lars Eggert Yes

(Cullen Jennings; former steering group member) Yes

Yes (2006-08-03)
In section 2.4.1, I'm not sure what I as an implementer would do. Could you give some non-normative advice on an example strategy that an implementor might use to achieve the goals in this section? Things like suggested throttle rates would be very useful. I know it is hard to find values that everyone agrees to, but it is even harder for implementers. (I know, I work on some DNS code and I'm clueless what to do here :-)

I really think this is a comment not a discuss but I really encourage you to try and think of some advice to put here. Thanks, Cullen

(David Kessens; former steering group member) Yes

Yes (2006-08-02)
Comments received by Frank Kastenholz from the Ops directorate:

 This document is a bunch of MUST/SHOULD/... for "implementors of
 iterative resolvers". I am sure that the protocol aspects are good
 but it seems to me that this would do nothing to help operators
 now. I mean, an operator can not go and fix a broken resolver...
 Are there things that operators can do (filters? rate-limiters?
 <?>?) Should this note go into that or should there be a separate note?

 Probably not worth holding up the document for this.

(Mark Townsley; former steering group member) (was No Objection) Yes

Yes ()

(Sam Hartman; former steering group member) Yes

Yes ()

(Brian Carpenter; former steering group member) No Objection

No Objection (2006-08-03)
Gen-ART review by Sharon Chisholm with comments by BC and author...

>>> >1. The document doesn't actually say anywhere that it is informational.
>>> >Looking at recently published informational RFCs (4586 for example), it
>>> >would seem the document needs a 'Category: Informational' in the header
>
>> 
>> Actually it's intended to be a BCP, but the Editor will insert the header.
>
>>> >
>>> >2. In section 1, first paragraph, it refers to 'the thirteen com/net TLD
>>> >name servers' but isn't that just the number at the time of publication?
>>> >Shouldn't we clarify that?


This number is pretty stable: it's been 13 since 1997 and there are no
plans to change it (a statement I can make because I work for the
com/net operator.)  Also, 13 is currently a magic number in this
context, it being the maximum number of name servers possible for a
zone to ensure that the DNS message packet does not overflow the
increasingly historic limit of 512 bytes when transported in UDP.  For
all these reasons, I don't think further clarification would be
necessary.
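
The 512-byte argument can be checked with some rough arithmetic. The sketch below is not from the document; it assumes the RFC 1035 wire format, full name compression, single-letter host labels such as a.gtld-servers.net, and exactly one A glue record per server. The function names are hypothetical. It only shows that 13 servers fit under these assumptions; it does not show that 13 is a strict maximum, since the remaining headroom depends on name lengths and on any additional records in the response.

```python
# Rough size estimate for a referral response listing n name servers.
# Assumptions (not from the document): RFC 1035 wire format, name
# compression everywhere it applies, one-letter host labels, and one
# A (IPv4) glue record per server.

HEADER = 12    # fixed DNS header
RR_FIXED = 10  # TYPE(2) + CLASS(2) + TTL(4) + RDLENGTH(2)
POINTER = 2    # compression pointer


def wire_len(name):
    """Uncompressed wire length of a domain name."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1  # +1 for root label


def referral_size(zone, server_suffix, n_servers):
    """Estimated size in bytes of a referral for `zone`."""
    question = wire_len(zone) + 4  # QNAME + QTYPE + QCLASS
    # First NS record carries the full server name in its RDATA.
    first_ns = POINTER + RR_FIXED + wire_len("a." + server_suffix)
    # Later NS records: one-letter label (2 bytes) plus a pointer to the suffix.
    other_ns = (n_servers - 1) * (POINTER + RR_FIXED + 2 + POINTER)
    # Glue: owner name is a pointer, RDATA is a 4-byte IPv4 address.
    glue = n_servers * (POINTER + RR_FIXED + 4)
    return HEADER + question + first_ns + other_ns + glue


size = referral_size("com.", "gtld-servers.net.", 13)
print(size, size <= 512)  # 453 True under the assumptions above
```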


>>> >3. In section 2.1, on page 5, second paragraph, I think I got a little
>>> >turned around. It is claiming that under the circumstances described
>>> >there is no value in re-querying the parent, since it gave you the bad
>>> >information in the first place, but what about a potentially valid peer
>>> >name server? Couldn't you get that from the parent or is it assumed that
>>> >is already in the cache?  I.e., if ns1 is bad, is ns2 not an option? Are
>>> >these meant to have different zones?
>>> >     example.com.   IN   NS   ns1.example.com.
>>> >     example.com.   IN   NS   ns2.example.com.
>
>> 
>> I understood it to mean that {ns1,ns2} is the complete set and they
>> are both broken, so if you query the parent again you can only
>> get {ns1,ns2} again and they are still broken...


That's correct.  If the parent gives you a list, and you can't contact
anyone in the list, there's no point in asking the parent again any
time soon, since you'll only get the same list again.


>>> >4. Section 2.2 implies that an implementation can detect a lame
>>> >server and differentiate this case from others. It would be helpful to
>>> >remind implementers how to detect this.


I could add a sentence describing lame server detection.
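
As an illustration of what such detection might look like, here is a minimal sketch of one common heuristic. It is not text from the document: the `Response` class is a hypothetical, simplified view of a DNS response, and the rule used (an error RCODE, or a non-authoritative response that is not a referral further down the tree) is an assumption about what "lame" means in practice.

```python
from dataclasses import dataclass
from typing import Optional

NOERROR, SERVFAIL, REFUSED = 0, 2, 5  # RCODE values from RFC 1035


@dataclass
class Response:
    """Hypothetical, simplified view of a DNS response."""
    rcode: int
    authoritative: bool                  # AA bit
    referral_zone: Optional[str] = None  # owner of NS records in the
                                         # authority section, if any


def _norm(name):
    return name.lower().rstrip(".")


def is_proper_subdomain(name, zone):
    """True if `name` lies strictly below `zone`."""
    name, zone = _norm(name), _norm(zone)
    if name == zone:
        return False
    return zone == "" or name.endswith("." + zone)


def looks_lame(resp, zone):
    """Heuristic: does this response suggest the server is lame for `zone`?"""
    if resp.rcode in (SERVFAIL, REFUSED):
        return True
    if resp.authoritative:
        return False
    # Non-authoritative responses are acceptable only as downward referrals.
    if resp.referral_zone is not None:
        return not is_proper_subdomain(resp.referral_zone, zone)
    return True


# An upward referral to the root from a server listed for example.com.
print(looks_lame(Response(NOERROR, False, referral_zone="."), "example.com."))
```

A real resolver would also need to decide how long to remember that a server is lame for a given zone; that is left out here.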


>>> >5. In section 2.4.1, should implementations also consider detecting that
>>> >they send queries to a particular server and never get responses and
>>> >adapt as a result?


I'm sorry, but I don't understand how your suggestion differs from
what's already written in 2.4.1.

Sharon> I did not see this particular recommendation in section 2.4.1. I just
double checked and it doesn't appear to be there.
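
To make the suggestion concrete, the following sketch shows one way an implementation might track a server that never responds and adapt by backing off. It is an illustration only, not a recommendation from the document: the class name, the 1-second base delay, and the 300-second cap are all hypothetical values chosen for the example.

```python
class ServerState:
    """Per-server exponential backoff after query timeouts.

    base_delay and max_delay are illustrative values, not taken from
    the document or from any DNS RFC.
    """

    def __init__(self, base_delay=1.0, max_delay=300.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.failures = 0
        self.next_allowed = 0.0

    def may_query(self, now):
        """True if a query may be sent to this server at time `now`."""
        return now >= self.next_allowed

    def record_timeout(self, now):
        """Double the hold-down period after each consecutive timeout."""
        self.failures += 1
        delay = min(self.base_delay * 2 ** (self.failures - 1), self.max_delay)
        self.next_allowed = now + delay

    def record_response(self):
        """Any response clears the backoff state."""
        self.failures = 0
        self.next_allowed = 0.0


state = ServerState()
state.record_timeout(now=0.0)      # first timeout: hold down for 1 second
print(state.may_query(now=0.5))    # still inside the hold-down period
print(state.may_query(now=1.0))    # hold-down has expired
```

Passing the current time in explicitly keeps the logic easy to test; a real resolver would use a monotonic clock.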

>>> >6. In section 2.5.1, the advice is to not be excessive with queries, but
>>> >is this a well understood query rate? 1 per minute or 1 per nanosecond?
>
>> 
>> The same question applies to 2.4.1


I'm not aware of a single rate specification in any DNS RFC.  That's
probably a bad thing, since it leaves too much to the implementor and
makes every implementor figure out the same things (and potentially
make the same mistakes as others).  But that being said, I'm not sure
this document is the place to start.  My intent here is not to set a
hard limit in stone, but to cause the implementor to think carefully
about their implementation's behavior.  I'd also be worried about
picking a hard limit that doesn't work in corner cases that I can't
anticipate.

My preference would be to leave the text as is with general, rather
than specific, guidance.
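
For readers who want to see what a query-rate limit could look like in code, here is a minimal token-bucket sketch. It deliberately takes the rate and burst as parameters, because the document gives no specific values; the class name and the numbers in the usage lines are hypothetical and purely illustrative.

```python
class TokenBucket:
    """Allow at most `rate` queries per second on average, with bursts
    of up to `burst` queries. Both values are left to the implementor."""

    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0

    def allow(self, now):
        """Return True and consume a token if a query may be sent at `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Illustrative numbers only: 1 query per second, bursts of 2.
bucket = TokenBucket(rate=1.0, burst=2)
print([bucket.allow(now=0.0) for _ in range(3)])  # third query is refused
```

One bucket per destination server (or per destination and query name) is a natural way to apply this, but that choice is again an assumption rather than guidance from the document.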


>>> >7. Section 2.10.1 is the best example in the document of how to write up
>>> >the recommendation section. It clearly states where errors will be
>>> >reported (either through a user interface or a log file) where in most
>>> >other sections it was never specified. It might be interpreted as actual
>>> >protocol warnings and errors that could be returned to the client in some
>>> >cases. Sections 2.6.1, 2.7.1, etc should be reworked to be more clear in
>>> >their recommendations.


The exact means of error reporting is implementation dependent.
2.10.1 is different because it specifically references a stub
resolver, which is the portion of the DNS infrastructure closest to
the end user.  In this case, it might make sense and be possible to
report an error through a UI.  In all other cases, it's left to the
implementation.

Sharon> What I would like to see made clear is which of the following is being
recommended in each case (even if it is always the latter):
	- The DNS protocol reporting errors and warnings
	- Applications reporting errors and warnings (this can include
using non-DNS protocols to report the errors, syslog for example).


>>> >8. Section 2.8 starts talking about agents. This appears to be a change
>>> >in terminology. 


It is a change, but it's intentional: 2.8 is the first reference in
the document to a component that sends DNS dynamic update messages.  I
refer to this entity as an agent to distinguish it from DNS resolvers
and servers.


>>> >9. The first paragraph of section 2.11 seems like a good introduction to
>>> >the entire section 2, but seems a bit out of place in its current
>>> >location


Fair enough.  Rather than move it, I think I'd just delete it.  I
don't think it hurts in its current position, however.


>>> >10. Section 2.11.1 mixes problem statement and recommendation in a way
>>> >that is inconsistent with the rest of the document.


I'm not sure that I see mixing problem statement and recommendation,
but I do agree that this recommendation section is different, but
that's because of the multiple recommendations it presents.

Matt

(Dan Romascanu; former steering group member) No Objection

No Objection ()

(Jari Arkko; former steering group member) No Objection

No Objection ()

(Jon Peterson; former steering group member) No Objection

No Objection ()

(Lisa Dusseault; former steering group member) No Objection

No Objection ()

(Ross Callon; former steering group member) No Objection

No Objection ()

(Russ Housley; former steering group member) No Objection

No Objection ()

(Ted Hardie; former steering group member) No Objection

No Objection (2006-07-31)
The document says:

   While this situation may appear contrived, we
   have seen multiple similar occurrences and expect more as new generic
   top-level domains (gTLDs) become active.  We anticipate many zones in
   new gTLDs will use name servers in existing gTLDs, increasing the
   number of delegations using out-of-zone name servers.

I am not sure that there is a reason to limit this to new gTLDs; this certainly
occurs now with some ccTLDs.  It was also at one point standard operating
procedure to use out-of-zone name servers for DNS, at least at the TLD level
(.gtld-servers.net being one reminder of this ".net is for infrastructure" vision
even now).

I would personally strengthen the requirement to handle arbitrary levels of indirection,
and I wonder about recommending that one or more of the servers be within
the zone itself without saying whether best current practice is that some number
be outside the zone.