DNS Reverse IP Automatic Multicast Tunneling (AMT) Discovery
draft-ietf-mboned-driad-amt-discovery-13

Note: This ballot was opened for revision 09 and is now closed.

Warren Kumari Yes

Deborah Brungard No Objection

Alissa Cooper No Objection

Roman Danyliw No Objection

Comment (2019-12-17 for -10)
Section 2.2.  Step #1 of the text describing the data flow of Figure 2 references the address 232.252.0.2, but that address isn’t used in Figure 2.

Benjamin Kaduk No Objection

Comment (2019-12-18 for -11)
Thank you for this well-written document!  My comments are all pretty
minor (and many of them are more of a "side note" than actionable
comments, anyway...).

Section 1

   This document updates Section 5.2.3.4 of [RFC7450] by adding a new
   extension to the relay discovery procedure.

I know that there is not a universal usage of "updates", but note that
in other protocols with similar scenarios (multiple possible discovery
methods), the core protocol document is not always in an "Updated by"
relationship with the new discovery methods.  (That said, there seem to
be plenty of other ways in which this document updates RFC 7450, so this
particular instance isn't a big deal.)

Section 1.2.2

   |     L flag | The "Limit" flag described in Section 5.1.1.4 of     |
   |            | [RFC7450]                                            |

nit: s/5.1.1.4/5.1.4.4/

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119] and [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

nit: this is not quite the prescribed form from RFC 8174.

Section 2.2

Perhaps it's worth a brief note that the UDP unicast tunnel is over IPv6
even though the multicast traffic being conveyed is native IPv4 in both
multicast networks?

Section 2.3.2

      A related motivating example in the sending-side network is
      provided by considering a sender that needs to instruct the
      gateways on how to select between connecting to Figure 6 or
      Figure 7 (from Section 3.2), in order to manage load and failover
      scenarios in a manner that operates well with the sender's
      provisioning strategy for horizontal scaling of AMT relays.

I don't think I understand what "connecting to Figure 6" means.

   Among relay addresses that have an equivalent preference as described
   above, a Happy Eyeballs algorithm for AMT SHOULD use use the
   Destination Address Selection defined in Section 6 of [RFC6724].

   Among relay addresses that still have an equivalent preference after
   the above orderings, a gateway SHOULD make a non-deterministic choice

side note: I think that the way RFC 6724 is written (as a series of
comparison rules), it doesn't have an "are equivalent preference"
option, just a "leave the order unchanged" one.  But that's probably a
useless pedantic distinction and I can't see an actionable change that
would result from it.

Section 2.7

   The DNS query functionality is expected to follow ordinary standards
   and best practices for DNS clients.  A gateway MAY use an existing
   DNS client implementation that does so, and MAY rely on that client's
   retry logic to determine the timeouts between retries.

The first part of this seems to be a duplication of Section 2.4.2, but
the latter part (and following paragraph/sentences) diverges.  There's
probably some room for consolidation and/or harmonization of procedures
here.

Section 4.2.4

(Presumably we ignore entries with unrecognized 'type'; I forget whether
this is standard DNS usage or we should mention it explicitly, though.)

Section 5

Is there any guidance to give to the Designated Experts in addition to
the default rules from RFC 8126?

Section 6

I like that the relay-discovery precedence rules minimize the
opportunity for an attacker to disrupt discovery and try to force a
different relay to be used (whether to afford an opportunity to tamper
with the traffic going to a target recipient or other reasons).  Since
we are updating the generic AMT relay discovery procedure, we could
reasonably mention that (and the generic risks with discovery procedures
that involve attempting to contact a relay and failing over if a timely
response does not appear); RFC 7450's Section 6.2 provides only a
minimal mention.  That said, most of the security considerations
relevant here seem to be ones that apply to stock AMT, and are tolerably
covered in RFC 7450.  I'm a little surprised that Happy Eyeballs doesn't
cover this sort of disruption in its security considerations; I was
going to suggest referencing that as well.

We briefly mention active/active failover in Section 2.3.3, and such a
scheme poses some risk of (additional) traffic duplication around a
failover event, but (1) that can happen with UDP anyway, so it will
already be handled, and (2) it's a pretty tenuous hook to say that we
need to talk about the security considerations of such a situation.

Also on the borderline of worth mentioning, an attacker might attempt to
force a gateway to repeatedly go through the relay discovery process; I
don't think this process is sufficiently resource-intensive that it
would be a usable DoS attack, though, so there's not really much there
other than the generic "disruption" that is already covered in 7450.

Section 6.2

Even though not all of the listed mechanisms are currently specified for
recursive-to-authoritative queries, I think it's fine to list them here,
as they are expected to become defined in the future and would make
sense as options, when available.

   response from the trusted server.  The connection to the trusted
   server can use any secure channel, such as with a TSIG [RFC2845] or
   SIG(0) [RFC2931] channel, a secure local channel on the host, DNS
   over TLS [RFC7858], DNS over HTTPS [RFC8484], or some other mechanism
   that provides authentication of the RR.

I don't think that it's really "authentication" that we're providing for
the RR itself; what we want is more of "source authentication" for the
provenance of the RR (and integrity protection for its contents).

   If an AMT gateway accepts a maliciously crafted AMTRELAY record, the
   result could be a Denial of Service, or receivers processing
   multicast traffic from a source under the attacker's control.

Even for an honest AMTRELAY record, isn't there a chance that the
multicast traffic's contents could also be modified or injected by the
attacker?

Section 8.2

Arguably RFC 8499 would be normative, since we defer to its definition
of FQDN, but I am not really very concerned about it.

Appendix A

     $ ./translate.py amtrelays.example.com
     24
     09616d7472656c617973076578616d706c6503636f6d
   <CODE ENDS>

   The length and the hex string for the domain name
   "amtrelays.example.com" are the outputs of this program, yielding a
   length of 22 and the above hex string.

I'm having a hard time parsing this in a consistent way where the
yielded length is 22 and the literal command output is 24.
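
For reference, the length can be checked independently of the draft's translate.py. The following is a minimal sketch, not the draft's program: the function name `encode_dns_name` and its `include_root` parameter are hypothetical, and it assumes the hex string is the usual length-prefixed label encoding of the name.

```python
def encode_dns_name(name: str, include_root: bool = False) -> bytes:
    """Encode a domain name as length-prefixed labels (DNS wire format)."""
    wire = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label length: {label!r}")
        wire.append(len(raw))  # one length octet per label
        wire.extend(raw)
    if include_root:
        wire.append(0)  # terminating zero-length root label
    return bytes(wire)


encoded = encode_dns_name("amtrelays.example.com")
print(len(encoded))   # number of octets in the encoded name
print(encoded.hex())  # hex string, two characters per octet
```

Under these assumptions the quoted hex string is 22 octets (44 hex characters) long, or 23 octets if a terminating root label were included, so neither reading accounts for the printed 24.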

(Suresh Krishnan) (was Discuss) No Objection

Comment (2019-12-19 for -11)
Thanks for addressing my DISCUSS and COMMENT.

(Mirja Kühlewind) No Objection

Comment (2019-12-12 for -09)
UPDATE: Actually, I saw the replies to the TSV-ART review, but the discussed changes were not yet implemented in the current draft version. So I assume that will happen in the next version! Thanks!

Thanks for addressing the TSV-ART review (and thanks Bernard for the review!)! Based on, and in addition to, that review, I have a few more small comments:

1) Minor wording comment on this point in 2.5.2:
"   8.  When congestion or substantial loss is detected in the stream of
       AMT packets from a relay."
I think it should be "substantial congestion or substantial loss" or just "substantial congestion".
But actually I would maybe even say "substantial and persistent congestion", or maybe even use "(network) overload" instead of "congestion", because I think that's what you are actually looking for.

2) And on this sentence in section 2.7 again:
"with a RECOMMENDED initial_timeout
   of 1 second and a RECOMMENDED maximum_timeout of 120 seconds."
Why do you even specify a max value at all? I think it's more common to define an initial value and a max number of retries.

3) I have an additional question on this part in 2.4.2:
"Otherwise, a gateway MUST provide a rate limit
   for the DNS queries, and its default settings MUST NOT permit more
   than 10 queries for any 100-millisecond period (though this MAY be
   overridable by administrative configuration)."
Where do these numbers come from?

4) Editorial: I would recommend putting the examples (section 3) before section 2. I also saw that there is one normative requirement in the examples section (3.2.2); that's easy to overlook. I would recommend moving it somewhere else or making it non-normative.

5) Sec 6.3:
"Application implementors and network operators that use DRIAD-capable
   AMT gateways..."
Thanks for noting this down. However, I'm wondering whether DRIAD-capable AMT gateways are in this respect any different from non-DRIAD-capable AMT gateways?
However, I think it would be good, or is actually needed, to point to and discuss section 4.1.4.2 on Congestion Considerations in RFC7450!
Further, regarding the security considerations in general, I would also recommend pointing to the security considerations section of RFC7450 and double-checking whether anything changes based on the potentially different location of the reply (than assumed in RFC7450).

6) (Important) nit:
In section 2.4.2:
"When present, IP addresses in the initial response provide resolved
   destination address candidates for the "Sorting of resolved
   destination addresses" phase described in Section 4 of [RFC8085]),"
and
"and attempts connections with the corresponding relays
   under the algorithm restrictions and guidelines given in [RFC8085]
   for the "Establishment of one connection, which cancels all other
   attempts" phase."
These should be references to RFC8305 (and not RFC8085).
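
Regarding comment 2) above, the difference between bounding retries by a maximum timeout and bounding them by a retry count can be sketched as follows. This is an illustration only: the doubling schedule, the function name `retry_timeouts`, and the `max_retries` parameter are assumptions, not something the quoted draft text specifies. Only the 1-second and 120-second values come from the draft.

```python
from itertools import islice


def retry_timeouts(initial_timeout=1.0, maximum_timeout=120.0, max_retries=None):
    """Yield successive retry timeouts in seconds, doubling up to a cap.

    With max_retries=None the sequence never ends; it simply stays at
    maximum_timeout. With max_retries set, it stops after that many values.
    """
    timeout = min(initial_timeout, maximum_timeout)
    attempt = 0
    while max_retries is None or attempt < max_retries:
        yield timeout
        timeout = min(timeout * 2, maximum_timeout)  # assumed doubling schedule
        attempt += 1


# Capped by timeout only: retries continue indefinitely at 120 seconds.
print(list(islice(retry_timeouts(), 10)))

# Capped by retry count: the sequence ends after five attempts.
print(list(retry_timeouts(max_retries=5)))
```

With a maximum timeout alone, the gateway keeps retrying forever at the capped interval; with a retry count, it eventually gives up, which is the more common pattern mentioned in the comment.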

Barry Leiba No Objection

Comment (2019-12-17 for -10)
— Section 2.2 —

   1.  The end user starts the app, which issues a join to the (S,G):
       (198.51.100.15, 232.252.0.2).

What is 232.252.0.2, and where did it come from?  It would be good for the text to introduce that.
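
For context, the two addresses in the quoted (S,G) can be classified with a few lines of Python. This is only a sketch using the standard `ipaddress` module; it is not taken from the draft, and the constant name `SSM_RANGE` is illustrative. It also prints the reverse-IP DNS name of the source, which is the kind of name the draft's title refers to.

```python
import ipaddress

# IPv4 source-specific multicast (SSM) block.
SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")

source = ipaddress.ip_address("198.51.100.15")  # S from the quoted (S,G)
group = ipaddress.ip_address("232.252.0.2")     # G from the quoted (S,G)

print(group.is_multicast)      # whether G lies in 224.0.0.0/4
print(group in SSM_RANGE)      # whether G lies in the SSM block
print(source.reverse_pointer)  # reverse-IP DNS name for S
```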


— Section 2.3.2 —

   Among relay addresses that still have an equivalent preference after
   the above orderings, a gateway MUST make a non-deterministic choice
   for relay preference ordering, in order to support load balancing by
   DNS configurations that provide many relay options.

What do you have in mind here?  Random selection?  Something else?  If so, what else that can be assured to be non-deterministic, given the "MUST"?
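
One possible reading of "non-deterministic choice" is a random shuffle among relays that are still tied after the earlier orderings. The sketch below illustrates that reading only. The function name `order_relays`, the (preference, address) tuple layout, the rule that a lower preference value is tried first, and the relay addresses are all assumptions made for the example.

```python
import random
from itertools import groupby


def order_relays(relays, rng=None):
    """Order (preference, address) pairs, shuffling ties at random.

    Assumes a lower preference value means the relay is tried earlier.
    """
    rng = rng or random.SystemRandom()  # OS-provided randomness by default
    ordered = []
    by_preference = sorted(relays, key=lambda relay: relay[0])
    for _, group in groupby(by_preference, key=lambda relay: relay[0]):
        tied = list(group)
        rng.shuffle(tied)  # random order among equally preferred relays
        ordered.extend(tied)
    return ordered


# Hypothetical relay candidates using documentation addresses.
candidates = [
    (10, "203.0.113.1"),
    (10, "203.0.113.2"),
    (10, "203.0.113.3"),
    (20, "203.0.113.99"),
]
print(order_relays(candidates))
```

Whether a shuffle like this can be "assured" to be non-deterministic depends on the quality of the random source, which is part of why the "MUST" is hard to test.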

(Alexey Melnikov) No Objection

Alvaro Retana No Objection

(Adam Roach) No Objection

Comment (2019-12-17 for -10)
Thanks for the work that went into creating this document! This approach to
multicast tunnel establishment sounds quite useful, and I hope it sees broad
adoption.

---------------------------------------------------------------------------

Please expand "AMT" in the title.

---------------------------------------------------------------------------

§2.2:

>                        Figure 2: DRIAD Messaging

This example uses IPv4 rather than IPv6. Please either add a similar diagram
showing IPv6 usage, or change this diagram to use IPv6. See
https://www.iab.org/2016/11/07/iab-statement-on-ipv6/ for further information.

---------------------------------------------------------------------------

§2.6:

>  AMTRELAY records MAY also appear in other
>  zones...

What would this mean? Is this intended for future specifications to
take advantage, or is the document assuming that the reader is able
to figure out the semantics of AMTRELAY RRs elsewhere in the DNS tree?
If the latter, please spell it out explicitly. If the former, please
indicate that using records in this way may be specified in future
documents.

Éric Vyncke No Objection

Comment (2019-12-17 for -10)
Jake,

Thank you for the work put into this document. Quite an achievement for a single author!

Please find below some non-blocking COMMENTs and NITs. I hope that this helps to improve the document.

Regards,

-éric

== COMMENTS ==

-- Section 2.4.2 --
At the end, the text
	"its default settings MUST NOT permit more
   than 10 queries for any 100-millisecond period (though this MAY be
   overridable by administrative configuration)."
should probably use a "SHOULD NOT" rather than a "MUST NOT" as it "MAY" be overridden.

The last paragraph is a little unclear whether the AMT gateway should wait until all DNS replies are received before initiating AMT connection.
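
To make the quoted default concrete, a limit of 10 queries in any 100-millisecond period can be enforced with a sliding-window log. The following is a minimal sketch under that interpretation, not the draft's algorithm. The class name `QueryRateLimiter`, its parameters, and the injectable `clock` are hypothetical; the constructor arguments stand in for the administrative override.

```python
import time
from collections import deque


class QueryRateLimiter:
    """Permit at most max_queries in any sliding window of `window` seconds."""

    def __init__(self, max_queries=10, window=0.100, clock=time.monotonic):
        self.max_queries = max_queries  # default from the quoted text
        self.window = window            # 100 milliseconds, in seconds
        self.clock = clock
        self._sent = deque()            # timestamps of permitted queries

    def try_acquire(self):
        """Return True and record the query if it is permitted right now."""
        now = self.clock()
        # Drop timestamps that are strictly older than the window.
        while self._sent and now - self._sent[0] > self.window:
            self._sent.popleft()
        if len(self._sent) >= self.max_queries:
            return False
        self._sent.append(now)
        return True


limiter = QueryRateLimiter()
permitted = sum(limiter.try_acquire() for _ in range(25))
print(permitted)  # how many of 25 back-to-back queries were let through
```

A caller would send a DNS query only when `try_acquire()` returns True and otherwise defer it.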

-- Section 2.5.3 --
The whole section about tunnel stability has little to do, IMHO, with either the title of the document "DNS Reverse IP AMT Discovery" or the abstract. The content is useful and should perhaps be moved to another companion document. I understand that this is a little late in the process, so let's change the abstract at least. Note: I considered balloting a DISCUSS on this issue.

-- Section 3.1.1 --
Please expand CMTS & OLT used in figure 3

-- Section 3.1.2 --
Unsure whether this is a common use-case in 2019 but it is OK (I hope that my ISP was mcast-capable...).

-- Section 4.1 --
Should the code 260 be allocated by IANA? I would rather use 'TBD' in the document and ask for IANA allocation for 'TBD'.

== NITS ==

-- Section 2.1 --
s/The sender/The multicast source/ ?

-- Section 4.2.2 --
Please use the canonical format for IPv6 addresses. I know this is cosmetic but it hurts my eyes ;-)
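
As an illustration of what a canonical form looks like, and assuming "canonical format" means the compressed lowercase text representation recommended by RFC 5952, Python's standard `ipaddress` module can normalize an address. The helper name `canonical_ipv6` and the sample address (from the 2001:db8::/32 documentation prefix) are made up for this sketch and are not taken from the draft.

```python
import ipaddress


def canonical_ipv6(text: str) -> str:
    """Return the compressed, lowercase text form of an IPv6 address."""
    return str(ipaddress.IPv6Address(text))


# Hypothetical non-canonical spelling of a documentation address.
print(canonical_ipv6("2001:0DB8:0000:0000:0000:0000:0000:0015"))
```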

Magnus Westerlund No Objection