Last Call Review of draft-ietf-opsawg-large-flow-load-balancing-11
review-ietf-opsawg-large-flow-load-balancing-11-genart-lc-thomson-2014-04-24-00

Request Review of draft-ietf-opsawg-large-flow-load-balancing
Requested rev. no specific revision (document currently at 15)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2014-05-06
Requested 2014-04-24
Authors Ramki Krishnan, Lucy Yong, Anoop Ghanwani, Ning So, Bhumip Khasnabish
Draft last updated 2014-04-24
Completed reviews Genart Last Call review of -11 by Martin Thomson (diff)
Secdir Last Call review of -11 by Yoav Nir (diff)
Secdir Telechat review of -15 by Yoav Nir
Opsdir Last Call review of -11 by Carlos Pignataro (diff)
Opsdir Telechat review of -15 by Carlos Pignataro
Assignment Reviewer Martin Thomson
State Completed
Review review-ietf-opsawg-large-flow-load-balancing-11-genart-lc-thomson-2014-04-24
Reviewed rev. 11 (document currently at 15)
Review result Ready
Review completed: 2014-04-24

Review

I am the assigned Gen-ART reviewer for this draft. For background on
Gen-ART, please see the FAQ at
<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call comments
you may receive.

Document: draft-ietf-opsawg-large-flow-load-balancing-11
Reviewer: Martin Thomson
Review Date: 2014-04-24
IETF LC End Date: 2014-05-06
IESG Telechat date: (if known)

Summary: Basically ready.  This looks like a pretty straightforward, even
commonsense, treatment of how a need for flow distribution might be
identified and how that distribution might be achieved.

Major issues: None

Minor issues: Or questions, really.

There's not a lot of discussion about the costs of maintaining an
exception list for rebalanced flows.  A hash-based distribution is
going to cost essentially zero state because the outbound path can be
determined on a per-packet basis, but as soon as you start
redistributing, there is an added state cost (and a potential increase
in lookup times).  That probably needs some discussion.
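
To make the state cost concrete, here is a minimal sketch, not taken from
the draft.  The names hash_link and Balancer, and the use of CRC32 as the
hash, are assumptions chosen only for illustration.  The pure hash path
keeps no per-flow state; every rebalanced flow adds one entry to an
exception table that must be consulted on each lookup.

```python
import zlib


def hash_link(flow_key: bytes, n_links: int) -> int:
    """Stateless selection: the link is a pure function of the flow key."""
    return zlib.crc32(flow_key) % n_links


class Balancer:
    """Hypothetical balancer: hash by default, exception table for moved flows."""

    def __init__(self, n_links: int) -> None:
        self.n_links = n_links
        self.exceptions = {}  # flow key -> pinned link; grows with each rebalance

    def rebalance(self, flow_key: bytes, link: int) -> None:
        """Pin a flow to a specific link, at the cost of one table entry."""
        self.exceptions[flow_key] = link

    def select(self, flow_key: bytes) -> int:
        """Check the exception table first, then fall back to the hash."""
        pinned = self.exceptions.get(flow_key)
        if pinned is not None:
            return pinned
        return hash_link(flow_key, self.n_links)
```

The size of the exceptions table is the added state referred to above, and
the extra table probe in select() is the potential lookup cost.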

This plays into the security considerations, which I think need to
highlight this as a potential DoS vector.  Implementing automatic
redistribution on top of a hash-based stateless distribution is
vulnerable to attack if the hash function used is predictable.
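
As an illustration of what "not predictable" could mean in practice, the
sketch below keys the hash with a per-device secret.  This is one common
mitigation and is my assumption, not something the draft specifies; the
names keyed_link and SECRET are hypothetical.  Without the secret, an
attacker cannot choose flow keys that are known in advance to collide on
one link and thereby force repeated redistribution.

```python
import hashlib
import os


def keyed_link(flow_key: bytes, n_links: int, secret: bytes) -> int:
    """Select a link with a keyed hash (BLAKE2b in keyed mode)."""
    digest = hashlib.blake2b(flow_key, key=secret, digest_size=8).digest()
    return int.from_bytes(digest, "big") % n_links


# Hypothetical per-device secret, generated once at startup.
SECRET = os.urandom(16)
```

The mapping stays deterministic for a given secret, so packets of one flow
still follow one path, but the mapping differs between devices that hold
different secrets.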

In Section 5.1, the recommended formula for determining imbalance puts
the average utilization in the divisor, which leads to large
variations in output when the overall utilization is low.  In that
state, it's probably best to avoid redistribution entirely, since no
single link is likely to be close to capacity.  I'd recommend a
simpler formula of max_i(|U_i - U_ave|).  (Note: You are missing an
ellipsis in the calculation of U_ave in the divisor part.)
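
A small sketch of the difference, under two assumptions that are mine and
not the draft's: the draft's formula has the form max_i(|U_i - U_ave|) /
U_ave, and all links have equal speed so that U_ave is a plain mean.  The
function names are hypothetical.

```python
def relative_imbalance(utils):
    """Assumed form of the draft's formula: max deviation divided by U_ave."""
    u_ave = sum(utils) / len(utils)
    return max(abs(u - u_ave) for u in utils) / u_ave


def absolute_imbalance(utils):
    """The simpler formula suggested here: max_i(|U_i - U_ave|)."""
    u_ave = sum(utils) / len(utils)
    return max(abs(u - u_ave) for u in utils)


lightly_loaded = [0.01, 0.03]  # both links nearly idle
heavily_loaded = [0.60, 0.90]  # one link close to capacity

for name, utils in (("light", lightly_loaded), ("heavy", heavily_loaded)):
    print(name, relative_imbalance(utils), absolute_imbalance(utils))
```

With these inputs, the relative form reports the nearly idle pair as more
imbalanced (about 0.5) than the heavily loaded pair (about 0.2), while the
absolute form ranks them the other way around (about 0.01 versus 0.15).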

Nits/editorial comments:

You don't rely on the definition of COTS, which I think is good.  It
can probably go.  There may be others.  I don't tend to check these
things.

Some diagrams have alignment issues; see Figure 2.

I've never encountered a Fat-Tree before.  So the use of the name
actually hindered comprehension.  It's harmless, but probably
unnecessary.

Section 5.3 essentially includes a copy of Section 4.3.1.

The formulae in Section 5.6.1 are incomprehensible.  I assume that
there is a missing solidus '/' character on each.