Address Resolution Problems in Large Data Center Networks
draft-ietf-armd-problem-statement-04

Document history

Date Rev. By Action
2012-10-23
04 Amy Vezza State changed to RFC Ed Queue from Approved-announcement sent
2012-10-22
04 (System) IANA Action state changed to No IC
2012-10-22
04 Cindy Morgan State changed to Approved-announcement sent from Approved-announcement to be sent
2012-10-22
04 Cindy Morgan IESG has approved the document
2012-10-22
04 Cindy Morgan Closed "Approve" ballot
2012-10-22
04 Cindy Morgan Ballot writeup was changed
2012-10-22
04 Cindy Morgan Ballot writeup was changed
2012-10-22
04 Cindy Morgan Ballot approval text was generated
2012-10-22
04 Cindy Morgan Ballot writeup was changed
2012-10-22
04 Ron Bonica State changed to Approved-announcement to be sent from IESG Evaluation::AD Followup
2012-10-22
04 (System) Sub state has been changed to AD Followup from Revised ID Needed
2012-10-22
04 Thomas Narten New version available: draft-ietf-armd-problem-statement-04.txt
2012-10-17
03 Adrian Farrel
[Ballot comment]
I am moving my Discuss to a Comment.

The substance is addressing the issues raised during IETF Last Call period by Manav Bhatia resulting from his Routing Directorate Review (see http://www.ietf.org/mail-archive/web/rtg-dir/current/msg01731.html).

The concern that led to this being a Discuss was that the issues were raised during IETF last call and should have been addressed at that time.

I understand that the authors are working on a new revision, and since the issues themselves would only have merited being entered as Comments, I am down-grading to a Comment.
2012-10-17
03 Adrian Farrel [Ballot Position Update] Position for Adrian Farrel has been changed to No Objection from Discuss
2012-09-20
03 Tero Kivinen Closed request for Last Call review by SECDIR with state 'No Response'
2012-08-30
03 Cindy Morgan State changed to IESG Evaluation::Revised ID Needed from Waiting for AD Go-Ahead
2012-08-30
03 Russ Housley [Ballot Position Update] Position for Russ Housley has been changed to No Objection from Discuss
2012-08-30
03 Pete Resnick
[Ballot comment]
The terminology distinction between "application" and "server" is really messy in this document. When you say "application", you are *always* talking about a piece of *server* application software (as distinct from *client* application software). When you say, e.g., "web server", sometimes you're referring to the software (the server application), and sometimes you're referring to the machine (hardware or virtual) on which it runs. For me sitting here in APP land, the use of the word "application" to only mean "server application" is weird, and the use of the word server in two ways makes this document all the more confusing to read. If you all in OPS land understand these sundry uses in this document, I'm not going to pitch a fit. But I would love to see "application" replaced by "server application" or "server software" or something like it, and I'd like to see "server", when used to talk about the machine, to be replaced by "host".
2012-08-30
03 Pete Resnick [Ballot Position Update] New position, No Objection, has been recorded for Pete Resnick
2012-08-30
03 Stewart Bryant
[Ballot comment]
There are a couple of emotive terms - massive and "a lot" which should be replaced with a clear definition of the size the authors are considering.

----

and/or OpenFlow [OpenFlow] infused directory assistance approaches. 

There should be a reference to IDA approaches

----

  Current implementations
  today can support ARP processing in the low thousands of ARP packets
  per second, which is several orders of magnitude lower than the rate
  at which packets can be forwarded by ASICs.

I think that this is a bit misleading, since the ARP rate was never expected
to be close to the data rate. Are the authors stating that in these systems
the rates are similar? If so, I think that should be made clearer. In either case,
I think the para needs rewording.

----

" Some routers can be configured to broadcast periodic gratuitous ARPs. 

Needs a reference to gratuitous ARPs

-----

7.2.  IPv6 Neighbor Discovery

It would be useful to highlight in this section that ND is orders
of magnitude more aggressive in cache expiry than IPv4. I
understand 4hrs (for IPv4) vs 35s (for IPv6) and I further
understand that this is a scaling issue in much smaller systems than those
considered in this text.


====

Nits

"Hypervisor:  Software running on a host that allows multiple VMs to run on the same host."

hypervisor surely more likely runs on a server
2012-08-30
03 Stewart Bryant [Ballot Position Update] New position, No Objection, has been recorded for Stewart Bryant
2012-08-29
03 Russ Housley
[Ballot discuss]

  In the discussion that followed the Gen-ART Review by Joel Halpern,
  there was a request made by Joel.  He asked:
  >
  > With regard to routers and ARP caches, my concern is that from what
  > I saw over the years, common practice did not seem to match the
  > SHOULD from the RFCs. I am a little remote from most implementations
  > at the moment (the ones I can check easily are a tiny fraction of
  > the market), so I was suggesting that be double-checked.
  >
  Has the requested check been made?  I did not see a response on the
  mail list.
2012-08-29
03 Russ Housley [Ballot Position Update] New position, Discuss, has been recorded for Russ Housley
2012-08-29
03 Ralph Droms
[Ballot comment]
1. In section 7.1, does a high volume of ARP traffic have more impact
on routers than on hosts or VMs?  If so, why?

2. In section 7.1, does the total volume of ARP traffic ever become
great enough to have a measurable impact on available traffic
capacity?

3. Does this sentence from section 7.2 imply that IPv6 stacks that
exhibit the described behavior are compliant with RFC 4861?

  Consequently, some
  implementations will send out "probe" ND queries to validate in-use
  ND entries as frequently as every 35 seconds [RFC4861].

4. I suggest dropping the sentence about the impact of VMs in section
7.3.  Any growth in the datacenter that increases the number of
addresses used in an L2 domain, whether it be the physical span of the
L2 domain or the use of VMs, will have the impact described in section
7.3.  The impact of growth will also have an impact on the scenarios
in section 7.1 and 7.2.  The specific impact of VMs is also mentioned
earlier in the document.

5. Are the three problems described in sections 7.1-3 really the only
address resolution problems in large datacenters?  How do the three
problems interact with each other (as mentioned at the end of section
7.3), when the ARP and ND problems seem to be related to CPU usage and
the MAC table issue seems to be a memory problem.

6. It was a little surprising to me that section 5 describes multicast
ND for address resolution, but section 7.2 only cites the unicast use
of ND for NUD as a problem.
2012-08-29
03 Ralph Droms [Ballot Position Update] New position, No Objection, has been recorded for Ralph Droms
2012-08-29
03 Barry Leiba [Ballot Position Update] New position, No Objection, has been recorded for Barry Leiba
2012-08-29
03 Robert Sparks [Ballot Position Update] New position, No Objection, has been recorded for Robert Sparks
2012-08-28
03 Sean Turner
[Ballot comment]
abstract: r/Our/The

abstract/s1: Is it massive scaling or scaling of massive data centers (i.e., those with a massive # of hosts ;)

s1: r/we/this document

s1: r/aims to/lists (I sure hope it will ;)

s2: Are these internet hosts or some other kind of host?  Should it be pointing to RFC 1123?

s2: ToR an unfortunate reuse of "The Onion Router" and more commonly used in reference to the ToR Browser (https://www.torproject.org/).  Any chance of calling it SoR _ single rack switch (SRS)?

s11: Agree Stephen.
2012-08-28
03 Sean Turner [Ballot Position Update] New position, No Objection, has been recorded for Sean Turner
2012-08-28
03 Adrian Farrel
[Ballot discuss]
Updated Discuss.

I see good progress discussing the comments raised during IETF Last Call period by Manav Bhatia resulting from his Routing Directorate Review (see http://www.ietf.org/mail-archive/web/rtg-dir/current/msg01731.html).

This Discuss is a place-holder for the updated document that should result.
2012-08-28
03 Adrian Farrel Ballot discuss text updated for Adrian Farrel
2012-08-28
03 Wesley Eddy
[Ballot comment]
Just some comments that you are free to ignore ... I think there are some places where the terminology and background have been glossed over.

For instance, I don't know what "massive" means.  That's very subjective.  The number 100,000 physical machines is given at the end of section 3, but is that where massive starts or is that somewhere in the middle of the range?

It also seems to be assumed that the switches are multilayer switches, as there's text mentioning that they can be the gateway for a subnet (in section 3).  However, in the IETF, it seems to me that we've usually been pretty careful to distinguish between devices acting as L2 switches and devices acting as L3 routers, so it seems that the concept of MLS should be introduced.  The presence of things like load balancers that use higher-layer information for switching is another way that people have gotten scalability without encountering these ARP/ND issues, or overly complicating the L2 and L3 switching configuration.

Further, the VM systems that I know of require a SAN in order to failover and migrate from one physical machine to another, they aren't moving huge distances across the datacenter because of that.  I'm not sure what the assumptions about VM mobility across physical machines are that this document makes.  It seems to assume that any VM might run on any physical machine, which is only realistic for certain types of datacenter, and certainly not all.  But maybe this is also part of the definition of "massive" in that they're doing more of a VPS type of hosting?
2012-08-28
03 Wesley Eddy [Ballot Position Update] New position, No Objection, has been recorded for Wesley Eddy
2012-08-28
03 Brian Haberman [Ballot Position Update] New position, No Objection, has been recorded for Brian Haberman
2012-08-27
03 Stephen Farrell
[Ballot comment]


This is a near-discuss. But I'm not a fan of putting use-cases or
problem statements on critical paths, so it's not.

I think the security considerations ought to have made mention of
isolation of traffic, e.g. via separate VLANS or the moral
equivalent. In other words, I think section 11 ought be about the
security considerations of section 6 (mainly) and not a statement
about ARP which is (presumably) not the answer for armd. Since such
traffic isolation is a real requirement and a pain point for any armd
solution, and if this document does motivate armd design, then I
disagree that this document has no security implications.

Note - I'm not suggesting you try solve the pain here, but only that
you properly recognise it.
2012-08-27
03 Stephen Farrell [Ballot Position Update] New position, No Objection, has been recorded for Stephen Farrell
2012-08-23
03 (System) State changed to Waiting for AD Go-Ahead from In Last Call
2012-08-16
03 Pearl Liang
IANA has reviewed draft-ietf-armd-problem-statement-03, which is currently in Last Call, and has the following comments:

IANA understands that, upon approval of this document, there are no IANA Actions that need completion.
2012-08-16
03 Adrian Farrel
[Ballot discuss]
Updated Discuss with a pointer to the review comments.

Comments were raised during IETF Last Call period by Manav Bhatia resulting from his Routing Directorate Review (see http://www.ietf.org/mail-archive/web/rtg-dir/current/msg01731.html). I haven't seen any response to these comments and questions.

I will pick these up and adopt them as my own Discuss issues in a future version of this Discuss if I do not see responses before the end of IETF Last Call.
2012-08-16
03 Adrian Farrel Ballot discuss text updated for Adrian Farrel
2012-08-16
03 Adrian Farrel
[Ballot discuss]
Comments were raised during IETF Last Call period by Manav Bhatia resulting from his Routing Directorate Review. I haven't seen any response to these comments and questions.

I will pick these up and adopt them as my own Discuss issues in a future version of this Discuss if I do not see responses before the end of IETF Last Call.
2012-08-16
03 Adrian Farrel [Ballot Position Update] New position, Discuss, has been recorded for Adrian Farrel
2012-08-16
03 Ron Bonica Ballot has been issued
2012-08-16
03 Ron Bonica [Ballot Position Update] New position, Yes, has been recorded for Ronald Bonica
2012-08-16
03 Ron Bonica Created "Approve" ballot
2012-08-16
03 Ron Bonica Placed on agenda for telechat - 2012-08-30
2012-08-10
03 Samuel Weiler Request for Last Call review by SECDIR is assigned to Dave Cridland
2012-08-09
03 Jean Mahoney Request for Last Call review by GENART is assigned to Joel Halpern
2012-08-09
03 Amy Vezza
The following Last Call announcement was sent out:

From: The IESG
To: IETF-Announce
CC:
Reply-To: ietf@ietf.org
Subject: Last Call:  (Problem Statement for ARMD) to Informational RFC


The IESG has received a request from the Address Resolution for Massive
numbers of hosts in the Data center WG (armd) to consider the following
document:
- 'Problem Statement for ARMD'
  as Informational RFC

The IESG plans to make a decision in the next few weeks, and solicits
final comments on this action. Please send substantive comments to the
ietf@ietf.org mailing lists by 2012-08-23. Exceptionally, comments may be
sent to iesg@ietf.org instead. In either case, please retain the
beginning of the Subject line to allow automated sorting.

Abstract


  This document examines address resolution issues related to the
  massive scaling of data centers.  Our initial scope is relatively
  narrow.  Specifically, it focuses on address resolution (ARP and ND)
  within the data center.




The file can be obtained via
http://datatracker.ietf.org/doc/draft-ietf-armd-problem-statement/

IESG discussion can be tracked via
http://datatracker.ietf.org/doc/draft-ietf-armd-problem-statement/ballot/


No IPR declarations have been submitted directly on this I-D.


2012-08-09
03 Amy Vezza State changed to Last Call Requested from None
2012-08-09
03 Ron Bonica Last call was requested
2012-08-09
03 Ron Bonica Ballot approval text was generated
2012-08-09
03 Ron Bonica State changed to Last Call Requested from AD Evaluation
2012-08-09
03 Ron Bonica State changed to AD Evaluation from Publication Requested
2012-08-09
03 Ron Bonica Last call announcement was generated
2012-08-09
03 Ron Bonica Last call announcement was generated
2012-08-09
03 Ron Bonica Ballot writeup was changed
2012-08-09
03 Ron Bonica Ballot writeup was changed
2012-08-09
03 Ron Bonica Ballot writeup was generated
2012-08-09
03 Ron Bonica IESG process started in state Publication Requested
2012-08-09
03 (System) Earlier history may be found in the Comment Log for draft-narten-armd-problem-statement
2012-08-09
03 Ron Bonica Shepherding AD changed to Ronald Bonica
2012-08-09
03 Ron Bonica Intended Status changed to Informational from None
2012-08-04
03 Benson Schliesser Changed shepherd to Linda Dunbar
2012-06-22
03 Thomas Narten New version available: draft-ietf-armd-problem-statement-03.txt
2012-03-12
02 Thomas Narten New version available: draft-ietf-armd-problem-statement-02.txt
2012-03-04
01 Benson Schliesser Annotation tag Awaiting Merge with Other Document cleared.
2012-02-21
01 (System) New version available: draft-ietf-armd-problem-statement-01.txt
2012-02-21
01 Benson Schliesser
Revision -01 completes the merge of draft-armd-datacenter-reference-arch into draft-ietf-armd-problem-statement. Chair has provided editorial feedback and comments. Awaiting co-authors' response / resolution prior to announcing last call.
2012-01-09
01 Benson Schliesser
Per the ARMD Work Plan circulated to the mailing list (http://www.ietf.org/mail-archive/web/armd/current/msg00375.html):
"1. We intend to merge draft-armd-datacenter-reference-arch into draft-ietf-armd-problem-statement. The WG will have an opportunity to review the resulting document, comment, and contribute any additional text.  Unless there is such activity it will be quickly moved to last-call."
2012-01-09
01 Benson Schliesser Annotation tags Awaiting Merge with Other Document, Doc Shepherd Follow-Up Underway set.
2011-10-17
00 (System) New version available: draft-ietf-armd-problem-statement-00.txt