Support for Local RIB in BGP Monitoring Protocol (BMP)
draft-ietf-grow-bmp-local-rib-12

Note: This ballot was opened for revision 10 and is now closed.

Warren Kumari Yes

Alvaro Retana (was Discuss) No Objection

Comment (2021-05-03 for -11)
Thank you for addressing my DISCUSS!

I still have some non-blocking comments which have not been addressed in the latest version.


(1) §3 (Definitions) 

   * Post-Policy Adj-RIB-Out: The result of applying outbound policy to  
   an Adj-RIB-Out. This MUST be what is actually sent to the peer. 

s/This MUST be what is actually sent to the peer./This is what is sent to the peer. 

Note that this document should not use Normative language related to what a BGP session does. In this case, that is rfc4271's job. 


(2) §5.2.1: "The Information field contains a UTF-8 string..."

Please take a look at the Shutdown Communication string definition in rfc9003  and use a similar definition. 


(3) §5.2.1: "...whose value MUST be equal to the value of the VRF or table name (e.g., RD instance name) being conveyed." 

The "value of the VRF or table name" is a local matter, right?  How can the requirement be normatively enforced?  How can the receiver enforce the "MUST"? IOW, s/MUST.../The information field contains the value of the VRF or table name... 

There's no need to redefine the TLV in §5.3. 


(4) §5.5 (Route Mirroring): "Route mirroring is not applicable to Loc-RIB and Route Mirroring messages SHOULD be ignored."  If not applicable...when is it ok not to ignore the Route Mirroring messages? IOW, why is this behavior recommended and not required? 


(6) In general, the terminology used throughout the document is well-known to BMP/BGP users but may not be to the average reader. Please add references (most can be informational). These are some examples: 

- s/Adj-RIB-In Post-Policy/Post-Policy Adj-RIB-In/g 
That is how rfc7854 defines the term. Also, please add a reference on first 
mention. 

- s/Adj-RIB-In Pre-Policy/pre-policy Adj-RIB-In/g 
Same as above. 

- Add a reference for BGP-LS (rfc7752). 

- Add a reference for "4-octet ASN" (rfc6793).

Benjamin Kaduk No Objection

Comment (2021-04-07 for -10)
I was going to call out a couple of my remarks, but the particularly important
ones seem to already be covered by Alvaro's Discuss, which I consequently support.

I also proposed some editorial suggestions in a github PR:
https://github.com/TimEvens/draft-ietf-grow-bmp-loc-rib/pull/23

Section 1

Please expand SPF and CSPF on first use
(https://www.rfc-editor.org/materials/abbrev.expansion.txt seems to only
mark "OSPF" as well-known and not in need of expansion).

      For example, the received Adj-RIB-In will be different if add-
      paths is not enabled or if maximum number of equal paths are
      different from Loc-RIB to routes advertised.

Where is "add-paths" defined?

Section 1.1

   monitored.  As shown in Figure 3, the target router Loc-RIB is
   advertised via Adj-RIB-Out to the BMP router over a standard BGP

Where is "the BMP router" defined/indicated?  From context I assume it
is "a router inserted into the BGP network for the (sole?) purpose of
re-exporting via BMP a data stream".  Perhaps it is simpler to avoid the
new term and just say "to another BGP-speaking router" and "That router
then forwards"?

      version information).  In order to derive a Loc-RIB to a router,
      the router name or other system information is needed.  The BMP

(nit): is there a missing word here around "to a router"?  ("entry"?
s/to/for/?)
I'm having a hard time parsing this sentence so I can't just make a
suggestion in my editorial PR.

Section 3

   *  Post-Policy Adj-RIB-Out: The result of applying outbound policy to
      an Adj-RIB-Out. This MUST be what is actually sent to the peer.

My memory may be failing me, but I think I remember some discussion of
this (MUST-level) requirement on Post-Policy Adj-RIB-Out some years ago,
presumably in the context of a different document.  Is this a
preexisting requirement that we could refer to, or a new requirement
being imposed here?

Section 4.1

   A new peer type is defined for Loc-RIB to distinguish that it
   represents Loc-RIB with or without RD and local instances.

I think I'm confused by the line about "with or without RD and local
instances".  Is there a choice made by the sender?  How would the
recipient know which choice was made?
Continuing on to read Section 5.1, it seems that the "peer
distinguisher" field in the per-peer header can be used by the recipient
to differentiate the cases, so perhaps it is more clear to say here
"[...] represents Loc-RIB.  Its representation can differentiate between
instances with and without RD, and use a unique local value to indicate
local instances."?

Section 4.2

   In section 4.2 of [RFC7854], the "locally sourced routes" comment
   under the L flag description is removed.  If locally sourced routes

We might want to mention this focused update to RFC 7854 in the
introduction alongisde of the mention that Section 8.2 of RFC 7854 is
completely replaced.

   are communicated using BMP, they MUST be conveyed using the Loc-RIB
   instance peer type.

It seems like this might be a breaking change during incremental
deployment, as a BMP receiver might lose a data stream it was otherwise
receiving until it gets a software update.  It seems like at a minimum
this should get called out in the "other considerations" section, and
maybe a more graceful transition procedure defined/allowed.

Section 5.x

We don't cover the Initiation/Termination messages from RFC 7854, but do
seem to have at least brief mentions of the others.  This is probably
fine, but I just wanted to confirm that the processing of those messages
is unchanged for the Loc-RIB case compared to the existing cases.

Section 5.1

   *  Peer Address: Zero-filled.  Remote peer address is not applicable.
      The V flag is not applicable with Loc-RIB Instance peer type
      considering addresses are zero-filed.

Since we split the flags off into being per-peer-type, there no longer
is a V flag defined and we may not need to have text about its
irrelevance.

Section 5.2.1

   *  Type = 3: VRF/Table Name.  The Information field contains a UTF-8
      string whose value MUST be equal to the value of the VRF or table

It is hopefully implicit, but I'd suggest repeating the language from RFC
7854 that "[t]here is no requirement to terminate the string with a null
(or any other particular) character" to avoid any doubt.  (Likewise in
Section 5.3.)

Section 5.3

   Peer down notification MUST use reason code 6.  Following the reason

Should we forward-reference Section 8.4 where we allocate that value?

Section 7

   The same considerations as in section 11 of [RFC7854] apply to this
   document.  Implementations of this protocol SHOULD require to

(side note) I think the current security considerations text is
appropriate for this document, but noticed that RFC 7854 recommended
IPsec in tunnel mode (vs. transport mode), which was a little surprising
to me.  That said, changing the recommendation now just for Loc-RIB
would be disruptive for negligible benefit, so this is just a side note
and no change is recommended.

Section 8.2

I suspect that we want to create a new sub-registry for the peer-type-3
flags.

   This document defines a new flag (Section 4.2) and proposes that peer
   flags are specific to the peer type:

I'd suggest not using the word "proposes" anymore; we're basically at
the approval stage and it's more like a "going to happen" thing than a
"proposal".

Section 8.3

Is there actually an existing registry for PEER UP informational message
TLV types?  I don't see anything listed at
https://www.iana.org/assignments/bmp-parameters/bmp-parameters.xhtml
that matches up to that very well.

Erik Kline No Objection

Comment (2021-04-04 for -10)
Mostly nits.

[ section 1.1 ]

* "directly effects" -> "directly affects"... I think

[ section 3 ]

* "refers to an instance of an instance of" ->  "refers to an instance of"?

[ section 4.1 ]

* "RD" has two expansions, according to the RFC Editor abbreviation list.
  Consider expanding on first use.

[ section 5, 6.1.3 ]

* "ie." -> "i.e.", I think

[ section 6.1.2 ]

* "that only IBGP and EBGP that should be sent" ->
  "that only IBGP and EBGP should be sent"

[ section 7 ]

* "SHOULD require to establish sessions" ->
  "SHOULD require establishing sessions", or something?

  Should these sessions also be required to be secure?

Francesca Palombini No Objection

Comment (2021-04-04 for -10)
Thank you for the work on this document. I only have one minor comment, feel free to take or ignore my suggestion, which is only about readability.

Francesca

1. -----

   represents Loc-RIB with or without RD and local instances.

   ...

   include IBGP, EBGP, and IGP but only routes from EBGP should be sent

FP: suggestion to expand RD, IBGP and EBGP. (Note - I am aware that RD comes from 7854, which does not expand it either, but I do believe it would improve the text)

John Scudder No Objection

Comment (2021-04-06 for -10)
I agree with Alvaro’s comments. Some additional of my own:

1. I appreciate that section 1.1, "Alternative Method to Monitor Loc-RIB” was necessary to guide WG discussion (not to mention as a way of illustrating the problem to shortsighted authors of the base specification) but in the RFC, it will be clutter for the implementor to skip over. IMO it would be kind of you to move it to an appendix (or even remove it entirely, if you wish). 

2. "BGP Instance: refers to an instance of an instance of BGP-4”. Did you mean the repetition of “instance of”? Or is this an instance of the

Paris in the
the spring

problem? (I assume it is.)

3. In this definition:

   *  Post-Policy Adj-RIB-Out: The result of applying outbound policy to
      an Adj-RIB-Out. This MUST be what is actually sent to the peer.

The MUST seems inappropriate, unless you are imposing a new requirement — and if so, what are you imposing the requirement on? The concept of a “Post-Policy Adj-RIB-Out” (which implies the existence of a Pre-Policy Adj-RIB-Out”, as you also define) doesn’t even exist in RFC 4271 — the Adj-RIB-Out is post-policy by definition.

If you were to remove section 1.1, you could also remove the definitions of Adj-RIB-Out, and this whole problem would go away as if by magic. :-)

4. Section 4.1:

   A new peer type is defined for Loc-RIB to distinguish that it
   represents Loc-RIB with or without RD and local instances.

"Distinguish whether”?

5. Section 4.2:

   In section 4.2 of [RFC7854] the "locally sourced routes" comment
   under the L flag description is removed.  If locally sourced routes
   are communicated using BMP, they MUST be conveyed using the Loc-RIB
   instance peer type.

Given that you aren’t bumping the version number or otherwise signaling that this spec is in use, how is the collector expected to know this? A priori by configuration? It seems as though a prudent collector implementation would still have to support the RFC 7854 encoding as well. At first glance that seems OK — the collector could interpret the L-bit and also the Loc-RIB Instance Peer type. It would be nice to have some words about this.

Having written the above, I think the text I’ve quoted, per se, is fine — you can mandate that the new encoding always be used. (And too bad for old collectors that don’t understand the format, I guess.)

6. Section 5, nit:

   other protocols into BGP or otherwise locally originated (ie.
   aggregate routes).

If you mean “for example” it’s “e.g.” (but it’s clearer IMO to write “for example”). If you mean “in other words”, it’s “i.e.” (with two periods, but it’s clearer IMO to write “in other words”).

7. Section 5.1, nit:

   All peer messages that include a per-peer header section 4.2 of
   [RFC7854] MUST use the following values:

The way this text has rendered, the reference needs parentheses around it. Same for this one:

   *  Peer BGP ID: Set to the BGP instance global or RD (e.g.  VRF)
      specific router-id section 1.1 of [RFC7854].

8. Section 5.2.1:

   *  Type = 3: VRF/Table Name.  The Information field contains a UTF-8
      string whose value MUST be equal to the value of the VRF or table
      name (e.g.  RD instance name) being conveyed.  The string size
      MUST be within the range of 1 to 255 bytes.

I take it the empty string isn’t allowed, by design.

9. Section 5.2.1:

   Multiple TLVs of the same type can be repeated as part of the same
   message, for example to convey a filtered view of a VRF.  A BMP
   receiver should append multiple TLVs of the same type to a set in
   order to support alternate or additional names for the same peer.  If
   multiple strings are included, their ordering MUST be preserved when
   they are reported.

The way this is written, it’s not clear if you’re modifying the base spec to permit multiple TLVs of the same arbitrary type, informatively stating a property of the base spec, or saying this is how the VRF/Table Name TLV is to be handled. Looking back over RFC 7854, I think it is probably the latter — you want to say only how the VRF/Table Name TLV is used. In that case, I think you should change “same type” to “this type”. The indent of the paragraph seems wrong, too — shouldn’t it be part of the bullet text?

Also — “a set” of what? 

10. Section 5.3:

   Peer down notification MUST use reason code 6.  Following the reason
   is data in TLV format.  The following peer Down information TLV type
   is defined:

“Peer Down” (so capitalized).

Also, if multiple names were provided with the Peer Up, do they have to be provided identically with the Peer Down? As you’ve specced it right now, only a single name may be carried with the Peer Down (or at least you don’t explicitly permit multiple).

11. Section 5.4.2, “Granularity”:

You say the speaker SHOULD do state compression. The requirement is fine in the abstract (in other words, I hope BMP implementations will do this) but it seems out of place in a spec that’s specifically about Loc-RIB. Surely this requirement is for BMP in general?

Speaking of requirements, you give this as a SHOULD. Under what circumstances would you contemplate an implementation might not do this, and (assuming you keep the section at all) can you be specific? An ideal way of doing this is to add a MAY clause of the form “… although an implementation MAY forgo doing state compression on February 29 of any year.” (This problem goes away if you eliminate section 5.4.2 of course.)

12. Section 6.1.1. 

I had a hard time following this section. Let me break it down:

   There MUST be multiple emulated peers for each Loc-RIB instance, 

The plain English of this seems to be telling me that each Loc-RIB instance, regardless of its attributes, MUST have at least two emulated peers (“multiple”). That doesn’t make much sense to me.

   such
   as with VRFs.  

I can’t discern what “such as with VRFs” means in this sentence.

   The BMP receiver identifies the Loc-RIB by the peer
   header distinguisher and BGP ID.  The BMP receiver uses the VRF/
   Table Name from the PEER UP information to associate a name to the
   Loc-RIB.

This part is saying that the table name is just for pretty printing. OK. 

   In some implementations, it might be required to have more than one
   emulated peer for Loc-RIB to convey different address families for
   the same Loc-RIB.  

OK. Presumably the unsaid part is that other implementations will convey all address families using a single emulated peer? 

By the way, what is the receiver supposed to do if the different emulated peers in this situation, advertise different VRF/Table Names? You’ve already said that the VRF/Table Name is a non-key value, so presumably even if the string is different, they still map to the same Loc-RIB — but what is the Loc-RIB’s name?

   In this case, the peer distinguisher and BGP ID
   should be the same since it represents the same Loc-RIB instance.
   Each emulated peer instance MUST send a PEER UP with the OPEN message
   indicating the address family capabilities.  

OK.

   A BMP receiver MUST
   process these capabilities to know which peer belongs to which
   address family.

Wouldn’t this information also be present in the (encapsulated) UPDATE messages, which will each have an AFI/SAFI embedded? 

13. Section 6.1.2:

   filtered.  If multiple filters are associated to the same Loc-RIB, a
   Table Name MUST be used in order to allow a BMP receiver to make the
   right associations.

I’m not sure what the implementor is supposed to do with the MUST. Is the point here that if the router is supplying, for example, EBGP routes to one monitoring station and IBGP routes to another, it should have a table name like “EBGP-routes” associated with one and “IBGP-routes” with the other? That would be better than nothing of course but absent additional configuration on both ends, doesn’t do anything to allow the receiver to automatically “make the right associations”, it’s just a string — do you intend that it be possible to programmatically derive the string and/or derive meaning from the string? I’m guessing not, and that you’ve deliberately left the formation of the string up to the user. Do you intend for the implementation to enforce uniqueness of the string (you don’t say this explicitly)?

14. Section 8.2:

   This document defines a new flag (Section 4.2) and proposes that peer
   flags are specific to the peer type:

I think you will need to give IANA more detailed instructions, since you are asking them to reorganize the registry. Take a look at the existing Peer Flags registry. It doesn’t have a column for peer type. How do you want them to reorganize it to reflect your wishes?

15. Section 8, all subsections:

There are some other problems with the IANA section, but since you’ve already gotten early allocations from IANA for all these code points, probably the most straightforward way to fix them is to look at what’s actually registered, and per subsection, say something like “IANA has temporarily registered 'VRF/Table Name’ with value 3 in the 'BMP Initiation Message TLVs’ registry. IANA is requested to make this registration permanent and update the reference to be this document.”

Lars Eggert No Objection

Comment (2021-04-06 for -10)
All comments below are very minor change suggestions that you may choose to
incorporate in some way (or ignore), as you see fit. There is no need to let me
know what you did with these suggestions.

Section 2, paragraph 2, nit:
-    14 RFC 2119 [RFC2119] RFC 8174 [RFC8174] when, and only when, they
-      ---------          ---------
+    14 [RFC2119] [RFC8174] when, and only when, they

Section 5, paragraph 2, nit:
-    other protocols into BGP or otherwise locally originated (ie.
+    other protocols into BGP or otherwise locally originated (i.e.,
+                                                               +  +

Section 6.1.3, paragraph 2, nit:
-    an existing BMP session, ie. changes to filtering and table names,
+    an existing BMP session, i.e., changes to filtering and table names,
+                              +  +

Martin Duke No Objection

Comment (2021-03-30 for -10)
It would help to expand acronyms on first use (RD, VRF, etc.)

Martin Vigoureux No Objection

Murray Kucherawy No Objection

Comment (2021-04-07 for -10)
Something for the IESG first.  In the shepherd writeup:

-- BEGIN --
(1) What type of RFC is being requested (BCP, Proposed Standard, Internet Standard, Informational, Experimental, or Historic)? Why is this the proper type of RFC? Is this type of RFC indicated in the title page header? 

Draft-ietf-grow-bmp-local-rib is a Standards track document. 
-- END --

This is increasingly common.  There are three questions being asked, but only one is being answered, and not the most important one at that.  I'd really like it if this started getting caught someplace in the review process before IESG Evaluation.  Or, if we don't actually care about the answer anymore, we should simplify or remove the question.

As for the document content, just one thing:

Martin pointed out a couple of acronyms he'd like to see expanded.  To that list I add "BMP" and "RD".   More well-known, but still worth considering, are "SPF" and "CSPF".  Up in the applications space, where the air is admittedly thinner, "SPF" is a protocol for expressing email policy.

Robert Wilton No Objection

Comment (2021-04-08 for -10)
Thanks for this document.  It seems like a useful addition to me.

A couple of non blocking comments, that you can either decide to act on, or ignore, as you wish:

(1) I was slightly confused by the first sentence in the introduction:

   This document defines a mechanism to monitor the BGP Loc-RIB state of
   remote BGP instances without the need to establish BGP peering
   sessions.

Due to the use of the word "remote" I had interpreted this as a router in the network monitoring the BGP Loc-RIB state of another BGP peer in the network.  Rather than a management agent using BMP to monitor a the Loc-RIB or a BGP instance.  Hence, please consider removing the word "remote", or otherwise reword this sentence.

(2) I wasn't sure whether section 1.1 was useful in the main part of the document.  It seems to describe an alternative approach that it tricky to get right and doesn't work as well.  I would have probably have put all of section 1.1 into an appendix and then reference/summarize it in the introduction to justify why exporting the Loc-RIB table directly is helpful.

Regards,
Rob

Roman Danyliw No Objection

Comment (2021-04-07 for -10)
No email
send info
Thank you to Chris Lonvick for the SECDIR review.  Also, thank you to the authors for responding to it.

* Section 8.  Editorial nit.  Consider using a reference instead of an inline URL for the BMP parameters registry.

Zaheduzzaman Sarker No Objection

Comment (2021-04-06 for -10)
No email
send info
please expand the acronym.

Éric Vyncke No Objection

Comment (2021-04-06 for -10)
Thank you for the work put into this document. 

Please find below  some non-blocking COMMENT points (but replies would be appreciated), and one nit.

I hope that this helps to improve the document,

Regards,

-éric

== COMMENTS ==

-- Section 1 --
In figure 2, suggest to extend 'Redistributed or originated into BGP' to the IS-IS arrow (if not mistaken).

-- Section 1.1 --
Suggest to make it clear in §1 that this approach was rejected.

"At scale this becomes overly complex and error prone." should probably be made more factual.

== NITS ==

-- Section 4.2 --
s/locally sourced routes/locally-sourced routes/ ?