Skip to main content

The Resource Public Key Infrastructure (RPKI) to Router Protocol, Version 2
draft-ietf-sidrops-8210bis-12

Note: This ballot was opened for revision 08 and is now closed.

Warren Kumari
Yes
Erik Kline
No Objection
Comment (2022-06-14 for -08) Not sent
# Internet AD comments for {draft-ietf-sidrops-8210bis-08}
CC @ekline

## Nits

### S4

* "twoo"
John Scudder
No Objection
Comment (2022-06-15 for -09) Sent
Thanks for this document. Note, I reviewed only the diff between RFC 8210 and draft-ietf-sidrops-8210bis-09, not the full text. 

I have a few comments on the parts I reviewed.

1. RFC 8210 said the SKI was 20 octets. This spec in §5.1 says it’s 20 bits:

   Subject Key Identifier:  The 20-bit Subject Key Identifier (SKI)
      value of a router key, as described in [RFC6487].

I think RFC 8210 was right. Here is what RFC 6487 says about SKIs:

4.8.2.  Subject Key Identifier

   This extension MUST appear in all resource certificates.  This
   extension is non-critical.

   The Key Identifier used for resource certificates is the 160-bit
   SHA-1 hash of the value of the DER-encoded ASN.1 bit string of the
   Subject Public Key, as described in Section 4.2.1.2 of [RFC5280].

That ain’t no 20 bits. 

2. In §5.12,

                                                             This is to
   avoid a race condition when a BGP announcement is received between a
   withdrawn ASPA PDU and a newly announced ASPA PDU.  Therefore, the
   cache MUST deliver the complete data of an ASPA record in a single
   ASPA PDU.

I found this to be more confusing than helpful. Maybe it’s just the BGP implementor in me, but the implicit withdraw semantics specified earlier in the paragraph are as natural as breathing; breaking in to justify them makes me briefly strip a gear. IMO these sentences could be removed (but do as you will).

3. Again in §5.12, next paragraph,

   The router should see at most one ASPA for a given AFI from a cache
   for a particular Customer Autonomous System Number active at any
   time. 

Isn’t it considerably stronger than “should”? In the previous paragraph you say ASPAs have implicit-withdraw semantics, so if that’s properly implemented it’s not physically possible to see more than one ASPA for a given (AFI, Customer ASN) from a cache at any given time. 

So depending on what you’re trying to do with this sentence, you could write it something like “Because of the semantics defined above, the router can see at most…” 

3. You introduce a new field (the AFI Flags field of the ASPA PDU) with reserved bits, but don’t define a registry for allocating those bits. I suppose this is on purpose, given the practice in this WG of issuing whole new protocol specs instead of piecemeal extensions?

4. A bit picky perhaps (who, ME?), but (still in §5.12) in

   The Customer Autonomous System Number is the 32-bit Autonomous System
   Number of the customer which authenticated the PDU.  For a given AFI,
   there MUST be one and only one ASPA for a Customer Autonomous System
   Number active in the router at any time.

I don’t think the customer authenticated the *PDU* per se, did they? They signed something in the RPKI, I guess, which the cache scraped up and turned into a PDU. The customer has no direct connection with the PDU.

5. It seems as though this, in §7, could be tightened up:

   If a cache which supports version N receives a query from a router
   which specifies its highest supported version Q < N, the cache MUST
   downgrade to protocol version Q [RFC6810] or [RFC8210] or send a
   version 2 Error Report PDU with Error Code 4 ("Unsupported Protocol
   Version") and terminate the connection; in which case the Arbitrary
   Bytes field of the ERROR Report PDU MUST be a list of one octet
   binary integers indicating the version numbers the cache supports.
   The router MUST choose the highest mutally supported version.  If
   there are none, the router MUST abort the session, sending a version
   2 Error Report PDU with Error Code 4 ("Unsupported Protocol
   Version").

Perhaps something like

   If a cache which supports version N receives a query from a router
   which specifies its highest supported version Q < N, the cache MUST
   downgrade to protocol version Q [RFC6810] or [RFC8210]. If version
   Q is not supported, the router MUST abort the session, sending a version
   2 Error Report PDU with Error Code 4 ("Unsupported Protocol
   Version") and Arbitrary Bytes field a list of one octet
   binary integers indicating the version numbers the cache supports.

6. In §13, “Reliable transport protocols (i.e. not raw TCP)” kind of bugged me, since TCP is the canonical exemplar of a reliable transport. Do you want some adjective other than “reliable”? 

Nits:

s/epoc/epoch/

s/      specified address family other ASes./      specified address family to other ASes./ (you’re missing a “to”)

s/Vidated/Validated/ (2 places)

S/mutally/mutually/

I favor the pluralization “ASes” over “ASs”, not for juvenile reasons but because “AS” is typically read “ay ess” and “ess” is pluralized “esses”. Anyway I’m sure the RFC Editor has their own opinions about this, do as you will.
Murray Kucherawy
No Objection
Comment (2022-06-16 for -09) Sent
I support Martin's DISCUSS.

The shepherd writeup, in question one, asks "What type of RFC is being requested (BCP, Proposed Standard, Internet Standard, Informational, Experimental, or Historic)? Why is this the proper type of RFC? Is this type of RFC indicated in the title page header?"  Only the first of these questions is answered.  Also the answer to the question on document quality describes what this document does, but appears not to answer directly the question about existing implementations of the extension/update.

The SHOULD at the end of Section 4 is bare; why would one not try to reestablish a connection?  I think there ought to be some guidance here, or change it to a MUST or MAY.

Nit:

* Section 5.1: s/epoc/epoch/
Paul Wouters
No Objection
Comment (2022-06-15 for -09) Not sent
I support the three discusses, especially Roman's remark about copying in security text from two previous RFC documents without updating, and Martin's remark about Update vs Obsolete.
Roman Danyliw
(was Discuss) No Objection
Comment (2022-06-16 for -09) Sent for earlier
Thanks for addressing my DISCUSS and COMMENT feedback.
Zaheduzzaman Sarker
No Objection
Comment (2022-06-15 for -09) Not sent
Thanks for working on this -bis. 

I would also like to know why this is not obsoleting 8210. So supporting Martin and Rob's discusses.
Alvaro Retana Former IESG member
No Objection
No Objection (2022-06-15 for -09) Sent
§5.12

These two paragraphs seem to describe the same case (withdraw) but specify different behavior:

   The announce/withdraw flag set to 0 indicates removal of the ASPA 
   record in total.  Here, only the AFI and the customer AS of the ASPA 
   record MUST be provided, the Provider AS Count as well as the Provider 
   AS Numbers list MUST be zero.
   ...
   Receipt of an ASPA PDU with the Flags field indicating Delete is an
   explicit withdraw from the router of the entire ASPA data for that
   customer AS.  While the Provider AS Count and the Provider AS Numbers
   MUST be ignored by the router when the Flags field indicates a
   Delete, the cache MUST set the Provider AS Count to zero, and have a
   null Provider AS Numbers list.


Both require the setting of the Provider AS Count and Provider AS Numbers fields, but only the latter says that these fields "MUST be ignored".  Is that the expected behavior?  Please specify the behavior only once.

If the router is not required to ignore the fields (as initially specified), what should the receiver do if the Provider AS Count or the Provider AS Numbers are not 0?  Should that trigger an error?
Andrew Alston Former IESG member
(was Discuss, No Objection) No Objection
No Objection (2022-06-19 for -10) Sent
Many thank's for the update that addresses my previous discuss, I appreciate the willingness of the authors to work to resolve the issues that have been seen so expeditiously.

As such, am clearing the discuss.
Lars Eggert Former IESG member
No Objection
No Objection (2022-06-16 for -09) Sent
# GEN AD review of draft-ietf-sidrops-8210bis-09

CC @larseggert

Thanks to Stewart Bryant for the General Area Review Team (Gen-ART) review
(https://mailarchive.ietf.org/arch/msg/gen-art/XAvuQVjZUdo06MI1SBmOKlCSHy8).

## Comments

### "Abstract", paragraph 2
```
     This document describes version 2 of the RPKI-Router protocol.  RFC
     6810 describes version 0, and RFC 8210 describes version 1.  This
     document updates and replaces RFC 8210.
```
Agree with Martin's DISCUSS and Roman's comment. The document relationships are
unclear.

### Section 5.11, paragraph 6
```
     If the error is associated with a PDU of excessive length, i.e., too
     long to be any legal PDU other than another Error Report, or a
     possibly corrupt length, the Erroneous PDU field MAY be truncated.
```
Would it be useful to make this truncation explicit to the receiver of the error
report?

### Section 8.1, paragraph 1
```
     When a transport connection is first established, the router MUST
     send either a Reset Query or a Serial Query.  A Serial Query would be
     appropriate if the router has significant unexpired data from a
     broken session with the same cache and remembers the Session ID of
     that session, in which case a Serial Query containing the Session ID
```
How can a router tell if the unexpired data from a broken session with the same
cache it still holds is "significant"?

### Section 9, paragraph 6
```
     If unprotected TCP is the transport, the cache and routers MUST be on
     the same trusted and controlled network.
```
Do we really need to allow unencrypted TCP as a transport for this in 2022? This
seems risky even on a "trusted and controlled network", even if the protocol
wasn't used for disseminating sensitive information.

### Section 9, paragraph 9
```
     *  Caches and routers MAY use TCP MD5 transport [RFC2385] using the
        rpki-rtr port if no other protected transport is available.  Note
        that TCP MD5 has been obsoleted by TCP-AO [RFC5925].
```
Do we really still need to allow TCP MD5, which has been obsoleted for over a
decade and has had known issues for much longer than that?

### DOWNREFs

DOWNREF `[I-D.ietf-sidrops-aspa-profile]` from this Proposed Standard to
`draft-ietf-sidrops-aspa-profile` of unknown standards level. (Chairs should
indicate the standards level of draft-ietf-sidrops-aspa-profile in
the datatracker, since the one in its text is under author control.)

## Nits

All comments below are about very minor potential issues that you may choose to
address in some way - or ignore - as you see fit. Some were flagged by
automated tools (via https://github.com/larseggert/ietf-reviewtool), so there
will likely be some false positives. There is no need to let me know what you
did with these suggestions.

### Typos

#### Section 4, paragraph 10
```
-    with twoo many failed requests.
-          -
```

#### Section 5.5, paragraph 5
```
-    This PDU carries an [RFC6811] Vidated ROA Payload (VRP) for an IPv4
+    This PDU carries an [RFC6811] Validated ROA Payload (VRP) for an IPv4
+                                   ++
```

#### Section 5.7, paragraph 2
```
-    This PDU carries an [RFC6811] Vidated ROA Payload (VRP) for an IPv6
+    This PDU carries an [RFC6811] Validated ROA Payload (VRP) for an IPv6
+                                   ++
```

#### Section 7, paragraph 2
```
-    The router MUST choose the highest mutally supported version.  If
+    The router MUST choose the highest mutually supported version.  If
+                                          +
```

### Section 5.1, paragraph 2
```
     Protocol Version:  An 8-bit unsigned integer, currently 2, denoting
        the version of this protocol.
```
"Currently 2" reads a bit odd. Maybe "2 for this version of the protocol"?

### Uncited references

Uncited references: `[RFC8126]`.

### Outdated references

Reference `[RFC2385]` to `RFC2385`, which was obsoleted by `RFC5925` (this may
be on purpose).

## Notes

This review is in the ["IETF Comments" Markdown format][ICMF], You can use the
[`ietf-comments` tool][ICT] to automatically convert this review into
individual GitHub issues. Review generated by the [`ietf-reviewtool`][IRT].

[ICMF]: https://github.com/mnot/ietf-comments/blob/main/format.md
[ICT]: https://github.com/mnot/ietf-comments
[IRT]: https://github.com/larseggert/ietf-reviewtool
Martin Duke Former IESG member
(was Discuss) No Objection
No Objection (2022-06-16 for -09) Sent
I've lifted the DISCUSS about Updates vs. Obsoletes -- my personal preference would be for this to obsolete v1 and v0, as v2 is a strict improvement on those, and make it very clear in the abstract and introduction that v0/v1 are likely to persist for a long time. This would be clearest to the wider IETF community about what is happening -- observers do not expect that devices that run obsoleted protocol versions are immediately thrown in the dumpster, but that new deployments will use the latest version.

However, I now understand that this community understands the term differently, unfortunately. So if they feel very strongly that it should be "Updates", I'm not going to block the document over it.

Older commments:

Thanks to Michael Tuexen for the TSVART review.

(Section 6) "Caches MUST set Expire Interval to a value larger than either Refresh Interval or Retry Interval."

This would seem to allow a configuration where Refresh > Expire > Retry, which would result in records constantly timing out without a refresh, no?

[Not a DISCUSS because it's a holdover from 8210]

Nits:
(Section 4) s/twoo/too

(Section 10) s/server/cache
Robert Wilton Former IESG member
(was Discuss) No Objection
No Objection (2022-06-17 for -10) Sent
[Updated ballot to clear my Discuss - thanks for accommodating my concern.]

1. 
I also support Martin's Discuss - I don't understand why publishing v2 of this protocol wouldn't obsolete v1 (on the assumption that new deployments should use v2 in preference to v1).

2. Section 5.1:
      Using persistent storage for the Session ID or a clock-based
      scheme for generating Session IDs should avoid the risk of Session
      ID collisions.

      The Session ID might be a pseudorandom value, a strictly
      increasing value if the cache has reliable storage, et cetera.  A
      seconds-since-epoch timestamp value such as the POSIX time()
      function makes a good Session ID value.
      
Given that session-id is only 16 bits, it feels like both pseudorandom and using POSIX time() could quite reasonably cause occasional collisions.  Using persistent storage to generate the session-id would likely decrease the frequency of collisions.  Another idea could be to also randomly generate the starting serial number, since this would seem to greatly increase the entropy and reduce the chance for a collision.

3.  Section 5.10.  Router Key

I found the "Router Key" PDU hardest to understand what it is and what it is for.  I would suggest having a bit more text before the packet field diagram to indicate what this PDU is.

4. Section 8.3:
   When a router receives this kind of error, the router SHOULD attempt
   to connect to any other caches in its cache list, in preference
   order.  If no other caches are available, the router MUST issue
   periodic Reset Queries until it gets a new usable load from the
   cache.

Are these periodic Reset Queries still expected to be "once per 2 hours" (as per section 6)?


5. section: 7.  Protocol Version Negotiation

   downgrade to protocol version Q [RFC6810] or [RFC8210] or send a
   version 2 Error Report PDU with Error Code 4 ("Unsupported Protocol
   Version") and terminate the connection; in which case the Arbitrary
   Text field of the ERROR Report PDU MUST be a list of one octet binary
   integers indicating the version numbers the cache supports.

My reading of this text is it is overwriting what a client would expect to be a text field with binary data.  At best, an peer supporting an older version wouldn't understand this and more plausibly it would regard it as being badly formed.

Why not encode the supported versions in a comma separated values in the error string?  Alternatively you could just keep a simpler versioning strategy of the client trying earlier versions in decreasing sequence until it finds a version that both are compatible with.
   
6. Section 9.  Transport

   *  Caches and routers MAY use TCP MD5 transport [RFC5925] using the
      rpki-rtr port.  Note that TCP MD5 has been obsoleted by TCP-AO
      [RFC5925].
      
Since this is describing version 2 of the protocol, why not remove this option to reduce the list of "more protected protocols".  In fact, given this is a new protocol version, I was wondering whether it would be feasible to reduce this list further, but perhaps that puts too much burden on implementations.

7. Is there any YANG configuration model, or plans to generate one, to manage the required device configuration?  Perhaps this is being done as part of the BGP YANG model?

Nit:

      RPKI data.  While a cache is receiving updates, new incoming data
      and implicit deletes are associated with the new serial but MUST
      NOT be sent until the fetch is complete.
      
new serial => new serial number

Regards,
Rob