Operations, Administration, and Maintenance (OAM) in Segment Routing Networks with IPv6 Data plane (SRv6)
draft-ietf-6man-spring-srv6-oam-11

Summary: Has 3 DISCUSSes. Has enough positions to pass once DISCUSS positions are resolved.

Benjamin Kaduk Discuss

Discuss (2021-06-02)
In Section 3.1.2, we say that:

   o  If the target SID (2001:db8:B:4::) is not locally instantiated,
      the packet is discarded

However, RFC 8754 §4.3.2 seems to say that the next header is processed
in this case.   Only if the target SID is both not locally instantiated
and does not represent a local interface will the packet be discarded,
if I understand correctly.  (Similarly for the analogous statement in
§3.2.2.)


There are also quite a few other internal inconsistencies in this document,
such as copy/paste chunks that refer to "N4" as executing a given SID in
a scenario where it is actually a different node doing so, many
instances where a given IP address or SID does not match up with the
addressing structure listed in Section 1.3, places where we seem to say
that an SR ingress node can be a classic IPv6 node that lacks SRv6
capabilities, etc.  Individually, many of
these would be nit-level (and indeed are called out specifically in the
NITS section of my ballot comment), but collectively they seem to
indicate a document that is not yet in publishable state.

Comment (2021-06-02)
Thanks to Dan Harkins for the secdir review and Stig Venaas for the
opsdir review (with observation on the security considerations) and the
authors for updating in response to them.  I support Roman's discuss
position to make the remaining updates.

The setup introducing a couple of the examples mentions the assumed link
metric on the links in question, but none of the procedures we describe
actually make use of the assumed metric information -- it seems we could
just as easily omit mention of it.

Section 1.3

      Nodes N3, N5 and N6 are IPv6 nodes that are not SRv6-capable.
      Such nodes are referred as classic IPv6 nodes.

Can I suggest using a different adjective than "classic" for this, perhaps
"standard"?  The word "classic" can come with some connotations of being
old, outdated, or legacy, and I don't think we have IETF consensus that
SRv6 is the next evolution of IPv6 that relegates "classic IPv6" to such
a legacy status.

      A SID at node k with locator block 2001:db8:B::/48 and function F
      is represented by 2001:db8:B:k:F::.
   [...]
      2001:db8:B:k:Cij:: is explicitly allocated as the END.X SID at
      node k towards neighbor node i via jth Link between node i and
      node k.  e.g., 2001:db8:B:2:C31:: represents END.X at N2 towards

What is the "function F" in this example?  My understanding was that the
function would correspond to just End.X (and thus the value "C" for that
16-bit component), with the "ij" information needing to be in the
"argument".

Section 2.1.1

   The O-flag in SRH is used as a marking-bit in the user packets to
   trigger the telemetry data collection and export at the segment
   endpoints.

If it's to be set in "user packets", who exactly is it that sets the mark?

   controller for monitoring and analytics.  Similarly, without the loss
   of generality, this document assumes requested information elements
   are configured by the management plane through data set templates
   (e.g., as in IPFIX [RFC7012]).

Can we say anything about the scope for which the set of requested
information elements are configured?  I mostly assume that it's expected
to, say, configure node A to export some information on seeing the
O-flag, and configure node B to export different information on seeing
the O-flag.  But can it be configured at a finer granularity than just
"node", e.g., at a per-flow level?

   If the telemetry data from the ultimate segment in the segment-list
   is required, a penultimate segment SHOULD NOT be a Penultimate
   Segment Pop (PSP) SID.  When the penultimate segment is a PSP SID,
   the SRH will be removed and the O-flag will not be processed at the
   ultimate segment.

This looks like a statement of fact to me, with no need to strengthen it
by normative language.  If you need telemetry from the ultimate segment,
and PSP is used, you won't get telemetry based on the SRH O-flag, and
that's that.

Section 2.2

   IPv6 OAM operations can be performed for any SRv6 SID whose behavior
   allows Upper Layer Header processing for an applicable OAM payload
   (e.g., ICMP, UDP).

What options are available for SIDs whose behavior does not allow such
ULH processing?

   to the correct outgoing interface.  To exercise the behavior of a
   target SID, the OAM operation SHOULD construct the probe in a manner
   similar to a data packet that exercises the SID behavior, i.e. to
   include that SID as a transit SID in either an SRH or IPv6 DA of an
   outer IPv6 header or as appropriate based on the definition of the
   SID behavior.

I think I understand what is meant by putting the target SID as a
transit SID in a SRH (or encapsulated analogue).  I'm not sure how this
would specifically validate that the node is exercising the behavior in
question (End.X here, so switching to the proper outgoing interface).
That is, if I'm trying to ping a given SID and confirm that it does
End.X properly, how does just adding an SRH confirm the cross-connect
part if I am still pinging that SID?  Do I need to target the ping at
the "next" SID in the SID list in order to actually use the
cross-connect?

Section 3.1.1

   IGP metric set to 100.  User issues a ping from node N1 to a loopback
   of node 5, via segment list <2001:db8:B:2:C31::, 2001:db8:B:4:C52::>.

How does a *user* issue a ping (directly) with an associated segment
list?  This seems to no longer be a "classic ping" mechanism as written
... perhaps there is some automated component in the system that
translates what the user put in the packet into a segment list?  Or
should we reference some ping utility that is patched to accept
segment-list input in the vein of Figure 2?
(A similar comment applies to the traceroute analogue in Section 3.2.1.)

   o  The echo request packet at N5 arrives as an IPv6 packet with or
      without an SRH.  If N5 receives the packet with SRH, it skips SRH
      processing (SL=0).  In either case, Node N5 performs the standard
      ICMPv6 processing on the echo request and responds with the echo
      reply message to N1.  The echo reply message is IP routed.

I think the tsv-art reviewer's remark, that "the echo reply message is IP
routed" could hide quite a lot, is pretty accurate.  It seems worth
stating clearly that it does not have an SRH so that the reader can work
through the consequences for deployments where there is no native IP
connectivity for the return path.

Section 3.2

   operation at the classic IPv6 nodes in an SRv6 network.  That
   includes the classic IPv6 node with ingress, egress or transit roles.

I'm a little confused at what it would mean for a "classic IPv6" node to
act as the *ingress* role.  What is ingress, if not entrance to the SR
domain and applying an SRH?

Section 3.2.1, 3.2.2

Where do the 2001:db8:1:2:21::, 2001:db8:2:3:31::, 2001:db8:3:4::41::,
2001:db8:4:5::52:: come from?  They don't seem to match the stated
structure for "nth link between node i and j at the i side"
(2001:db8:i:j:in::); was there supposed to be a separate case for "at
the j side" in the terminology section?
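
As a rough illustration of the mismatch, the following Python sketch
builds the address that the Section 1.3 "i side" convention would give
and compares it with the four addresses listed above.  The helper names
and the (i, j, n) values guessed for each address are assumptions made
for this sketch, not taken from the draft.  It also shows that two of
the listed strings contain a second "::" and so do not parse as IPv6
addresses at all.

```python
import ipaddress


def i_side_link_address(i: int, j: int, n: int) -> str:
    """Address of the nth link between node i and node j at the i side,
    following the 2001:db8:i:j:in:: convention quoted from Section 1.3."""
    return f"2001:db8:{i}:{j}:{i}{n}::"


def is_valid_ipv6(text: str) -> bool:
    """True if the standard library accepts the string as an IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
        return True
    except ipaddress.AddressValueError:
        return False


# (address as listed above, assumed i, assumed j, assumed n)
listed = [
    ("2001:db8:1:2:21::", 1, 2, 1),
    ("2001:db8:2:3:31::", 2, 3, 1),
    ("2001:db8:3:4::41::", 3, 4, 1),
    ("2001:db8:4:5::52::", 4, 5, 2),
]

report = []
for address, i, j, n in listed:
    expected = i_side_link_address(i, j, n)
    # Record: listed string, whether it parses, the i-side form, whether they agree.
    report.append((address, is_valid_ipv6(address), expected, address == expected))

for row in report:
    print(row)
```

Under these assumptions none of the four strings equals the i-side
form; the fifth group looks like "jn" rather than "in", which is
consistent with a missing "at the j side" rule.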

Section 3.3

   o  The controller N100 processes and correlates the copy of the
      packets sent from nodes N1, N4 and N7 to find segment-by-segment
      delays and provide other hybrid OAM information related to packet
      P1.

If the controller is going to coalesce timestamped data from the various
SRv6 nodes, we need to have some discussion of the requirement for
synchronized time across the SR domain (or controller knowledge of time
offsets, which is essentially equivalent).
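
To show why this matters, here is a minimal Python sketch of the
correlation step.  The `Export` record, the `segment_delays` helper, and
all timestamp and offset values are hypothetical; the draft does not
define an export format.  The sketch only assumes that each node
reports a timestamp from its own clock and that the controller may or
may not know each clock's offset.

```python
from dataclasses import dataclass


@dataclass
class Export:
    node: str          # exporting node, e.g. "N1"
    timestamp_ns: int  # timestamp taken from that node's local clock


def segment_delays(exports, clock_offset_ns):
    """Return (from_node, to_node, delay_ns) for consecutive exports after
    mapping each local timestamp onto a common timeline, i.e. local time
    minus the offset known for that node."""
    corrected = [(e.node, e.timestamp_ns - clock_offset_ns[e.node]) for e in exports]
    return [
        (a_node, b_node, b_time - a_time)
        for (a_node, a_time), (b_node, b_time) in zip(corrected, corrected[1:])
    ]


# Hypothetical values: N4's clock runs 2 ms ahead of the clocks at N1 and N7.
exports = [Export("N1", 1_000_000), Export("N4", 4_500_000), Export("N7", 4_000_000)]
no_correction = {"N1": 0, "N4": 0, "N7": 0}
known_offsets = {"N1": 0, "N4": 2_000_000, "N7": 0}

naive = segment_delays(exports, no_correction)
fixed = segment_delays(exports, known_offsets)
print(naive)  # the N4 -> N7 delay comes out negative
print(fixed)  # both segments come out at 1.5 ms
```

Without the offset the second segment appears to have negative delay,
which is the kind of artifact that unsynchronized clocks would produce
in the controller's segment-by-segment results.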

Section 5

   service attack.  Additionally, SRH Flags are protected by the HMAC
   TLV, as described in Section 2.1.2.1 of [RFC8754].

I think we should remind the reader that the HMAC protection is very
coarse-grained, and that once an HMAC exists that allows setting the
O-flag for a given segment list, it can be used to produce an arbitrary
amount of such traffic.

   This document does not impose any additional security challenges to
   be considered beyond security threats described in [RFC4884],
   [RFC4443], [RFC0792], and [RFC8754].

We might add RFC 8986 to this list, since we are using its endpoint
behaviors in our examples.

NITS

Section 1

   any nodes within an SRv6 domain.  Specifically, the document
   illustrates how a centralized monitoring system can monitor arbitrary
   SRv6 paths by creating the loopback probes that originates and
   terminates at the centralized monitoring system.

singular/plural mismatch "probes" vs "originates and terminates".

Section 1.3

      The IPv6 address of the nth Link between node i and j at the i
      side is represented as 2001:db8:i:j:in::, e.g., the IPv6 address
      of link6 (the 2nd link) between N3 and N4 at N3 in Figure 1 is
      2001:db8:3:4:32::.  Similarly, the IPv6 address of link5 (the 1st
      link between N3 and N4) at node 3 is 2001:db8:3:4:31::.

The expansion of "in" as the concatenation of node index i and link index
n was confusing to me on first reading.  Also, it's not entirely clear
what ordering is used for the "first link" between a given pair of
nodes, since the links themselves are given absolute indices as opposed
to indices scoped to a given pair of nodes.  (Why does 'i' need to be in
the address twice, anyway?)

      2001:db8:B:k:Cij:: is explicitly allocated as the END.X SID at
      node k towards neighbor node i via jth Link between node i and
      node k.  e.g., 2001:db8:B:2:C31:: represents END.X at N2 towards

I suggest using 'n' for the link again (instead of repurposing 'j' which
used to be a node).

(Also, RFC 8986 spells it "End.X" with two minuscule and two majuscule
letters.  Similarly for just "End", as "END" appears later on in the
document.)

Section 2.1.1

   When N receives a packet whose IPv6 DA is S and S is a local SID, the
   line S01 of the pseudo-code associated with the SID S, as defined in
   section 4.3.1.1 of [RFC8754], is appended as follows for the O-flag
   processing.

Seeing "S" right after "DA" primes me to think about "source", not
"SID"; would a different label be workable?
Also, I'd s/appended/appended to/

      Ref1: An implementation SHOULD copy and record the timestamp as
      soon as possible during packet processing. Timestamp or any other
      metadata is not
      carried in the packet forwarded to the next hop.

The pseudocode does not make the inclusion of timestamp optional for
what's sent to the OAM process, so s/or/and/

Section 3.1.1

   IGP metric set to 100.  User issues a ping from node N1 to a loopback
   of node 5, via segment list <2001:db8:B:2:C31::, 2001:db8:B:4:C52::>.

s/5/N5/

   Figure 2 contains sample output for a ping request initiated at node
   N1 to the loopback address of node N5 via a segment list

s/the loopback/a loopback/ (to match the previous paragraph).

   o  Node N2, which is an SRv6-capable node, performs the standard SRH
      processing.  Specifically, it executes the END.X behavior
      (2001:db8:B:2:C31::) and forwards the packet on link3 to N3.

I suggest "executes the End.X behavior indicated by the SID
2001:db8:B:2:C31::".  (Similarly for the other End.X behavior in this
example.)

      If 2001:db8:B:4:C52:: is a PSP SID, The penultimate node (Node N4)
      does not, should not and cannot differentiate between the data

s/T/t/

Section 3.1.2

      processing.  Specifically, it executes the END.X behavior
      (2001:db8:B:2:C31::) on the echo request packet.  If

[same comment as above]

Section 3.2.1

   If an SRv6-capable ingress node wants to traceroute to IPv6 address
   via an arbitrary segment list <S1, S2, S3>, it needs to initiate
   traceroute probe with an SR header containing the SID list <S1, S2,

s/initiate/initiate a/

   IPv6 MTU [RFC4443].  The SR header is also included in these ICMPv6
   messages initiated by the classic IPv6 transit nodes that are not
   running SRv6 software.  Specifically, a node generating ICMPv6

I'm not sure why "also" is appropriate here, since the initial example
involves N3, a classic IPv6 node, returning information for link3.  Is
that not a case of a classic IPv6 transit node that is not running SRv6
software and includes the SRH in the ICMPv6 replies that it generates?

   bound to END.X behavior 2001:db8:B:2:C31:: (link3).  Similarly, the
   information displayed for hop5 contains the incoming interface

There is no hop 5; this looks like hop 4 to me.

Section 3.2.2

   of traceroute mechanism.  The UDP encoded message to traceroute a SID
   uses the UDP ports assigned by IANA for "traceroute use".

I don't think we actually show the UDP port anywhere, so phrasing like
"would use the UDP ports assigned by IANA" might be more appropriate.

   o  When Node N2 receives the packet with hop-count > 1, it performs
      the standard SRH processing.  Specifically, it executes the END.X
      behavior (2001:db8:B:2:C31::) on the traceroute probe.  If
      2001:db8:B:2:C31:: is a PSP SID, node N4 executes the SID like any

N2 != N4; I assume this is a copy/paste issue and the "N4" should be
"N2".

   o  If the target SID (2001:db8:B:4::) is locally instantiated, the
      node processes the upper layer header.  As part of the upper layer
      header processing node N4 responses with the ICMPv6 message (Type:

s/responses/responds/

Section 3.3

      packet to be monitored via the hybrid OAM.  Node N1 sets O-flag
      during encapsulation required by policy P.  As part of setting the

"during the encapsulation"

      processing.  Specifically, it executes the END.X behavior
      (2001:db8:B:2:C31::) as described in [RFC8986] and forwards the

[same comment as in §3.1.1; also later in this section]

   o  When node N7 receives the packet P1 (2001:db8:A:1::,
      2001:db8:B:7:DT999::) (2001:db8:B:7:DT999::, 2001:db8:B:4:C52::,
      2001:db8:B:2:C31::; SL=0; O-flag=1; NH=IPv4)(IPv4
      header)(payload), it processes the O-flag.  As part of processing
      the O-flag, it sends a timestamped copy of the packet to a local
      OAM process.  The local OAM process sends a full or partial copy
      of the packet P1 to the controller N100.  The OAM process includes
      the recorded timestamp, additional OAM information like incoming
      and outgoing interface, etc. along with any applicable metadata.
      Node N4 performs the standard SRv6 SID and SRH processing on the

N4 != N7

John Scudder Discuss

Discuss (2021-06-02)
I can't get past the feeling this document is really two different documents mashed together. One is a Standards Track, 6man document, that defines the O-flag. All the meat of that document is in §2.1 and §2.2, it would make a nice, compact, readable 3 or 4 page RFC (maybe a little more once all the boilerplate was in). The other is an Informational, SPRING document, that talks about use cases at some length. It seems both cruel and counterproductive to force SRv6 implementors who are implementing the O-flag to read through the entire balance of the document just to be sure they haven't missed something important. Remember these documents will live a long time, and it seems irresponsible to clutter the essential document set with inessential use cases.

My suggestion is to split the document as outlined above.

Comment (2021-06-02)
Thanks to Stig Venaas for the Routing Directorate review.

I support Roman's discuss position.

1. The Security Directorate review (https://mailarchive.ietf.org/arch/msg/last-call/alEuF06kwZosmLsX5FkiBVwtG4k/) raises a concern about privacy that hasn't been responded to, either in email or in the draft. The review says:

   ```
      This draft defines a flag in the Segment Routing Header that when
   set will result a copy of the packet being made and forwarded for
   "telemetry data collection and export." That has tremendous security
   and privacy implications that are not mentioned at all in the Security
   Considerations. The Security Considerations just say that there's
   nothing here beyond those described in <list of other RFCs>. I don't
   think that's the case.
   
      Maybe I'm completely missing something but this sounds to me like
   it enables what we used to call "service spy mode" on a router-- take
   a flow and fork a copy off to someone else. I think there needs to be
   a lot more discussion of the implications of this.
   ```

   While the revision of the Security Considerations section may be sufficient to address the "tremendous security... implications" the reviewer raises, it does nothing to address the privacy question. The reviewer has raised a serious question and deserves a serious reply to it.

2. The examples introduced in §1.3 and used throughout use the problematic convention of distinguishing unspecified portions of the IPv6 addresses with capital letters (only), as in 2001:db8::A:k::/128 (which also has an extra :: in it, by the way). It seems self-evidently a bad idea to use valid hexadecimal characters for this purpose. Please, where you don't intend to provide a literal hex digit, use a value not in the set a-f,A-F. The cited example could be rewritten as 2001:db8:X:k::/128, for example.

   Alternately, since these are examples, shown in an example topology, it's not clear to me why you couldn't simply write them out fully in real hex.
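
To make the ambiguity concrete, here is a minimal Python sketch using
only the standard `ipaddress` module.  The sample strings are adapted
from the draft's examples, with the node index "k" replaced by 2 purely
for illustration; the `parses` helper is a name invented for this sketch.

```python
import ipaddress

# "B" and "C31" are intended as placeholders in the draft's examples, but they
# are also valid hexadecimal, so the string parses as a literal address.
literal = ipaddress.IPv6Address("2001:db8:B:2:C31::")
print(literal)  # shown in the library's canonical lower-case form


def parses(text: str) -> bool:
    """True if the standard library accepts the string as an IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
        return True
    except ipaddress.AddressValueError:
        return False


print(parses("2001:db8:X:2::"))   # "X" cannot be mistaken for a hex digit
print(parses("2001:db8::A:2::"))  # a second "::" is not valid in any address
```

A placeholder outside a-f/A-F fails to parse, so a reader or a tool
cannot silently take it for a literal value.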

3. You use RFC 2119 terminology incorrectly many places in the document. Recall that RFC 2119 defines SHOULD as:

   ```
   3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
      may exist valid reasons in particular circumstances to ignore a
      particular item, but the full implications must be understood and
      carefully weighed before choosing a different course.
   ```

   I generally use the rule of thumb that a SHOULD is almost a MUST. Here are the places you've used SHOULD:

   ```
         Ref1: An implementation SHOULD copy and record the timestamp as
         soon as possible during packet processing. Timestamp or any other
         metadata is not
         carried in the packet forwarded to the next hop.
   ```

   Can you suggest a "particular circumstance" in which it would be OK for an implementor to NOT copy and record the timestamp "as soon as possible"? It seems as though "as soon as possible" is already a near-infinite amount of rope to allow the implementor to do anything they want; do you really need to soften it further with SHOULD? I would say this should be MUST, or give it up and make it "should" in lower-case.

   ```
      If the telemetry data from the ultimate segment in the segment-list
      is required, a penultimate segment SHOULD NOT be a Penultimate
      Segment Pop (PSP) SID.  When the penultimate segment is a PSP SID,
      the SRH will be removed and the O-flag will not be processed at the
      ultimate segment.
   ```

   Since you are stating a law of physics here (well, figuratively) the SHOULD NOT seems especially inapt. If I've understood the document correctly, if the penultimate SID is a PSP SID then this scenario just won't work. So that's not SHOULD NOT, it's either MUST NOT or more appropriately, "must not" or "can't".

   ```
      The processing node SHOULD rate-limit the number of packets punted to
      the OAM process to a configurable rate.  This is to avoid hitting any
      performance impact on the OAM and the telemetry collection processes.
      Failure in implementing the rate limit can lead to a denial-of-
      service attack, as detailed in Section 5.
   ```

   This is a semi-legitimate SHOULD -- except, what is the "particular circumstance" in which it would be OK for an implementor to NOT rate-limit? Either state it (or at least provide some hints) or make this a MUST. (I note the Routing Directorate reviewer also asked about this and received no answer.)

   ```
      The OAM process is expected to be located on the routing node
      processing the packet.  Although the specification of the OAM process
      or the external controller operations are beyond the scope of this
      document, the OAM process SHOULD NOT be topologically distant from
      the routing node, as this is likely to create significant security
      and congestion issues. 
   ```

   I have no problem with this one :-o.

   ```
                                          To exercise the behavior of a
      target SID, the OAM operation SHOULD construct the probe in a manner
      similar to a data packet that exercises the SID behavior, i.e. to
      include that SID as a transit SID in either an SRH or IPv6 DA of an
      outer IPv6 header or as appropriate based on the definition of the
      SID behavior.
   ```

   Again, is there a "particular circumstance" in which the OAM operation can "exercise the behavior of a target SID" without doing this? If not, it's a MUST or (probably better) a "should".

4. In Section 2.1.1:

   ```
      The OAM process MUST NOT process the copy of the packet or respond to
      any upper-layer header (like ICMP, UDP, etc.) payload to prevent
      multiple evaluations of the datagram.
   ```

   "Process" is a very general term, taken literally this means the OAM process must not do anything at all with the copy it's given. Please be more specific about what you mean. I'd offer text if I had a good guess as to your intent, but I don't.

5. General comment, "classic IPv6". I find the recurrent use of the term "classic IPv6" along with "classic ping", "classic traceroute", etc, somehow jarring. We aren't marketing soft drinks! In most places you've used "classic" you could either provide no adjective at all (as with ping and traceroute) or replace with "SRv6-incapable". In some cases, the dichotomy isn't even needed, as in

   ```
      The existing mechanism to perform the reachability checks, along the
      shortest path, continues to work without any modification.  The
      initiator may be an SRv6 node or a classic IPv6 node.  Similarly, the
      egress or transit may be an SRv6-capable node or a classic IPv6 node.
   ```

   The last sentence could easily be rewritten something like  “Any IPv6 node can initiate, transit, and egress a ping packet.” Similarly, 

   ```
      Similarly, the egress node (IPv6 classic or SRv6-capable) does not
   ```

   could simply omit the parenthetical.

6. Nit:

   ```
      o  Node N1 initiates a traceroute probe packet with a monotonically
         increasing value of hop count and SRH as follows (2001:db8:A:1::,
   ```

   "Monotonic" doesn't mean "increasing by 1 each time", which is what you mean here. There are an infinite number of valid monotonic progressions that wouldn't work for traceroute.

7. Please define "USP" before use. You should probably just put "USP" and "PSP" in §1.2.

8. Nit:

   ```
      In the reference topology in Figure 1, N100 uses an IGP protocol like
      OSPF or ISIS to get the topology view within the IGP domain.  N100
   ```

   That should be "IS-IS" (with hyphen).

9. In your Security Considerations you say 

   ```
      This document does not define any new protocol extensions and relies
      on existing procedures defined for ICMPv6.
   ```

   Surely the O-flag, which you define, is a protocol extension.

Roman Danyliw Discuss

Discuss (2021-06-02)
The privacy implications of the O-flag need to be more clearly articulated.  It provides a dual use capability -- there is tangible benefit for OAM use cases, but it also reduces the friction for surveillance use cases.

The SECDIR review (https://mailarchive.ietf.org/arch/msg/secdir/FeTu7x7-okw7w7-T6dZRFhJHpAo/) pointed this out in -09.  The changes made to the Security Considerations in -10 were helpful, but primarily focused on reiterating the security assumptions of the SR domain boundary and the degree of protection of the SRH.  

My recommendation would be for an explicit Privacy Considerations section with the following (approximate) text:

NEW
7.  Privacy Considerations

The per-packet marking capabilities of the O-flag provide a granular mechanism to collect telemetry.  When this collection is deployed by an operator with knowledge and consent of the users, it will enable a variety of diagnostics and monitoring to support the OAM and security operations use cases needed for resilient network operations.  However, this collection mechanism will also provide an explicit protocol mechanism to operators for surveillance and pervasive monitoring use cases done contrary to the users’ consent.

Comment (2021-06-02)
Thank you to Dan Harkins for the SECDIR review.

** Section 5.  Even with the trust assumptions of the SR domain, it would be worth mentioning that:

The security properties of the channel used to send exported packets marked by the O-flag will depend on the specific OAM processes used.  An on-path attacker able to observe this OAM channel could conduct traffic analysis, or potentially eavesdropping (depending on the OAM configuration), of this telemetry for the entire SR domain from such a vantage point.  

** Section 5.  Per “Additionally, SRH Flags are protected by the HMAC TLV, as described in Section 2.1.2.1 of [RFC8754]”, I didn’t follow what this was referring to.  Also, isn’t this TLV optional?

Erik Kline Yes

Alvaro Retana No Objection

Comment (2021-06-01 for -10)
Just a couple of minor comments -- no need to reply.

(1) §3.3: The mention of/comparison with draft-ietf-ippm-ioam-data seems unnecessary. 

(2) rfc8174 should be a Normative reference.

Francesca Palombini No Objection

Comment (2021-06-02)
Thank you for the work on this document. A couple of minor comments below.

Francesca

1. -----

FP: I believe the document could use a sentence in the terminology pointing the reader to references that describe the terminology used (such as "Customer Edge", "locator block", "function F", "controller", "END.X", "SegmentsLeft") - No need to transcribe each term, but it would add clarity to state "The reader is expected to be familiar with terms and concepts from .... "

2. -----

FP: Additionally, I think it would make sense to have RFC 8986 as normative reference.

3. -----

   When N receives a packet whose IPv6 DA is S and S is a local SID, the
   line S01 of the pseudo-code associated with the SID S, as defined in
   section 4.3.1.1 of [RFC8754], is appended as follows for the O-flag
   processing.

FP: suggestion to rephrase so that S01.1 becomes the subject "S01.1 is appended to the line S01 ... "

4. -----

   The processing node SHOULD rate-limit the number of packets punted to
   the OAM process to a configurable rate.  This is to avoid hitting any

FP: Should this document define a default to this rate? Or is this defined somewhere else?

5. -----

      2001:db8:B:2:C31:: is a PSP SID, node N4 executes the SID like any
      other data packet with DA = 2001:db8:B:2:C31:: and removes the

FP: (Section 3.1.2) s/N4/N2

6. -----

   2001:db8:B:7:DT999::.  2001:db8:B:7:DT999:: is a USP SID.  N1, N4,

FP: Please define (or reference where it is defined) USP SID

7. -----

   Controller N100 with the help from nodes N1, N4, N7 and implements a
   hybrid OAM mechanism using the O-flag as follows:

FP: "and" seems quite out of place.

8. -----

      Node N4 performs the standard SRv6 SID and SRH processing on the
      original packet P1.  Specifically, it executes the VPN SID

FP: (Section 3.3) s/N4/N7

Lars Eggert No Objection

Comment (2021-05-26 for -10)
Section 3.3, paragraph 3, nit:
-    The illustration is different than the In-situ OAM defined in [I.D-
-                                                                     --
-    draft-ietf-ippm-ioam-data].  This is because In-situ OAM records
-    ------
+    The illustration is different than the In-situ OAM defined in [I-D.
+                                                                    ++
+    ietf-ippm-ioam-data].  This is because In-situ OAM records

Section 3.3, paragraph 3, nit:
-    traverses a path between two points in the network [I.D-draft-ietf-
-                                                          --------
+    traverses a path between two points in the network [I-D.ietf-
+                                                         ++

Matt Joras flagged these broken references to draft-ietf-ippm-ioam-data
in his GenART review; I guess they were not fixed in -10 after all?

-------------------------------------------------------------------------------

All comments below are about very minor potential issues that you may choose to
address in some way - or ignore - as you see fit. Some were flagged by
automated tools (via https://github.com/larseggert/ietf-reviewtool), so there
will likely be some false positives. There is no need to let me know what you
did with these suggestions.

Section 3.2.1, paragraph 3, nit:
-    via an arbitrary segment list <S1, S2, S3>, it needs to initiate
+    via an arbitrary segment list <S1, S2, S3>, it needs to initiate a
+                                                                    ++

Section 1.3, paragraph 13, nit:
> convenient. * (payload) represents the the payload of the packet. 2. OAM Mech
>                                    ^^^^^^^
Maybe you need to remove one determiner so that only "the" or "the" is left.

Section 3.2.2, paragraph 5, nit:
> rates a hybrid OAM mechanism using the the O-flag. Without loss of the genera
>                                    ^^^^^^^
Maybe you need to remove one determiner so that only "the" or "the" is left.

Document references draft-gandhi-spring-stamp-srpm-04, but -06 is the latest
available revision.

Document references draft-ietf-ippm-ioam-data-11, but -12 is the latest
available revision.

Document references draft-matsushima-spring-srv6-deployment-status-10, but -11
is the latest available revision.

Martin Duke (was Discuss) No Objection

Comment (2021-06-02)
Thank you for addressing the TSVART comments (and to Magnus for the review) and my DISCUSS.

I see that some of the contributors are in common, but this would appear to have substantial overlap with https://datatracker.ietf.org/doc/draft-ietf-ippm-ioam-direct-export/.

Martin Vigoureux No Objection

Comment (2021-06-01 for -10)
Hi,

I'm only reporting a few nits that I think I have found. I may have a couple of questions later.


      TC: Traffic Class.
this is apparently not used in the document.
Also, IGP, OSPF and ISIS are in fact well-known abbreviations.

I find this:
   The information elements include parts of the packet header and/or
   parts of the packet payload for flow identification.
Slightly in contradiction with:
   This document does not specify the data elements that need to be
   exported

   If a node supports the O-flag, it can optionally advertise its
   potential via control plane protocol(s).
Should an Informative reference, for example to draft-ietf-idr-bgpls-srv6-ext, be added here?

The traceroute examples (Fig 3 and 4) don't seem to follow the 2001:db8:i:j:in:: convention.

In 3.1.2 and 3.2.2
   node N4 executes the SID
I think it is N2


s/2001:db8::A:k::/128/2001:db8:A:k::/128/

s/B5::/2001:db8:A:5::/

s/2001:db8:4:5::52::/2001:db8:4:5:52::/

s/node 5/node N5/ (twice)
s/node 3/node N3/

Murray Kucherawy No Objection

Comment (2021-06-02)
The shepherd writeup omits an answer to the question "Why is this the proper type of RFC?"

Although "TC" and "IS-IS" are defined in the glossary in Section 1.2, they appear in no other sections.  (It's "ISIS" elsewhere.)

In Section 1.3, I don't think you can start a sentence with "e.g."

In Section 2.1: It says, "The processing node SHOULD rate-limit the number of packets ...", but no guidance is given on what a good rate might be.
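
For context on what "a configurable rate" could look like, here is a minimal token-bucket sketch in Python. The class name, the parameters, and the figures of 100 punts per second with a burst of 5 are all hypothetical; the draft does not specify an algorithm or a default, and this is not offered as a recommended value.

```python
class TokenBucket:
    """Admit at most `rate` packets per second on average, with bursts of
    up to `burst` packets. Time is passed in explicitly so the sketch is
    deterministic."""

    def __init__(self, rate: float, burst: float, now: float = 0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start with a full bucket
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical configuration: 100 punts per second, burst of 5.
bucket = TokenBucket(rate=100.0, burst=5.0)

# 20 O-flag-marked packets arrive at the same instant; only the burst is punted.
punted = sum(bucket.allow(now=0.0) for _ in range(20))

# Half a second later the bucket has refilled, so punting resumes.
later = bucket.allow(now=0.5)
print(punted, later)
```

Whatever mechanism is used, the open question from the text stands: the two knobs here (rate and burst) need either defaults or guidance for operators.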

Robert Wilton No Objection

Comment (2021-06-03)
Hi,

Thanks for this document.  I would also like to thank Dan for the Opsdir review.

Similarly to how some fellow ADs have commented, I found parts of this document tricky to read, particularly around the use of IPv6 addresses and trying to understand what is fixed and what is variable.  I suspect that the terminology is clear to engineers who are very familiar with SRv6, but is perhaps more opaque to less familiar readers.  I've added a few suggestions on how this could possibly be clarified (at the authors' discretion):

(1) I wonder whether the diagrams would be easier to understand if perhaps the variables used angle brackets or '$' to indicate that they are a variable.  E.g. rather than: "2001:db8:i:j:in::", would it be clearer as "2001:db8:<i>:<j>:<i><n>::" or  perhaps "2001:db8:$i:$j:$i$n::"?

(2) In other cases the locator block is defined as 2001:db8:B::/48, but it isn't defined what B is.  I presume that this identifies the block, but this should be clarified, and perhaps giving a concrete example Block number might be helpful.

(3) Similarly, "A" in the loopback address does not appear to be defined.  Nor is it clear what "C" is, in the context of "Cij" or "C23".
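
To illustrate suggestion (1), the bracketed template can be expanded mechanically once the variables are marked. This is a minimal sketch, assuming that i and j are node numbers and n is a link index written as decimal digits; the function name link_address and the example values are hypothetical:

```python
import ipaddress


def link_address(i: int, j: int, n: int) -> str:
    """Expand the template 2001:db8:<i>:<j>:<i><n>:: for concrete values."""
    # Assumption: the fifth group is the digits of i followed by the digits of n.
    text = f"2001:db8:{i}:{j}:{i}{n}::"
    # Raises ipaddress.AddressValueError if the result is not a valid address.
    ipaddress.IPv6Address(text)
    return text


# Example values only: node 3, neighbor 4, second link.
print(link_address(3, 4, 2))  # 2001:db8:3:4:32::
```

With the variables made explicit in this way, a reader can see at a glance which parts of an address are fixed ("2001:db8") and which are substituted.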

Separately, I also have some sympathy with John's discuss ballot where a separate Stds Track doc for the OAM flag vs an Informational draft for the Ping and traceroute examples might have been clearer.  But conversely, I can also see clear benefit in having all of the SRv6-related OAM work in one document.  The document may benefit from having a clearer distinction between the normative parts of the document that define new functionality and the illustrative parts of the document that just provide examples.

In particular, I would suggest:
 - Changing the ordering/wording of both the abstract and introduction to describe the OAM flag first, and then the ping/traceroute examples second, which also follows the order that the concepts are discussed in the document.
 - Perhaps be explicit at the beginning of section 3 that the text is not normative and only provides examples of what the existing mechanisms look like when applied to SRv6.
 - Perhaps move section 1.3 to section 3, if I am right that the example is only applicable to section 3?

A couple of comments related to PSP handling:

In section 2.1.1:
   If the telemetry data from the ultimate segment in the segment-list
   is required, a penultimate segment SHOULD NOT be a Penultimate
   Segment Pop (PSP) SID.  When the penultimate segment is a PSP SID,
   the SRH will be removed and the O-flag will not be processed at the
   ultimate segment.

Would "cannot" be better than "SHOULD NOT"?  I.e., isn't this more a case of "this just won't work" rather than "this won't interoperate"?

Similarly, in section 3.1.1:
      The penultimate node (Node N4)
      does not, should not and cannot differentiate between the data
      packets and OAM probes.

I would suggest changing "does not, should not and cannot" to just "cannot".

Finally, it wasn't clear to me how the controller N100 is connected to the topology.  Normally, I would assume that this would be over a separate management network, but then section 3.4 talks about loopback probes.  Hence, it wasn't clear to me whether the expectation was to be bridging data plane traffic to the management network (which seems unwise), or more likely you would have a separate dataplane connection to the controller for the loopback probes.  Adding some text to section 3.4 to clarify this may be helpful for readers.

Thanks,
Rob

Warren Kumari No Objection

Comment (2021-06-02)
Thanks to Dan for his OpsDir review.

Éric Vyncke (was Discuss) No Objection

Comment (2021-06-02)
Top posting: thank you for the quick fix on my two DISCUSS points (I told you they were easy to fix). I also like the added text to address Martin Duke's DISCUSS point.  I kept the non-blocking COMMENT points below.
----

Thank you for the work put into this document. It is comforting (even if not surprising) that the simple "good old" ping/traceroute work on a SRv6 network ;-)

Thanks to Carlos Bernardos for his INT-REVIEW at 
https://datatracker.ietf.org/doc/review-ietf-6man-spring-srv6-oam-10-intdir-telechat-bernardos-2021-05-28/

Thanks to Ole Trøan for his shepherd write-up even if I regret the lack of justification for 'standards track'. Especially because the abstract is mainly about ping/traceroute, hence should be informational, but the O-flag is indeed standards track. So, all in all, this is OK.

Please find below some non-blocking COMMENT points (but replies would be appreciated), and one nit.

I hope that this helps to improve the document,

Regards,

-éric

== COMMENTS ==

Is there any reason not to follow RFC 5952 about IPv6 address representation? I.e., not using uppercase ;-) (you may use uppercase for the 'variable' such as k). I understand that changing the case is a long and cumbersome endeavor... This comment is of course non-blocking.
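
As an illustration of the lowercase form, Python's standard ipaddress module prints parsed addresses in lowercase, zero-compressed text, so it can be used to normalize the literals in the draft. This is a minimal sketch; the helper name canonical is hypothetical, and variables such as k would have to be replaced by concrete values before parsing:

```python
import ipaddress


def canonical(text: str) -> str:
    """Return the lowercase, compressed text form of an IPv6 address."""
    return str(ipaddress.IPv6Address(text))


# Uppercase hexadecimal digits are normalized to lowercase.
print(canonical("2001:DB8:A:5::"))  # 2001:db8:a:5::
print(canonical("2001:db8:B:4::"))  # 2001:db8:b:4::
```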

About the O-flag, as this I-D is about OAM, I would have expected that the document specifies some operational recommendations, e.g., collecting statistics about O-flag processing: packet count, requests ignored, ...

-- Section 1 --
In the first sentence, is it RFC 8402 or RFC 8754 ?

-- Section 1.3 --
I was about to raise a DISCUSS on this one... the abstract and introduction are about SRv6 and this section uses a network programming example with END.X. Suggest to either modify the abstract / introduction to mention RFC 8986 or simplify the example by not using END.X (e.g., not mentioning END.X as the plain SRH adj-sid behavior is END.X -- no need IMHO to introduce the network programming nomenclature).

-- Section 2.1 --
Not important and feel free to ignore, but, while telemetry operation is important for OAM, OAM is broader than plain "telemetry data collect and export" (IMHO). I would have preferred the use of 'telemetry marking' for example. But, I guess it is too late to change the O-flag into a T-flag ;-)

In "packet header", is the layer-2 header included ? IPFIX can export layer-2 information, hence my question. Perhaps better to use "IP header" here ?

-- Section 2.1.1 --
I was again about to raise a DISCUSS on this point, S01.1 appears to be applicable to SRH/RFC 8754 while the text about PSP is clearly about net-pgm/RFC 8986. How can we reconcile this ?

Finally, in the case of PSP, should the normative pseudo-code be changed by introducing another 'if' in the pseudo-code ?

-- Section 3.1.1 --
The figure 2 seems to have an incoherent 'screen shot' as 2001:db8:A:5:: is used as the ping target but the output of the ping displays "B5::". What did I miss ?

The node N4 is assumed to "performs the standard SRH processing" but later it needs to process a "PSP SID", which is not standard SRH RFC but in the net-pgm one.

-- Section 3.2.1 --
I wonder whether "These ICMPv6 responses are IP routed." is really useful here as plain IP routing will be applied (or do you mean not using SRH in the reply?).

The example uses "DA" while I would expect that this would be the "SA" of the received ICMP messages. But, this is cosmetic.

-- Section 3.2.2 --
What is a "classic IPv6 node" ? I guess it is a 'non SRv6-capable node' => to be added in the terminology section ?

-- Section 3.3 --
"The local OAM process sends a full or partial copy" it really smells like a postcard OAM while IPFIX can be used to send aggregated data, which is also very useful. All in all, if this is a local send to another process, then worth mentioning it.

== NITS ==

-- Section 1.3 --
As figure 1 uses a double border for SRv6-capable nodes, let's mention it in the text.

Zaheduzzaman Sarker No Record