Ballot for draft-ietf-ippm-ioam-data

Comment (2021-03-23 for -12) Sent

[ section 5.4 ]

* "deployed in a particular IOAM," -> "deployed in a particular IOAM-Domain,"
  perhaps

* I think perhaps it should be made clear, especially for the Incremental
  Trace Option, that dynamic insertion of data might need to be moderated
  "subject to any protocol constraints of the encapsulating layer", as it
  were.

Comment (2021-06-24 for -14) Sent for earlier

Thank you for addressing my comments.

Francesca

Comment (2021-08-11 for -14) Sent for earlier

Thanks for cleaning up the problems I mentioned about Section 8.  It's much more complete now.

I support Lars' DISCUSS, since it reminds me of the painful problems SPF used to have with its DNS migration plan.

"SFC" appears in the glossary in Section 3, but nowhere else in the document.

Comment (2021-10-04 for -15) Sent for earlier

Thank you to Shawn Emery for the SECDIR review.

Thanks for addressing my DISCUSS and COMMENT feedback.

I support Ben Kaduk’s DISCUSS position.

Comment (2021-03-25 for -12) Sent

I wanted to raise DISCUSS points on:
- IOAM domain isolation and boundaries
- the need for integrity protection. It is good that there is already an intention to work on integrity; however, I felt that the lack of integrity protection and its impact needed to be discussed in this document.

However, my fellow ADs have already put DISCUSSes on those. I support Roman Danyliw and Benjamin Kaduk's DISCUSSes.

Another major issue: the use of IOAM-Namespaces to provide domain isolation needs to be clarified. The document says that IOAM-Namespaces can be used to filter the IOAM-Domain. However, it also says that the same node can have different roles (encapsulating, decapsulating, and transit) for different Namespaces. If both descriptions are true, then I don't think filtering based on IOAM-Namespaces works, or I don't understand the filtering concept here.

* Section 3:

Should use the updated RFC 8174 version of the BCP 14 boilerplate.

Nit:
E2E Edge to Edge
PMTU Path MTU
Missing ":", where the rest of the entries have one

Geneve: Generic Network Virtualization Encapsulation
[I-D.ietf-nvo3-geneve]
Should refer to RFC 8926.

* Section 4:
The operator has to consider the
potential operational impact of IOAM to mechanisms such as ECMP
processing (e.g. load-balancing schemes based on packet length could
be impacted by the increased packet size due to IOAM)..

Is there any guideline available? I haven't seen any discussion of this matter in the security considerations either.

IOAM control points: IOAM-Data-Fields are added to or removed from
the live user traffic by the devices which form the edge of a domain.

What is live user traffic here? Is this supposed to mean traffic currently in transit within the IOAM domain?

Devices which form an IOAM-Domain can add, update or remove IOAM-
Data-Fields. Edge devices of an IOAM-Domain can be hosts or network
devices.

"hosts" needs to be clarified further: hosts of what?

* Section 5.1 and 5.1.1 :
I cannot find where a POT-profile is defined, which makes Sections 5.1 and 5.1.1 hard to understand. It would be good to describe the POT-profile before discussing it. If it is defined elsewhere, please add a reference.

* Section 5.4.1:

NodeLen: 5-bit unsigned integer. This field specifies the length of
data added by each node in multiples of 4-octets, excluding the
length of the "Opaque State Snapshot" field.

I would suggest adding a reference to the section where the "Opaque State Snapshot" is defined.

* Section 6.3 :

Epoch:

The epoch is 1 January 1970 00:00:00 TAI, which is 31 December
1969 23:59:51.999918 UTC.

AFAIK, the UNIX epoch is 1 January 1970 (midnight UTC/GMT). Is there any reference for the epoch used here?
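
To make the size of the discrepancy concrete, here is a minimal Python sketch that subtracts the UTC instant quoted from the draft from the UNIX epoch. It relies only on the two instants mentioned above; the variable names are illustrative, and the script does not itself model TAI or leap seconds.

```python
from datetime import datetime, timedelta, timezone

# The UTC instant that the draft equates with 1 January 1970 00:00:00 TAI.
draft_epoch_utc = datetime(1969, 12, 31, 23, 59, 51, 999918, tzinfo=timezone.utc)

# The UNIX epoch: 1 January 1970 00:00:00 UTC.
unix_epoch_utc = datetime(1970, 1, 1, tzinfo=timezone.utc)

# How far the epoch in the draft lies before the UNIX epoch.
offset = unix_epoch_utc - draft_epoch_utc
print(offset.total_seconds())
```

The two epochs differ by a little over eight seconds, so a reader who assumes the UNIX epoch would misread every timestamp by that amount.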

* Section 7:

Needs a reference to IPFIX

Comment (2021-03-25 for -12) Sent

Thank you for the work put into this document and thank you for acknowledging my comments and advice (as they were minor, I am not recusing myself). Sometimes the text is difficult to read, as paragraphs and sentences are long and somewhat repetitive.

Please find below some non-blocking COMMENT points (but replies would be appreciated), and some nits. Like Alvaro, I was hesitating to ballot a DISCUSS on one point about section 4 (insertion/deletion of data on the fly).

I hope that this helps to improve the document,

Regards,

-éric

== COMMENTS ==

-- Abstract --
"a path between two points in the network" does this mean that this document cannot be used with a multicast destination address ?

-- Section 2 --
Is the list about 'contributors' (as in the section title) or 'co-authors' (as in the text) ?

-- Section 3 --
Please use the BCP14 boilerplate

Please note that Geneve is now RFC 8926.

-- Section 4 --
Should the 'deployment domain' include a reference to RFC 8799 ?

I was about to ballot a DISCUSS on this one: While the actual encapsulation is out of scope, the definition of "IOAM control points" alludes to nodes at the edge and in the core being able to add/remove data fields. This behavior will obviously have some impacts on the PMTU discovery and possibly on the handling of ICMP. Did the authors think of writing a generic section in this document on how this can be done in a correct way?

What is "live user traffic" ? I guess that this is not about end-user real-time video ;-) but a better wording would be welcome.

I am afraid that "ships in the night" does not apply here as all ships are usually on the same layer ;-) (planes do fly at different flight levels) But, this is not that important. No need to reply.

-- Section 5.1 --
Are additional IOAM data types, beyond the 4 in this document, expected? Text should clarify this.

-- Section 5.2 --
"A transit node MUST NOT add new IOAM-Option-Types" seems to contradict the "IOAM control points" definition of section 4.

-- Section 5.3 --
If 0x0000 is already reserved, then I would suggest making it part of the IANA range (i.e., making the IANA range 0x0000 to 0x7FFF).

-- Section 5.4.2.5 --
Should there be a precise definition (or reference) of "time in nanoseconds the packet spent in the transit node"? E.g., between first bit received and last bit sent ?

-- Section 5.4.2.7 --
Is there a reason why the queue depth is expressed in buffers and not in packets? (Both metrics have useful values imho)

-- Section 5.5 --
In "Random: Unique identifier for the packet" how are collisions resolved? Do they matter at all ?

How can a transit/decaps node handle the PoT (and also the E2E of section 5.6) when the length is not specified in the header?

== NITS ==

Usually "e.g." is enclosed between commas.

The sentence "The definition of how IOAM-Data-Fields are encapsulated into other protocols is outside the scope of this document." or a minor variation of it occurs multiple times in the document. Please consider avoiding those repetitions.

Yes (for -12) Unknown

No Objection (2021-03-23 for -12) Sent

I have several significant issues to bring up. Individually, none of them rises to the level of a DISCUSS, so I'm balloting No Objection.

(1) §5.3: "Any IOAM-Namespace MUST interpret the IOAM-Option-Types and associated IOAM-Data-Fields per the definition in this document." This sentence seems to not say much beyond requiring compliance with this document, which is obvious since this document is the one defining the IOAM-Namespaces.

(2) §5.3: "IOAM-Namespace identifiers MUST be present and populated in all IOAM-Option-Types." What does "populated" mean? I assumed that it meant that the field has to have a value in it, but then I found out that the Default-Namespace-ID is 0x0000. This seems to mean that the receiver can't tell the difference between the sender forgetting to populate the field and the default. IOW, it seems to me that requiring that the field be populated doesn't serve a specific need and cannot be normatively enforced.

(3) §5.3: Given the example, I don't think this statement is true:

"...node identifiers might not be unique for other organizational reasons,
such as after a merger of two formerly separated organizations), the
combination of node_id and Namespace-ID will always be unique."

It seems to me that for the same reason that merged organizations can end up with overlapping node_ids, they can also end up with overlapping Namespace-IDs. If not deployed in an overlapping way (on the same set of nodes), it seems that it may be ok to have the same Namespace-ID in multiple places.

This deployment consideration should be better explained in this document. I took a quick look at draft-brockners-opsawg-ioam-deployment and this item is not covered there either.

(4) I think that the encapsulation-independent pieces of draft-brockners-opsawg-ioam-deployment would be better placed in this document.

(5) §5.4: "Any deployment MAY choose to configure and support one or both of the following options." This sentence looks like a statement of fact and not a normative statement. s/MAY/may

(6) §5.4.1: "The Namespace-ID value of 0x0000 is defined as the "Default-Namespace-ID" (see Section 5.3) and MUST be known to all the nodes implementing IOAM." The second part ("MUST be known...") is not needed because this document is defining the default value...

(7) §5.4.1:

An IOAM encapsulating node MUST set NodeLen.

A node receiving an IOAM Pre-allocated or Incremental Trace-Option
relies on the NodeLen value, or it can ignore the NodeLen value
and calculate the node length from the IOAM-Trace-Type bits (see
below).

Assuming that the NodeLen is not 0 (i.e. set), when can the receiver ignore it? If NodeLen is 0 (not set), what should the receiver do? The text above requires NodeLen to be set, but then it makes its use optional.

(8) §5.4.1: "RemainingLen: ...the sender MAY set the initial value of RemainingLen according to the number of node data bytes allowed before exceeding the MTU. ... When node data is added, the node MUST decrease RemainingLen by the amount of data added."

This text includes a normative inconsistency. If the sender doesn't initialize the "value of RemainingLen according to the number of node data bytes allowed" (because it is optional!), and the transit node does "decrease RemainingLen by the amount of data added" (because it is required), then the value won't reflect the data space remaining and a downstream node may add enough bits to overflow -- or not add anything because it considered the overflow imminent.
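
The inconsistency can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the `TraceOption` class, the rule that a node adds data only when `remaining_len >= node_len`, the hypothetical budget of 6 four-octet units, and the initial values chosen by the sender are not taken from the draft.

```python
from dataclasses import dataclass

@dataclass
class TraceOption:
    node_len: int        # data added per node, in 4-octet units
    remaining_len: int   # advertised free space, in 4-octet units
    used: int = 0        # space actually consumed so far, in 4-octet units

def transit_add(opt: TraceOption) -> bool:
    """Add this node's data only if RemainingLen claims there is room."""
    if opt.remaining_len < opt.node_len:
        return False
    opt.used += opt.node_len
    opt.remaining_len -= opt.node_len  # the MUST in the quoted text
    return True

def run(initial_remaining_len: int, hops: int, node_len: int = 2) -> int:
    """Return how many 4-octet units are written across `hops` transit nodes."""
    opt = TraceOption(node_len=node_len, remaining_len=initial_remaining_len)
    for _ in range(hops):
        transit_add(opt)
    return opt.used

BUDGET = 6  # hypothetical: 4-octet units that fit before the MTU is exceeded

followed_may = run(initial_remaining_len=BUDGET, hops=5)  # sender used the MAY
too_large = run(initial_remaining_len=127, hops=5)        # unrelated large value
too_small = run(initial_remaining_len=0, hops=5)          # unrelated small value

print(followed_may, too_large, too_small)
```

In this model only the first sender keeps the written data within the budget; the second lets transit nodes write past it, and the third prevents any node from recording data even though space is available.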

(9) §5.4.1: "If RemainingLen in a pre-allocated trace option exceeds the length of the option, as specified in the preceding header, then the node MUST NOT add any fields." I didn't find "the length of the option" defined anywhere. Did I miss it?

(10) §5.4.1:

Bit 12-21 Undefined. An IOAM encapsulating node MUST set the
value of each of these bits to 0. If an IOAM transit
node receives a packet with one or more of these bits set
to 1, it MUST either:

1. Add corresponding node data filled with the reserved
value 0xFFFFFFFF, after the node data fields for the
IOAM-Trace-Type bits defined above, such that the
total node data added by this node in units of
4-octets is equal to NodeLen, or

This first option assumes that all undefined data fields will be 4-octets long. But if future data fields are defined to be 8-octets (and the transit node doesn't understand them) then 0xFFFFFFFF won't be enough and there will be a misalignment. IOW, I don't think it is possible to require this behavior -- unless it is also required that all future data fields have to be 4-octets long.
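
A small Python sketch of the misalignment, under my reading that a node which does not understand an undefined bit writes exactly one 4-octet reserved word for it. The function names, the three "known" words, and the two future field sizes are hypothetical and chosen only for illustration.

```python
RESERVED_WORD = 0xFFFFFFFF  # filler value named in the quoted text

def legacy_node_words(known_words, undefined_bits_set):
    """Node data written by a transit node that does not understand the
    undefined bits, assuming one reserved 4-octet word per such bit."""
    return [0] * known_words + [RESERVED_WORD] * undefined_bits_set

def expected_node_len(known_words, future_field_words):
    """NodeLen (in 4-octet units) as computed by an encapsulating node that
    knows the real size of each future field."""
    return known_words + sum(future_field_words)

# Hypothetical case A: one future field that is 4 octets (1 word) long.
a_written = len(legacy_node_words(known_words=3, undefined_bits_set=1))
a_node_len = expected_node_len(known_words=3, future_field_words=[1])

# Hypothetical case B: one future field that is 8 octets (2 words) long.
b_written = len(legacy_node_words(known_words=3, undefined_bits_set=1))
b_node_len = expected_node_len(known_words=3, future_field_words=[2])

print("case A aligned:", a_written == a_node_len)
print("case B aligned:", b_written == b_node_len)
```

Case A lines up with NodeLen, while case B leaves the legacy node one word short, which is the misalignment described above.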

(11) Some of the values to be included in the data fields (§5.4.*) are not well defined and could result in nodes reporting inconsistent measurements. Specifically, transit delay and queue depth. It would be ideal if the document provided some guidance for the implementation/use.

(12) §8.7: "Upon a new allocation request, the responsible AD will appoint a designated expert, who will review the allocation request." This sentence is not needed: the assignment happens only once and not whenever "a new allocation request" comes up.

No Objection (2021-12-08 for -16) Sent

Many thanks for the many updates in the -16, and my apologies for
the delay in reballoting.

I'm happy to report that the changes all look good to me, but do
have one comment that seems to remain valid:  In Section 6.3 where
we discuss the POSIX-based timestamp format, I think we do need to
say that it is affected by leap seconds (analogously to how we do
for the NTP format), per the guidance in RFC 8877.  I see in the
email thread that there was some desire to defer to 8877 rather than
try to reproduce a lot of it here (and inevitably end up with an
incomplete treatment); in that case, perhaps the Section 6.2 discussion
should be trimmed so that we provide an analogous level of detail
on all three timestamp formats.

No Objection (2021-03-25 for -14) Sent

Section 1, paragraph 4, comment:
>    IOAM use cases and mechanisms have expanded as this document matured,
>    resulting in additional flags and options that could trigger creation
>    of additional packets dedicated to OAM.  The term IOAM continues to
>    be used for such mechanisms, in addition to the "in-situ" mechanisms
>    that motivated this terminology.

Suggest rephrasing this expanded view on IOAM in a way that does not tie the
description to the time period during which this soon-to-be-archival document
was edited.

Section 5.2, paragraph 6, comment:
>    A transit node MUST ignore IOAM-Option-Types that it does not
>    understand.  A transit node MUST NOT add new IOAM-Option-Types to a
>    packet, MUST NOT remove IOAM-Option-Types from a packet, and MUST NOT
>    change the IOAM-Data-Fields of an IOAM Edge-to-Edge Option-Type.

I'm surprised that IOAM data isn't authenticated or even integrity-protected at
all. Relying on RFC2119 language alone seems a pretty weak protection.

Section 5.3, paragraph 9, comment:
>    Namespace identifiers allow devices which are IOAM capable to
>    determine:
>
>    o  whether IOAM-Option-Type(s) need to be processed by a device: If
>       the Namespace-ID contained in a packet does not match any
>       Namespace-ID the node is configured to operate on, then the node
>       MUST NOT change the contents of the IOAM-Data-Fields.
>
>    o  which IOAM-Option-Type needs to be processed/updated in case there
>       are multiple IOAM-Option-Types present in the packet.  Multiple
>       IOAM-Option-Types can be present in a packet in case of
>       overlapping IOAM-Domains or in case of a layered IOAM deployment.
>
>    o  whether IOAM-Option-Type(s) has to be removed from the packet,
>       e.g. at a domain edge or domain boundary.

I'll note that cryptographically authenticating IOAM data would probably result
in a system that wouldn't need the concept of namespaces, because keys would
automatically serve that purpose. (A device can't update an IOAM data item if it
doesn't have the key to authenticate the update with.)

Section 5.4, paragraph 11, comment:
>    o  Time of day when the packet was processed by the node as well as
>       the transit delay.  Different definitions of processing time are
>       feasible and expected, though it is important that all devices of
>       an in-situ OAM domain follow the same definition.

I think "important" is an understatement; this seems required? Also, capturing
time-of-day seems to require synchronized clocks.

Section 5.4.2.12, paragraph 2, comment:
> 5.4.2.12.  buffer occupancy
>
>    The "buffer occupancy" field is a 4-octet unsigned integer field.
>    This field indicates the current status of the occupancy of the
>    common buffer pool used by a set of queues.  The units of this field
>    are implementation specific.  Hence, the units are interpreted within
>    the context of an IOAM-Namespace and/or node-id if used.  The authors
>    acknowledge that in some operational cases there is a need for the
>    units to be consistent across a packet path through the network,
>    hence it is RECOMMENDED for implementations to use standard units
>    such as Bytes.

There are other "standard units" here, such as packets. You'd need to recommend
a specific standard unit and not just give an example.

Section 5.5, paragraph 3, comment:
>    o  Random: Unique identifier for the packet (e.g., 64-bits allow for
>       the unique identification of 2^64 packets).

If this identifier is supposed to be unique, it can't be random. And if it's
random, it will only be able to statistically uniquely identify a much smaller
number of packets (birthday paradox).
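
To put rough numbers on the birthday bound, here is a minimal Python sketch using the standard approximation p ≈ 1 - exp(-n(n-1)/(2N)) for n uniformly random identifiers drawn from a space of size N. The function names and the sample packet counts are illustrative assumptions, not values from the draft.

```python
import math

def collision_probability(n_packets: int, id_bits: int = 64) -> float:
    """Approximate probability that at least two of n random IDs collide."""
    space = 2.0 ** id_bits
    exponent = n_packets * (n_packets - 1) / (2.0 * space)
    return -math.expm1(-exponent)  # 1 - exp(-x), accurate for small x

def packets_for_probability(p: float, id_bits: int = 64) -> float:
    """Approximate number of random IDs at which collision probability reaches p."""
    space = 2.0 ** id_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))

for n in (10**6, 10**9, 2**32):
    print(f"{n:>12} packets: p(collision) ~ {collision_probability(n):.3e}")

print(f"50% collision chance after ~{packets_for_probability(0.5):.3e} packets")
```

With 64-bit random values, a collision becomes more likely than not after roughly 5 billion packets, far below the 2^64 packets mentioned in the quoted text.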

Section 6.3, paragraph 11, comment:
>       Microseconds: specifies the fractional portion of the number of
>       seconds since the epoch.
>
>       + Size: 32 bits.
>
>       + Units: the unit is microseconds.  The value of this field is in
>       the range 0 to (10^6)-1.

Given that the max. value for microseconds is 999999, which fits in 20 bits, using a
32-bit field leaves the top twelve bits unused.
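
The field-width arithmetic can be checked with a few lines of Python. The constant names are illustrative; the only inputs are the maximum value and the field size quoted above.

```python
MAX_MICROSECONDS = 10**6 - 1   # largest value allowed by the quoted text
FIELD_BITS = 32                # size of the Microseconds field

# Number of bits needed to represent the largest legal value.
bits_needed = MAX_MICROSECONDS.bit_length()

# High-order bits of the field that can never be set by a legal value.
always_zero_bits = FIELD_BITS - bits_needed

print(bits_needed, always_zero_bits)
```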

-------------------------------------------------------------------------------
All comments below are very minor change suggestions that you may choose to
incorporate in some way (or ignore), as you see fit. There is no need to let me
know what you did with these suggestions.

Dan Romascanu's Gen-ART review
(https://mailarchive.ietf.org/arch/msg/gen-art/vzngkYWy-W-f0PHqAPNRlyNSwnw/)
contains other nits that I wanted to make sure you were aware of.

"Abstract", paragraph 2, nit:
-    protocols such as NSH, Segment Routing, Geneve, IPv6 (via extension
-                                                        ^^^^^^^^^^^^^^^
-    header), or IPv4.  In-situ OAM can be used to complement OAM
-   ---------
-    mechanisms based on e.g.  ICMP or other types of probe packets.
-                            ^
+    protocols such as NSH, Segment Routing, Geneve, IPv6,
+                                                        ^
+    or IPv4.  In-situ OAM can be used to complement OAM
+    mechanisms based on, e.g., ICMP or other types of probe packets.
+                       +     ^

Section 1, paragraph 2, nit:
-    in [RFC7799] IOAM could be portrayed as Hybrid Type 1.  IOAM
-    mechanisms can be leveraged where mechanisms using e.g.  ICMP do not
-                                                           ^
+    in [RFC7799], IOAM could be portrayed as Hybrid Type 1.  IOAM
+                +
+    mechanisms can be leveraged where mechanisms using, e.g., ICMP, do not
+                                                      +     ^     +

Section 4, paragraph 3, nit:
-    expected that each such encapsulation will be defined in the relevant
-                                           ^ ^
+    expected that each such encapsulation would be defined in the relevant
+                                           ^^ ^

Section 4, paragraph 4, nit:
-    IOAM data does not leak beyond the edge of an IOAM domain using,for
+    IOAM data does not leak beyond the edge of an IOAM domain using, for
+                                                                    +

Section 4, paragraph 4, nit:
-    processing (e.g.  load-balancing schemes based on packet length could
-                    ^
-    be impacted by the increased packet size due to IOAM), path MTU (i.e.
+    processing (e.g., load-balancing schemes based on packet length could
+                    ^
+    be impacted by the increased packet size due to IOAM), path MTU (i.e.,
+                                                                         +

Section 4, paragraph 4, nit:
-    message handling (i.e. in case of IPv6, IOAM support for ICMPv6 Echo
+    message handling (i.e., in case of IPv6, IOAM support for ICMPv6 Echo
+                          +

Section 4, paragraph 7, nit:
-    are encapsulated into "parent" protocols, like e.g., NSH or IPv6 is
+    are encapsulated into "parent" protocols, like, e.g., NSH or IPv6 is
+                                                  +

Section 4, paragraph 8, nit:
-    the-night model, i.e. IOAM-Data-Fields in one layer are independent
+    the-night model, i.e., IOAM-Data-Fields in one layer are independent
+                         +

Section 5.1, paragraph 3, nit:
-    different categories.  In IOAM these categories are referred to as
+    different categories.  In IOAM, these categories are referred to as
+                                  +

Section 5.2, paragraph 2, nit:
-    transit nodes".  The role of a node (i.e. encapsulating, transit,
+    transit nodes".  The role of a node (i.e., encapsulating, transit,
+                                             +

Section 5.2, paragraph 3, nit:
-    called the "IOAM encapsulating node", whereas a device which removes
-           ^^^
-    an IOAM-Option-Type is referred to as the "IOAM decapsulating node".
-                                          ^^^
+    called an "IOAM encapsulating node", whereas a device which removes
+           ^^
+    an IOAM-Option-Type is referred to as an "IOAM decapsulating node".
+                                          ^^

Section 5.2, paragraph 7, nit:
-    means that an IOAM node which is e.g. an IOAM-decapsulating node for
+    means that an IOAM node which is, e.g., an IOAM-decapsulating node for
+                                    +     +

Section 5.3, paragraph 6, nit:
-    assigned range is intended to be domain specific, and managed by the
-                                           ^
+    assigned range is intended to be domain-specific, and managed by the
+                                           ^

Section 5.3, paragraph 9, nit:
-       e.g. at a domain edge or domain boundary.
+       e.g., at a domain edge or domain boundary.
+           +

Section 5.4, paragraph 2, nit:
-    deployment all nodes in an IOAM-Domain would participate in IOAM and
+    deployment, all nodes in an IOAM-Domain would participate in IOAM and
+              +

Section 5.4, paragraph 2, nit:
-    ways to deal with situations where the PMTU was underestimated, i.e.
+    ways to deal with situations where the PMTU was underestimated, i.e.,
+                                                                        +

Section 5.4, paragraph 10, nit:
-       i.e. ingress interface.
+       i.e., ingress interface.
+           +

Section 5.4, paragraph 11, nit:
-       i.e. egress interface.
+       i.e., egress interface.
+           +

Section 5.4.1, paragraph 2, nit:
-    i.e. an IOAM transit node MUST NOT modify the Namespace-ID, NodeLen,
+    i.e., an IOAM transit node MUST NOT modify the Namespace-ID, NodeLen,
+        +

Section 5.4.2.2, paragraph 6, nit:
-    with ingressing or egressing packets, i.e. ingress_if_id could
+    with ingressing or egressing packets, i.e., ingress_if_id could
+                                              +

Section 5.6, paragraph 9, nit:
-                encapsulating node e.g. by n-tuple based classification
+                encapsulating node, e.g., by n-tuple based classification
+                                  +     +

Section 5.6, paragraph 10, nit:
-                encapsulating node e.g. by n-tuple based classification
+                encapsulating node, e.g., by n-tuple based classification
+                                  +     +

No Objection (for -12) Not sent