Deterministic Networking Architecture
draft-ietf-detnet-architecture-12

Summary: Has 4 DISCUSSes. Needs 3 more YES or NO OBJECTION positions to pass.

Alissa Cooper Discuss

Discuss (2019-02-20 for -11)
= Section 6 =

"DetNet is provides a Quality of Service (QoS), and as such, does not
   directly raise any new privacy considerations."

This seems like a false statement given the possibility that DetNet may require novel flow IDs and OAM tags that create additional identification and correlation risk beyond existing fields used to support QoS today.

Comment (2019-02-20 for -11)
I support Benjamin's DISCUSS.

I agree with others that this document should be informational.

= Section 3.1 =

"There are, of course, simpler methods available (and employed, today)
   to achieve levels of latency and packet loss that are satisfactory
   for many applications."

I think this paragraph would make more sense if it said "specific levels of latency and packet loss for particular applications." A lot of applications have satisfactory performance without any of the methods/techniques described.

"Prioritization and over-provisioning is one
   such technique."
   
It seems these are two techniques, not one.

= Section 3.2.1.2 =

s/sensitive/time-sensitive/

I can't parse this sentence:

'In general, users are encouraged to use, instead of, "do this when you
   get the packet," a combination of:

   o  Sub-microsecond time synchronization among all source and
      destination end systems, and

   o  Time-of-execution fields in the application packets.'

= Section 3.2.2.2 =

s/ Either of these functions/ Any of these functions/

"Providing sequencing information to the packets of a DetNet
      compound flow.  This may be done by adding a sequence number or
      time stamp as part of DetNet, or may be inherent in the packet,
      e.g., in a higher layer protocol, or associated to other physical
      properties such as the precise time (and radio channel) of
      reception of the packet."
    
How do multiple connected DetNet nodes know which fields they are supposed to use as the packet sequence number? 
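
To make the question concrete, here is a minimal sketch of a packet elimination step that discards duplicates by sequence number. The class name, the `seq` argument, and the `history` window size are assumptions made for illustration only. The draft does not say which field carries the sequence number, and every node running something like this would have to agree on it.

```python
from collections import deque


class PacketEliminator:
    """Hypothetical duplicate filter for one DetNet compound flow."""

    def __init__(self, history=64):
        # Remember only the most recent `history` sequence numbers.
        self._order = deque()
        self._seen = set()
        self._history = history

    def accept(self, seq):
        """Return True if this sequence number has not been seen recently."""
        if seq in self._seen:
            return False  # duplicate from another member flow: eliminate
        self._seen.add(seq)
        self._order.append(seq)
        if len(self._order) > self._history:
            # Forget the oldest remembered sequence number.
            self._seen.discard(self._order.popleft())
        return True


# Two member flows deliver overlapping copies of packets 1..3.
pef = PacketEliminator()
arrivals = [1, 1, 2, 3, 2, 3]
forwarded = [seq for seq in arrivals if pef.accept(seq)]
```

The sketch only works if the replicating node and the eliminating node read `seq` from the same place in the packet, which is the agreement the text leaves unspecified.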

= Section 3.3.1 =

"the highest-priority non-DetNet packet is also ensured a worst-case latency." --> Did you mean "ensured less than or equal to a worst-case latency"?

= Section 4.3.2 = 

If applications need to be altered to be run over DetNet, or if they need to be DetNet-aware, it would be useful to state that explicitly up front somewhere in this document. This is sort of implied in this section but it's not clear.

= Section 6 =

"However, the requirement for every (or almost every) node along the
   path of a DetNet flow to identify DetNet flows may present an
   additional attack surface for privacy, should the DetNet paradigm be
   found useful in broader environments."

I'm not sure what is meant by "broader environments." Is the implication that flow identification doesn't present a privacy risk within a single administrative domain? I don't think that is always true.

Benjamin Kaduk Discuss

Discuss (2019-02-19 for -11)
I note that the DETNET WG is explicitly chartered with a work item for the
"overall architecture: This work encompasses ... and security aspects".
It seems incomplete to specify an architecture for a topic such as
deterministic networking without specifically considering what threats are
and are not in scope to be protected against.  Some easy questions should
be whether the system is expected to be robust in the face of an attacker
that generates non-DetNet traffic?  Or an attacker that generates DetNet
traffic in excess of reservations?  It can even be a fine engineering goal
to produce a solution that only protects against media corruption and
hardware crashes and leaves active attacks out of scope, but the actual
intended scope of the work needs to be clear.  At the other end of the
spectrum, protecting against as potent an attacker as a malicious traffic
policer is probably a lost cause, especially if the policer is authorized
to direct remote nodes to take action to terminate "misbehaving" flows.

The referenced draft-ietf-detnet-security is not at a comparable maturity
level to this document and also fails to present a clear threat model for
the DetNet architecture.  (The section entitled "Threat Model" reads as
more of a taxonomy of threats than a model for what threats are and are not
to be addressed.)  It also presents the usage of cryptographic mechanisms
as mitigation techniques without provisioning for the prerequisites of such
mechanisms (e.g., using HMAC for message integrity protection without
mention of infrastructure for distributing the keys for keying the HMAC).

Comment (2019-02-19 for -11)
I agree with Alexey that Informational would (also) be a fine status in
which to publish this document.

Abstract

                                 DetNet operates at the IP layer and
   delivers service over sub-network technologies such as MPLS and IEEE
   802.1 Time-Sensitive Networking (TSN).

I don't know what "sub-network technologies" means.  (Should I?  Is it
defined somewhere we can reference?)

More generally, is DetNet supposed to be a "sub-layer" and/or "sub-network"
that lies between specific layers or classes of layer?  Does DetNet
itself have component "sub-layers" that provide distinct DetNet
functionality?  These are good questions to address early on in the
document so the reader is familiar with the concepts as they progress
through the document.

Section 1

                          DetNet is for networks that are under a single
   administrative control or within a closed group of administrative
   control; these include campus-wide networks and private WANs.  DetNet
   is not for large groups of domains such as the Internet.

side note: Campus-wide networks at educational institutions are basically
guaranteed to have untrusted entities participating in them, just as a
backdrop for security considerations.

Section 3.1

                        This mechanism distributes the contents of
   DetNet flows over multiple paths in time and/or space, so that the
   loss of some of the paths does need not cause the loss of any  

The failure models for which this statement is absolutely true as opposed
to probabilistically true seem rather unrealistic models of real physical
systems.
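
A back-of-the-envelope sketch of the distinction, assuming each path loses a given packet independently and with the same probability (an assumption that shared-risk links would violate). The function name and the numbers are illustrative only.

```python
def residual_loss(per_path_loss, paths):
    """Probability that every replicated copy of a packet is lost,
    assuming independent, identical per-path loss probabilities."""
    if not 0.0 <= per_path_loss <= 1.0:
        raise ValueError("per_path_loss must be a probability")
    if paths < 1:
        raise ValueError("paths must be at least 1")
    return per_path_loss ** paths


# With 1% loss per path, a second path reduces loss but not to zero.
single = residual_loss(0.01, 1)
double = residual_loss(0.01, 2)
```

Under this model the residual loss is zero only when the per-path loss is itself zero, so "need not cause the loss of any packets" holds probabilistically rather than absolutely.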

Section 3.2.1.1

   The primary means by which DetNet achieves its QoS assurances is to
   reduce, or even completely eliminate packet loss due to output packet
   contention within a DetNet node as a cause of packet loss.  [...]

editing error?

                                     Note that App-flows are generally
   not expected to be responsive to implicit [RFC2914] or explicit
   congestion notification [RFC3168].

I note that the word "implicit" does not appear in RFC 2914; it may be
worth providing a more detailed mapping from concept to reference.
(This text/reference also appears in Section 4.3.2.)

Section 3.2.1.2

                                                                   In
   general, users are encouraged to use, instead of, "do this when you
   get the packet," a combination of:

It seems that an architecture would be within its rights to *mandate* such
application design, rather than just encourage it.  What sorts of
exceptions would cause us to not want to mandate this design?

Section 3.2.2.2

Please expand SRLG (it is only used once, so the abbreviation itself may
not be needed at all).

Section 3.2.3

   Out-of-order packet delivery can be a side effect of distributing a
   single flow over multiple paths especially when there is a change
   from one path to another when combining the flow.  [...]

nit: comma before "especially".

   Resource allocation
           The DetNet forwarding sub-layer provides resource allocation.
           See Section 4.5.  The actual queuing and shaping mechanisms
           are typically provided by underlying subnet, these can be

nit: is this usage of "subnet" common?
Also, this comma looks to be a comma splice.

           closely associated with the means of providing paths for
           DetNet flows, the path and the resource allocation are
           conflated in this figure.

nit: Hmm, actually, is this comma *also* a comma splice?

   Operations, Administration, and Maintenance (OAM) leverages in-band
   and out-of-band signaling that validates whether the service is
   effectively obtained within QoS constraints.  [...]

nit: is there a singular/plural mismatch here ("the service"/"service" vs.
"effectively within"/"effectively obtained within")?

Section 4.1.1

This figure would have helped me a lot several sections earlier.

Section 4.1.2

   A "Deterministic Network" will be composed of DetNet enabled end
   systems, DetNet edge nodes, DetNet relay nodes and collectively
   deliver DetNet services.  DetNet relay and edge nodes are

Nit: I think this is intended to be:
A "Deterministic Network" will be composed of DetNet-enabled end
systems, DetNet edge nodes, and DetNet relay nodes, which collectively
deliver DetNet services.  DetNet relay and edge nodes are

                                                             Examples of
   sub-networks include MPLS TE, IEEE 802.1 TSN and OTN.  [...]

nit: are these sub-networks or protocols used by sub-networks?

   Distinguishing the function of two DetNet data plane sub-layers, the
   DetNet service sub-layer and the DetNet forwarding sub-layer, helps
   to explore and evaluate various combinations of the data plane
   solutions available, some are illustrated in Figure 4.  This

nit: this last comma is a comma splice.

   There are many valid options to create a data plane solution for
   DetNet traffic by selecting a technology approach for the DetNet
   service sub-layer and also selecting a technology approach for the
   DetNet forwarding sub-layer.  There are a high number of valid
   combinations.

nit: I think "large number" is more conventional prose.

Section 4.3.1

I'm confused about whether, for these flows that "require the <foo>
feature", the DetNet implementation must provide <foo> or the application
is required to have implemented the <foo> feature.  A mapping (if it makes
sense) to the categorization of end systems in Section 4.2.1 would be a big
help.

Section 4.3.2

   Asynchronous DetNet flows are characterized by:

   o  A maximum packet size;

   o  An observation interval; and

   o  A maximum number of transmissions during that observation
      interval.

Is there necessarily only a single tier of observation interval/rate?
(E.g., could there be a burst cap in a small interval and then a lower
overall baseline rate over large intervals?)
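
To illustrate the question, here is a sketch of a conformance check that takes a list of (interval, maximum transmissions) tiers rather than a single pair. The function name, the tier values, and the use of integer microseconds are hypothetical; the draft describes only one observation interval.

```python
def conforms(send_times, tiers):
    """Return True if, for every tier (interval, max_tx), each window of
    length `interval` ending at a transmission holds at most `max_tx`
    transmissions. `send_times` must be sorted; `interval` must be > 0."""
    for interval, max_tx in tiers:
        start = 0
        for end, t in enumerate(send_times):
            # Drop transmissions that are `interval` or more older than t.
            while t - send_times[start] >= interval:
                start += 1
            if end - start + 1 > max_tx:
                return False
    return True


# Times in microseconds: at most 2 packets per 1 ms, and 5 per 10 ms.
tiers = [(1000, 2), (10000, 5)]
within_both = conforms([0, 500, 2000, 2500, 4000], tiers)
too_bursty = conforms([0, 100, 200], tiers)
too_many = conforms([0, 1000, 2000, 3000, 4000, 5000], tiers)
```

The third sequence satisfies the short-interval cap but breaks the long-interval one, which is the case a single-tier description cannot express.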

                     That is, while any useful application is written to
   expect a certain number of lost packets, the real-time applications
   of interest to DetNet demand that the loss of data due to the network
   is a rare event.

(I might even go with "vanishingly rare".)

Section 4.4.1

(Is there a standard reference for "Northbound"?  I know we're all used to
it, but it's probably best to have a reference if we can.)

Section 4.4.2

                                         The deterministic sequence can
   typically be more complex than a direct sequence and include
   redundancy path, with one or more packet replication and elimination
   points.  [...]

nit: "redundancy paths", plural?

Section 4.8

How does *provisioning* require knowledge of *dynamic* state?

Section 4.9

Does aggregation like this pose a risk of all the aggregatees getting
affected when one exceeds their allocation substantially (so as to also
cause the aggregate to exceed the aggregate's allocation)?
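
A toy calculation of the scenario I have in mind, assuming the aggregate is policed as a whole and the excess is dropped in proportion to each member's offered load. Both the policing model and the numbers are assumptions, not anything the draft specifies.

```python
def delivered_under_aggregate_policing(offered, aggregate_limit):
    """Scale every member flow down proportionally when the summed offered
    load exceeds the aggregate allocation (hypothetical policing model)."""
    total = sum(offered.values())
    if total <= aggregate_limit:
        return dict(offered)
    scale = aggregate_limit / total
    return {flow: rate * scale for flow, rate in offered.items()}


# Three flows each reserved 10 Mb/s; flow "c" misbehaves and sends 40 Mb/s.
offered = {"a": 10.0, "b": 10.0, "c": 40.0}
delivered = delivered_under_aggregate_policing(offered, aggregate_limit=30.0)
```

In this model the well-behaved flows "a" and "b" each lose half of their traffic even though they stayed within their own reservations.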

Section 6

The ability for an attacker to use QoS markings as part of traffic
correlation/inspection is not new with DetNet, but is probably still worth
mentioning explicitly.

Mirja Kühlewind Discuss

Discuss (2019-02-20 for -11)
Thanks for addressing the tsv-art review comments (and big thanks to Michael and David!) and all the work done so far! I think the document is in good shape and I only have one minor comment that I would like to see addressed/more explicitly spelled out. However, this should be doable quickly with potentially 2-3 small changes in the draft. See below.
 
Given that DetNet traffic is often assumed not to be congestion controlled, it is important that there is also some network function that makes sure the source traffic stays within the requested bandwidth limit in order to protect non-DetNet traffic. This is to some extent discussed in section 3.3.2, but I think it should be more clearly spelled out that this would require a rate limiting function at each DetNet source/relay (tunnel ingress). Currently sec 3.3.2 says:

"Filters and policers should be used in a DetNet network to
   detect if DetNet packets are received on the wrong interface, or at
   the wrong time, or in too great a volume."

However, maybe this case of limiting non-congestion-controlled traffic (in case the source is not keeping to the limits, whether on purpose in order to cheat, because it couldn't estimate the needed bandwidth any better, or due to fluctuations over time) could be explained more clearly, and the respective requirement to implement rate limiting could be stated separately and more strongly...?
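
As an illustration of the kind of rate limiting function I mean, here is a minimal token-bucket policer sketch. The class name, parameters, and the drop-on-excess behaviour are assumptions for illustration; the draft does not prescribe any particular mechanism.

```python
class TokenBucketPolicer:
    """Hypothetical per-flow policer: admits packets only while the flow
    stays within its reserved rate plus a bounded burst."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0

    def admit(self, now, size):
        """Return True to forward the packet, False to drop it."""
        # Refill tokens for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


# A flow reserved 1000 bytes/s with a 1500-byte burst sends 1000-byte packets.
policer = TokenBucketPolicer(rate_bytes_per_s=1000, burst_bytes=1500)
decisions = [policer.admit(now, 1000) for now in (0.0, 0.1, 1.0, 1.2)]
```

Packets sent faster than the reservation allows are dropped at the ingress, so the excess never reaches the queues shared with non-DetNet traffic.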


One related comment on this sentence in Sec 3.1: 

"As DetNet provides allocated resources (including provisioned capacity)
   to DetNet flows the use of transport layer congestion control
   [RFC2914] by App-flows is explicitly not required."

I guess congestion control should still be a requirement if the App-flow also passes non-DetNet-aware segments of the path, e.g. maybe the first hop. Usually use of congestion control for application-limited flows is also not a problem if sufficient bandwidth is available. Also note that, as I stated above, the important part for not requiring congestion control is actually not only that resources are allocated but also that rate limiting is in place to make sure resource usage cannot exceed the reserved allocation. Maybe this sentence could also be further clarified in the draft...?


And then one more small comment that is also related. Sec 3.2.2.2 says: 

"If packet replication and elimination is used over paths with
   resource allocation (Section 3.2.1), ..."

My assumption was that all DetNet traffic is sent over pre-allocated resources...? If that is not true, that has implications for congestion control and needs some additional considerations. Can you please confirm that and maybe clarify in the draft? Thanks!

Comment (2019-02-20 for -11)
I agree with Alexey and Benjamin that this document should be informational. Informational documents can also have IETF consensus, so that cannot be the reason to go for PS. However, this document does not specify a protocol or any requirements that are mandatory to implement for interoperability and therefore should not be PS.

Alvaro Retana Discuss

Discuss (2019-02-20 for -11)
I support Mirja's and Alissa's DISCUSSes...and have a related set of concerns about the coexistence with non-DetNet traffic and privacy:

§3.3.1 talks about what I think is a hard-to-achieve balance between coexisting with non-DetNet traffic and keeping that traffic from disrupting DetNet flows.  Because of the constraints, the intent of prioritizing DetNet flows is clear (and that is ok), but that may result in starvation of non-DetNet traffic...even if the text does explicitly say that it "must be avoided".

I would like to see the potential case of starving non-DetNet traffic called out somewhere.  I'm looking for something similar to the first paragraph in §5, but focused on the non-DetNet traffic.


Related to the above is the fact that the identification of flows could be used to specifically *not* include some of them as DetNet flows.  This is a variation of the concern outlined in §6, but applied to non-DetNet flows, with the potential starvation mentioned above.  Again, I would like to at least see some discussion of this risk.

 
The use case and problem statement documents outline specific applications that may not have non-DetNet traffic, and the Introduction supports that.  However, the architecture described in this document may be used in more general networks to provide guarantees to specific traffic...  IOW, even if the intention is there, there is no guarantee that DetNet will only be used in the expected use cases.

Comment (2019-02-20 for -11)
I agree with and support Benjamin's DISCUSS -- also, I think that draft-ietf-detnet-security should be a Normative reference.

I also agree that Informational may be a better status.

[nit] s/sub-layer at which A DetNet service/sub-layer at which a DetNet service
[nit] Expand SRLG.

Deborah Brungard Yes

Ignas Bagdonas No Objection

Ben Campbell No Objection

Comment (2019-02-20 for -11)
I support Benjamin's and Alissa's DISCUSS positions. I also agree with the several people who commented that this should be informational. Otherwise, I have some minor comments:

§3.2.1.2, 
- first paragraph: Please expand "GigE"

- "In general, users are encouraged to use, instead of, "do this when you
get the packet," a combination of:" - Hard to parse.

§3.2.2.2: Please expand "PREOF" on first use. I realize readers can probably construct it from the previous few sentences, but it's more reader-friendly not to require them to do so.

§3.3.2: "Robust real-time systems require to reduce the number of possible
failures." - The sentence does not parse.  Are there missing words prior to "to"?

§4.1.2: 

- "Distinguishing the function of two DetNet data plane sub-layers, the
DetNet service sub-layer and the DetNet forwarding sub-layer, helps
to explore and evaluate various combinations of the data plane
solutions available, some are illustrated in Figure 4."

Hard to parse. Also, there is a comma splice.

- "There are many valid options to create a data plane solution for
DetNet traffic by selecting a technology approach for the DetNet
service sub-layer and also selecting a technology approach for the
DetNet forwarding sub-layer. There are a high number of valid
combinations."

Does this refer to implementation/deployment options, or protocol design options? If the latter, are the choices still open?

§4.1.3: Please define or expand DetNet-UNI.

§4.2.1, first paragraph: Does L1 stand for Layer-1, etc? If so, please spell it out.

Spencer Dawkins No Objection

Suresh Krishnan No Objection

Comment (2019-02-21 for -11)
* Section 3.2.1.1.

Given that we also know some of the downsides of large buffers, I think a pointer to some background might be warranted here. I would recommend a basic reference to something like

Bufferbloat: Dark Buffers in the Internet, Communications of the ACM, January 2012.

* Section 3.2.2.2.

It is not obvious to me how the POF cannot be the last operation at the receiver. Can you clarify? Also, do intermediate nodes apply the POF? I can see the need for them to do PRF and PEFs but I am not sure applying the POF at intermediate nodes can necessarily help the low latency and low jitter goals.

   The order in which a DetNet node applies PEF, POF, and PRF to a
   DetNet flow is implementation specific.

* Section 3.2.3.

RFC7426 does not contain much specific information about explicit route setup. Is there a particular section you want to point to? If not, I don't think this reference is of much use. RFC8453 is listed twice.

* Section 3.3.1.

Not sure why this is a requirement, but I do wish to note that there are no such worst-case latency guarantees for best-effort traffic (aka non-DetNet) in current networks. Can you clarify?

   o  DetNet flows can be shaped or scheduled, in order to ensure that
      the highest-priority non-DetNet packet is also ensured a worst-
      case latency.

* Section 4.1.1.

This text "Peers with Duplicate elimination." seems to be completely out of place under the "Packet sequencing" heading below Figure 2. Copy and paste error?

* Section 4.3.2.

I found the expression "number of bit times" confusing. I have understood "bit time" to mean the amount of time taken to emit a bit from a network interface. Based on that definition, this expression does not make sense. Is there a better reference/definition of what you mean?
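
For reference, this is the definition of "bit time" I am working from, written out as arithmetic. The function names and the example link rate and frame size are my own, chosen only for illustration.

```python
def bit_time_ns(link_rate_bps):
    """Time to emit one bit on a link of the given rate, in nanoseconds."""
    return 1e9 / link_rate_bps


def frame_bit_times(frame_bytes):
    """Number of bit times needed to emit a frame of the given size."""
    return frame_bytes * 8


# On a 1 Gb/s link, one bit time is 1 ns and a 1500-byte frame takes
# 12000 bit times.
one_bit_ns = bit_time_ns(1_000_000_000)
frame_bits = frame_bit_times(1500)
frame_duration_ns = frame_bits * one_bit_ns
```

Under this definition a "number of bit times" is a duration that depends on the link rate, so it is not clear how it serves as a rate-independent flow characteristic.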

* Section 4.5.

There might be some other recent IETF-defined mechanisms that might be relevant to mention here as well, e.g., RFC8289 (CoDel), RFC8033 (PIE), etc.

* Section 4.7.2.

While IPv6 does offer a mechanism to add/remove a flow id (flow label), I am not sure what kind of mapping you were thinking of for IPv4. If this is not possible, I think a note to that effect might be useful here.

* Sections 5 and 6

I also support Alissa and Benjamin's DISCUSSes (on privacy and security) and would like to see them addressed.

Warren Kumari No Objection

Comment (2019-02-20 for -11)
Thank you, this is very well written and easy to follow.

I do have some minor comments and nits.

1: 3.2.1.2.  Jitter Reduction
"A core objective of DetNet is to enable the convergence of sensitive non-IP networks onto a common network infrastructure."
You *do* say this in the introduction, etc, but this sentence is the clearest description - consider moving it up to the top.

2: 3.2.2.  Service Protection
   "Service protection aims to mitigate or eliminate packet loss due to equipment failures, random media and/or memory faults."
This talks about memory faults, but what about (the common case) of memory corruption? AFAICT, the protocol itself doesn't do anything about this - perhaps it should mention this, and say that strong checksums / integrity is the responsibility of the upper layer protocol? "Extraordinary claims require extraordinary evidence"- Carl Sagan.

3: "The DetNet service sub-layer includes the packet replication (PRF),
   the packet elimination (PEF), and the packet ordering functionality
   (POF) for use in DetNet edge, relay node, and end system packet
   processing.  Either of these functions can be enabled in a DetNet
   edge node, relay node or end system."
This says "either" about three things.

4: "3.3.2.  Fault Mitigation
   Robust real-time systems require to reduce the number of possible
   failures."
Apologies, I don't have suggested text to fix this, but "require to reduce" reads oddly -- perhaps "require a reduction in"?

Alexey Melnikov No Objection

Comment (2019-02-19 for -11)
Why is this document not IETF Consensus Informational?

Terry Manderson No Record

Eric Rescorla No Record

Adam Roach No Record

Martin Vigoureux No Record