Early Review of draft-ietf-i2rs-architecture-07
review-ietf-i2rs-architecture-07-secdir-early-kaufman-2015-01-02-00

Request Review of draft-ietf-i2rs-architecture
Requested rev. no specific revision (document currently at 15)
Type Early Review
Team Security Area Directorate (secdir)
Deadline 2016-03-15
Requested 2014-12-18
Draft last updated 2015-01-02
Completed reviews Genart Early review of -06 by Russ Housley
Genart Last Call review of -13 by Russ Housley
Secdir Early review of -07 by Charlie Kaufman
Opsdir Early review of -07 by Fred Baker
Rtgdir Early review of -06 by Russ White
Rtgdir Early review of -09 by Carlos Pignataro
Assignment Reviewer Charlie Kaufman
State Completed
Review review-ietf-i2rs-architecture-07-secdir-early-kaufman-2015-01-02
Reviewed rev. 07 (document currently at 15)
Review completed: 2015-01-02

Review

I have reviewed this document as part of the security directorate's ongoing
effort to review all IETF documents being processed by the IESG.  These
comments were written primarily for the benefit of the security area
directors.  Document editors and WG chairs should treat these comments just
like any other last call comments.

As an "architecture" document, this document does not specify the details
that would allow one to review whether the security mechanisms were
adequate. The Security Considerations section correctly notes that there is
a need to transport this protocol over something that provides mutual
authentication, confidentiality, and integrity to the data. It also notes
that there needs to be some authorization mechanism that configures which
authenticated clients are allowed to make what requests. There is no
discussion of where this authorization comes from, and in particular whether
the authorization data can be viewed or manipulated using this protocol,
though my sense reading the document is that authorization data would be
configured and manipulated by some other mechanism (as would the
manipulation of client and server credentials). So I think we need to wait
for the successor document with more meat to judge.

That said, I would ask the designers some leading questions of the form
"Have you considered...". Some relate to security and some don't. I'm not the
best person to judge the answers, but I'm hoping the questions will kick off
some discussion within the working group. It's likely that some of these
issues have already been adequately discussed, in which case feel free to
ignore them.

This document goes out of its way *not* to specify any security mechanisms
in order to provide flexibility to implementers. That makes sense for a
requirements document, but I'm not sure it makes sense for an architecture
document. You are clearly going to need some security mechanisms, and for
clients and agents to interoperate, they need to be standardized. My guess
is that you will end up using SSL with either client certificates or with
some lesser client to agent authentication mechanism inside an SSL
connection with only a server certificate. The mechanism you choose will
determine the formats of the identity information you get and use to do
lookups in your authorization tables. But section 7.1 says the protocol may
need to run over TCP, SCTP, DCCP, and possibly other link types. Do you
envision different security mechanisms for the different protocols?
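To illustrate the two options guessed at above, here is a minimal sketch using Python's standard ssl module. The function name agent_context and its flag are hypothetical and do not come from the draft; a real agent would also have to load its own certificate and a trust store, which is omitted here.

```python
import ssl

def agent_context(require_client_cert: bool) -> ssl.SSLContext:
    """Build a server-side TLS context for a hypothetical I2RS agent.

    require_client_cert=True  -> mutual authentication via client certificates.
    require_client_cert=False -> server certificate only; the client would then
                                 authenticate by some other means inside the
                                 protected connection.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED if require_client_cert else ssl.CERT_NONE
    return ctx

mutual = agent_context(True)
server_only = agent_context(False)
```

In the first case the identity used for authorization lookups would come from the client certificate; in the second it would be whatever the inner authentication mechanism yields, which is why the choice affects the format of the authorization tables.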

In the third paragraph of section 4 (and in some other places), you talk
about the I2RS Client acting as a broker forwarding requests for some other
entity, and forwarding some opaque identifier of that requesting entity to
the I2RS Agent for logging. This presumes that the I2RS Client is configured with
(or has access to) the authorization information that says which requestors
are permitted to do which operations. A useful extension to the protocol
would be to be able to forward a requestor-identity string that the Agent
not only logs but also checks for proper authorization before performing the
requested operation. The Agent would need to verify that both the Client and
the client-asserted identity of the requestor are authorized to perform the
operation. This relatively simple change to the Agent and the protocol might
permit a considerably simpler client (if this brokered-request behavior is
actually common).
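The double check suggested above could look roughly like the following sketch. The table ALLOWED, the identities, and the operation names are all hypothetical; the draft does not define an authorization format.

```python
from typing import Optional

# Hypothetical authorization table of (identity, operation) pairs.
# Nothing here is taken from the draft; it is for illustration only.
ALLOWED = {
    ("broker-client", "write-route"),
    ("broker-client", "read-route"),
    ("alice", "write-route"),
}

def is_authorized(identity: str, operation: str) -> bool:
    return (identity, operation) in ALLOWED

def agent_check(client_id: str, requestor_id: Optional[str], operation: str) -> bool:
    """Permit the operation only if the authenticated Client and, when one is
    supplied, the client-asserted requestor are both authorized."""
    if not is_authorized(client_id, operation):
        return False
    if requestor_id is None:
        # The Client is acting on its own behalf, not as a broker.
        return True
    return is_authorized(requestor_id, operation)
```

With this shape the broker Client no longer needs its own copy of the per-requestor authorization data; it only forwards the requestor identity and lets the Agent decide.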

Section 1.1 says I2RS is described as an asynchronous programmatic
interface. Asynchronous usually means that you can launch operations and
then check back later whether they successfully completed. If you want to
execute a second operation only if a first succeeds (or to guarantee the
order in which they execute), you need to at some point wait for operations
to complete. There is also substantial overhead in supporting asynchronous
operation in that all transactions need labels so that they can be queried.
Have you done that? A conceptually simpler strategy is to say that, since a
client can make multiple parallel connections to an agent, in cases
where a client wants asynchronous operation he opens multiple connections
and launches one asynchronous operation on each. The cost is that it has
lower performance in cases where there are large numbers of parallel
operations tying up lots of connection state.
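To show what the labeling overhead amounts to, here is a minimal sketch of an agent that hands out a label per operation and lets the client query it later. The class AsyncAgent and its methods are hypothetical and are not part of any I2RS specification.

```python
import itertools

class AsyncAgent:
    """Toy agent: operations are submitted now and executed later."""

    def __init__(self):
        self._labels = itertools.count(1)   # source of transaction labels
        self._pending = {}                  # label -> callable not yet run
        self._results = {}                  # label -> (status, value)

    def submit(self, operation):
        """Queue an operation and return the label used to query it."""
        label = next(self._labels)
        self._pending[label] = operation
        return label

    def run_pending(self):
        """Execute everything queued so far, recording success or failure."""
        for label, operation in list(self._pending.items()):
            try:
                self._results[label] = ("ok", operation())
            except Exception as exc:
                self._results[label] = ("error", str(exc))
            del self._pending[label]

    def status(self, label):
        """Return ("pending", None), a recorded result, or ("unknown", None)."""
        if label in self._pending:
            return ("pending", None)
        return self._results.get(label, ("unknown", None))

agent = AsyncAgent()
first = agent.submit(lambda: 1 + 1)
before = agent.status(first)
agent.run_pending()
after = agent.status(first)
```

A client that wants a second operation to run only if the first succeeded has to poll status (or wait for a notification) on the first label before submitting the second, which is the waiting described above.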

Section 6.2: The restriction that this protocol injects only ephemeral state
seems surprising, especially given that the circumstances under which the
ephemeral state is lost are defined in terms of a network device reboot.
Some network devices may not have a clear notion of a reboot, or might do it
so rarely as to render such functionality useless. I was confused by the
discussion of agent reboots vs. device reboots. The first paragraph seems to
say that ephemeral state is lost when the device reboots, but 6.2.1 seems to
imply that state is lost when the agent reboots. The sentence "Just as
routing state will usually be removed shortly after the failure is
detected..." seems to imply that ephemeral state might be lost when a client
reboots. Have you considered what happens to state when a client disappears
but the agent and server stay around forever? There is an option later in
the document for some sort of timeout, but I would think there would be some
sort of mechanism to guarantee that all ephemeral state disappears
eventually unless the requestor is still around implicitly renewing it.
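One way to get the "disappears unless renewed" behavior is a lease on every piece of ephemeral state. The following is a sketch only; EphemeralStore, the lease length, and the explicit now argument (used instead of a real clock so the behavior is easy to follow) are all assumptions of mine, not anything in the draft.

```python
class EphemeralStore:
    """Toy store in which every entry expires unless its owner renews it."""

    def __init__(self, lease_seconds: float):
        self.lease_seconds = lease_seconds
        self._entries = {}  # key -> (value, owner, expires_at)

    def write(self, key, value, owner, now):
        self._entries[key] = (value, owner, now + self.lease_seconds)

    def renew(self, owner, now):
        """Any sign of life from the owner extends all of its leases."""
        for key, (value, entry_owner, _) in list(self._entries.items()):
            if entry_owner == owner:
                self._entries[key] = (value, entry_owner, now + self.lease_seconds)

    def expire(self, now):
        """Drop entries whose lease has run out; return the removed keys."""
        expired = [k for k, (_, _, exp) in self._entries.items() if exp <= now]
        for key in expired:
            del self._entries[key]
        return expired

    def get(self, key):
        entry = self._entries.get(key)
        return entry[0] if entry else None

store = EphemeralStore(lease_seconds=30)
store.write("route-1", "via A", owner="client-1", now=0)
store.write("route-2", "via B", owner="client-2", now=0)
store.renew("client-1", now=20)      # client-1 is still around
removed = store.expire(now=35)       # client-2 never renewed
```

Here the state written by the client that vanished is cleaned up without any reboot of the device or the agent.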

Also in 6.2.1, it appears that one piece of state is explicitly not
ephemeral... the agent keeps a non-ephemeral list of clients to notify when
ephemeral state is lost. If the client is not accessible, for how long does
the agent continue to try to contact it? Forever?

The protocol requires that agents be able to open connections to clients (in
addition to clients being able to open connections to agents). This will
introduce lots of challenges. It means the client needs an open port to
accept connections, likely an SSL certificate, and will be in trouble if it
is behind a NAT or is mobile and does not have a stable IP address. Other
parts of the spec mention that two entities might have the same client
identity. In such cases, it will be tricky for the agent to connect to "the
right instance". It might be better to only allow clients to initiate
connections to agents, possibly with some sort of unauthenticated
notification from agent to client that initiating such a connection would be
a good idea (to reduce the overhead of the polling that would otherwise be
necessary).

My first question when I started reading this document was why do we need a
new protocol. Wouldn't SNMP or NETCONF do this just fine? And there are
probably lots of others. Section 3.1 says "There have been many efforts over
the years to improve the access to the information available to the routing
and forwarding system." It would be good to understand why those efforts
failed before inventing some new syntax (when it is unlikely the syntax is
what killed previous efforts). Then section 7.1 says the protocol will be
"based on" NETCONF and RESTCONF. What does "based on" mean in this context?

Section 7.8 talks about "collisions", but it wasn't clear (at least to me)
whether these were collisions in the time sense where two requests are made
simultaneously by different clients vs. whether it is a case where one
client tries to override the setting of another client. I also wonder
whether there are cases where two changes would interact in some way other
than one of them winning, as when two clients each want to increment the
bandwidth of some virtual link over which they are both tunneling traffic
(and where the correct result is to add the two increments).
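The difference between "one of them wins" and "the changes combine" can be shown with a small sketch. Both functions and the priority field are hypothetical; I am not claiming this is how the draft resolves collisions.

```python
def resolve_override(requests):
    """requests is a list of (priority, absolute_value) pairs.
    The highest priority wins; on a tie the earlier request is kept."""
    best = requests[0]
    for request in requests[1:]:
        if request[0] > best[0]:
            best = request
    return best[1]

def resolve_additive(base, increments):
    """Treat each request as a delta and apply all of them."""
    return base + sum(increments)

# Two clients share a virtual link currently provisioned at 100 units.
# One wants 50 more, the other wants 20 more.
base = 100
override_result = resolve_override([(5, base + 50), (5, base + 20)])
additive_result = resolve_additive(base, [50, 20])
```

With override semantics the link ends up at 150 and the second client's traffic is not accounted for; with additive semantics it ends up at 170, which is what both clients together actually need.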

The last paragraph of 7.9 says "the protocol will include an explicit reply
to modification or write operations even when they fully succeed". How does
this relate to the asynchronous nature of the protocol?

Good luck with this!

	--Charlie