Last Call Review of draft-ietf-pcp-authentication-11
review-ietf-pcp-authentication-11-genart-lc-kyzivat-2015-07-02-00

Request Review of draft-ietf-pcp-authentication
Requested rev. no specific revision (document currently at 14)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2015-06-29
Requested 2015-06-18
Authors Margaret Cullen, Sam Hartman, Dacheng Zhang, Tirumaleswar Reddy.K
Draft last updated 2015-07-02
Completed reviews Genart Last Call review of -11 by Paul Kyzivat (diff)
Genart Telechat review of -13 by Paul Kyzivat (diff)
Opsdir Last Call review of -11 by Jouni Korhonen (diff)
Assignment Reviewer Paul Kyzivat 
State Completed
Review review-ietf-pcp-authentication-11-genart-lc-kyzivat-2015-07-02
Reviewed rev. 11 (document currently at 14)
Review result Ready with Issues
Review completed: 2015-07-02

Review
review-ietf-pcp-authentication-11-genart-lc-kyzivat-2015-07-02

I am the assigned Gen-ART reviewer for this draft. For background on 
Gen-ART, please see the FAQ at 
<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other comments you may receive.

This draft is on the right track but has open issues, described in this 
review.

For the most part my issues are with confusing and/or ambiguous language 
and under-specification. This means some clued in implementers will be 
able to create functional, interoperable, implementations. But others, 
working just from the specification, may build arguably conforming 
implementations that fail to interoperate.

I've opted not to split major and minor issues out separately, because I 
thought it better to put things in document order. But please take 
special note of my comments on sections 3.1.1 and 3.1.2 that highlight 
what I believe to be a true protocol errors. I did note those points I 
consider nits.

Disclaimer: I do not claim to be a security expert. The authors are 
vastly more qualified than I on security matters. So I have not tried to 
discern if there are technical security holes in this specification.

I’ll start with some general comments, and then follow with specific 
comments on particular parts of the document.

	Thanks,
	Paul Kyzivat

* General Comments:

I assume that a reader of this document is reasonably familiar with the 
RFCs that specify PCP and EAP. (When I began this review, I was not, so 
I read them before starting this review.) I apologize if I have made 
technical mistakes in my understanding of those protocols.

There is a fundamental inconsistency of approach between PCP and EAP. 
Both are client/server protocols, but for most of the expected usage a 
PCP client will be the EAP server and visa versa. So this draft works 
hard to reconcile this difference. I *think* it has, for the most part, 
done so *technically*, but not so well with the terminology and 
language. For example, while reading the draft it was a constant 
struggle to understand whether a request is a PCP request containing 
response pertaining to EAP, or a request pertaining to EAP contained in 
a PCP response.

I have another concern about the big picture of the protocol. There is 
much material about message level details, but not much about how they 
all play together. Some things I had trouble with:

- Once a PA session is initiated, may common PCP messages that are not 
part of that session continue to be exchanged?

- Is it permissible to have more than one concurrent PA session active 
between the same PCP client and server? These might be multiple sessions 
created with the same credentials, or using different credentials. (E.g. 
to support both a "regular" and "superuser" level of privilege.)

- If multiple sessions are permitted, how to avoid accumulating many 
active sessions?

To address this, I would hope to see some state models and some example 
message sequence diagrams that show how this plays out in practice.

(Consider: PCP client starts without credentials. At some point it tries 
a request that requires authentication. It does so using "normal" 
credentials. Later, within that PA Session it tries a request that 
requires "superuser" authentication. How would this be indicated? How 
would the new authentication take place? How would the client know what 
credentials to use? How to revert back to "normal" privilege?)

Another thing: it appears to me that this document makes a change to the 
PCP protocol that was not anticipated as an extension point in RFC6887. 
Specifically, it uses part of the "Reserved" portion of the PCP request 
message. There should be no technical problem with doing this, but I 
think this means that this document should be marked as updating 
RFC6887. Ideally 6887 would be revised, splitting off the last 8 bits of 
the Reserved field as a separate field that is available for 
opcode-specific use. That would eliminate the possibility that some 
future revision of 6687 would assign that field for some other use.

Regarding lifetime: All PCP messages carry a lifetime field. But this 
seems to differ from the "session lifetime" that is passed in an option. 
I find *no* mention of the PCP lifetime in this document. IMO there 
should be *some* statement of what to do with that. ISTM that there are 
two distinct but similar things here:

- The lifetime of a PA session (validity of the session ID). Presumably 
it has to become valid upon the first PA message from the PA-Client, and 
it remains valid through initial authentication (or until failure of 
that) and through any and all re-authentications. The PA-Server can’t 
send PA messages to the PA-Client without having this session. So it 
needs to know how long it will remain valid. And this lifetime should be 
bounded so that an authentication attempt that never completed doesn’t 
leave a session forever. This is quite analogous to the lifetime of a 
mapping, and could conceivably be managed the same way, using the 
lifetime field in the PCP messages.

- The lifetime of a PA SA and associated keys. Presumably this begins at 
the successful completion of the initial EAP negotiation, and is revised 
upon a successful re-authentication. This is needed for authenticating 
Common PCP messages within the PA session.

Currently these two seem to be conflated. Even if you are happy bundling 
these into one concept then still *something* should be said about how 
the PCP lifetime field is used.

* Section 2:

Re PCP-Authentication (PA) message:

    PCP-Authentication (PA) message: A PCP message containing an
    Authentication Opcode.  Particularly, a PA message sent from a PCP
    server to a PCP client is referred to as a PA-Server, while a PA
    message sent from a PCP client to a PCP server is referred to as a
    PA-Client.  Therefore, a PA-Server is actually a PCP response message
    specified in [RFC6887], and a PA-Client is a PCP request message.

This terminology seems odd. It seems to say that the term "PA-Server" 
describes a *message*. But a name like that should seemingly describe a 
type of server, not a message. I suggest not doing this – instead, use 
"PA-Client message" and "PA-Server message" when referring to such 
messages. Instead, define and use "PA-Client" and "PA-Server" 
analogously to PCP Client and PCP Server.

(Could there be cases where the PCP Client also needs to authenticate 
the PCP Server, and the EAP method used doesn’t support mutual 
authentication? I don’t think the existing text contemplates that 
possibility, even though RFC3748 does. Supporting that would require a 
lot of changes.)

Re Common PCP message:

    Common PCP message: A PCP message which does not contain an
    Authentication Opcode.  This document specifies an Authentication Tag
    Option to provide integrity protection and message origin
    authentication for the common PCP messages.

It isn’t clear to me if this term is intended to cover all PCP messages 
not containing the Authentication opcode, or only messages that contain 
the Authentication Tag Option. The usages in the document seem to make 
different assumptions about this. I’ve assumed that it covers messages 
both with and without that option unless explicitly qualified. It would 
be helpful to have distinct terms for those with and without.

PCP device: This term is not defined in this terminology section, or 
anywhere else I can find, in this document or in RFC6887. But it is used 
frequently. IIUC it means "PCP Client or PCP Server" and is closely 
related to Session Partner. There ought to be a definition of it, or if 
appropriate, usages could be replaced with Session Partner.

* Section 3.1:

    ... Each PA message is attached with an Authentication Opcode ...

This language is awkward/confusing. I *think* you mean:

    ... Each PA message MUST contain an Authentication Opcode

(Actually, all occurrences of "attached with" deserve a look.)

Then the following:

    ... The Authentication Opcode consists of two fields: Session ID and
    Sequence Number. ...

Again this language seems odd: a PCP opcode doesn’t contain fields like 
this, though it may be *accompanied* by such fields. After reading 
ahead, I guess you mean:

    ... The opcode-specific information of each PA message contains
    two fields: Session ID and Sequence Number. ...".

* Section 3.1.1:

In the following:

    When a PCP client intends to proactively initiate a PA session with a
    PCP server, it sends a PA-Initiation message (a PA-Client message
    with the result code "INITIATION") to the PCP server.

It seems odd to have a "result code" in a *request*. (I would expect it 
to be in a *response*.) It does make *some* sense when what is being 
sent is a response to a prior EAP request, but that is not the case 
here. Here it is serving as a sub-opcode.

ISTM that using the PCP result code in PCP requests is confusing, in 
addition to being contrary to the definition of the PCP request message. 
I suggest refining the terminology and language around this. E.g., 
define appropriate new names for these fields and alias them to the PCP 
fields.

Then:

    ... From now on, every PCP message within this session
    will be attached with this session identifier. ...

Again the language usage is odd, and apparently non-normative. And what 
does "within this session" mean? This document defines PA Session, so it 
could mean that. Or it could mean PCP Session – except that AFAICT PCP 
has no notion of a PCP session.

If it means PA Session, then the statement is odd because it is a 
truism. If that is the intent, then I suggest rewording it as:

    ... Subsequent PCP messages to be included in this PA Session MUST 
contain this session identifier. ...

If this is intended to mean that once a PA Session is established then 
*all* messages must be within it, then more substantial document changes 
are needed.

Then:

    ... If the PCP client intends to simplify the
    authentication process, it MAY append an EAP Identity Response
    message within the PA-Initiation message so as to inform the PCP
    server that it would like to perform EAP authentication and skip the
    step of waiting for the EAP Identity Request.

I don’t understand how this can work. IIUC the intent is to tunnel EAP 
within PCP, not to change EAP. AFAICT EAP has no provision for sending 
an EAP response message without having first received a corresponding 
request. The response message will have to contain an EAP Identifier, 
which is normally assigned by the Authenticator. In this case that 
wouldn’t be available, so the PCP client would have to make up a value. 
The authenticator won’t be expecting a response for this identifier. (Or 
worse, it may have just happened to send a request using this identifier.)

If this behavior is desired, then I think an extension to EAP is needed 
to allow an EAP peer to preemptively provide identity. (I apologize if I 
have misunderstood EAP on this point.)

Then, regarding the message sequence diagram:

(nit): the message sequence diagram uses "EAP request" and "EAP 
response" while the text uses "EAP identity request" and "EAP identity 
response". Consistent terminology should be used.

* Section 3.1.2:

I have been studying the message sequence diagram in this section along 
with RFC6887. As best I can understand, this is not a valid sequence.

I see the PCP client sending a Common PCP request, but I see no matching 
PCP response message for that. (This should trigger retransmission of 
the request.) Instead, I see the PCP server sending a PA-Server message. 
That will have a different opcode, and so can’t be considered a response 
to the Common PCP message. If the PCP client has not previously sent a 
PA message (the case here) it is supposed to ignore this message.

I was *expecting* to see the PCP server send a PCP response to the 
Common PCP message, with result AUTHENTICATION-REQUIRED. Then, the PCP 
client to send a PA-Client message to initiate the PA session. That 
could be exactly like the sequence in 3.1.1, or it might be optimized 
somehow.

(I defer commenting on the text in this section pending clarification of 
these issues.)

* Section 3.1.3:

IIUC, PCP requires that the PCP client send a PA-client message before 
the PCP server can send any PA-server messages. Once that has been done 
the PCP server is allowed to send an arbitrary number of "responses" to 
that, enabling the exchange of EAP messages. The server messages must 
"match" the client message by opcode and opcode-specific data according 
to opcode-specific rules. (I think the intent for PA messages is that 
they must match on the session ID.)

If I’m right about this, it would be good to explain that before the 
content of this section. (It took me a lot of reading before I figured 
this out.)

Then:

    ... If a PCP device receives a PA message from its partner and
    cannot generate an EAP response immediately due to certain reasons
    (e.g., waiting for human input to construct a EAP message or waiting
    for the additional PA messages in order to construct a complete EAP
    message), the PCP device MUST reply with a PA-Acknowledgement message
    (PA message with a Received Packet Option) ...

The use of "PCP device" suggests this is intended to apply to both the 
PCP client and server. But the use of "generate an EAP response" implies 
that this must be the PA-Client, since only it sends EAP responses. And 
that is consistent with sending a PA-Acknowledgement (request) message. 
If I’m right about this, then perhaps the following would be a clearer 
wording of the above:

    ... If a PA-Client receives a PA message containing an EAP request
    and cannot generate an EAP response immediately due to certain
    reasons (e.g., waiting for human input to construct a EAP message
    or waiting for the additional PA messages in order to construct a
    complete EAP message), the PA-Client MUST send a PA-Acknowledgement
    message (PA message with a Received Packet Option) ...

Then:

    In this approach, it is mandated for a PCP client and a PCP server to
    perform a key-generating EAP method in authentication.  Particularly,
    a PCP authentication implementation MUST support EAP-TTLS [RFC5281]
    and SHOULD support TEAP [RFC7170].  Therefore, after a successful
    authentication procedure, a Master Session Key (MSK) will be
    generated.  If the PCP client and the PCP server want to generate a
    transport key using the MSK, they need to agree upon a Pseudo-Random
    Function (PRF) for the transport key derivation and a MAC algorithm
    to provide data origin authentication for subsequent PCP messages.
    In order to do this, the PCP server needs to append a set of PRF
    Options and MAC Algorithm Options to the initial PA-Server message.
    Each PRF Option contains a PRF that the PCP server supports, and each
    MAC Algorithm Option contains a MAC (Message Authentication Code)
    algorithm that the PCP server supports.  Moreover, in the first PA-
    Server message, the server MAY also attach an ID Indicator Option
    defined in Section 5.11 to direct the client to choose correct
    credentials.  After receiving the options, the PCP client selects the
    PRF and the MAC algorithm which it would like to use, and then adds
    the associated PRF and MAC Algorithm Options to the next PA-Client
    message.

In the above there appears to be quite a bit of normative behavior that 
has no 2119 language. (E.g., "it is mandated", "will be generated", "the 
PCP client selects ... then adds".) IMO this should be tightened up with 
2119 language.

The last two paragraphs of this section explain, in part, the use of 
messages with result codes AUTHENTICATION-SUCCEEDED, 
AUTHENTICATION-FAILED, and DOWGRADE-ATTACK-DETECTED. Section 3.1.1 
explains the use of AUTHENTICATION-REQUIRED. Elsewhere is explanation of 
SESSION-TERMINATED. Of those, it appears that only 
AUTHENTICATION-REQUIRED is used with messages carrying EAP messages, and 
then I only see mention of it being used with the *first* EAP message. 
Is AUTHENTICATION-REQUIRED to be used with all PA messages that contain 
EAP messages? If so, that should be specified. If not, then what should 
be used with the others?

Those two paragraphs also say to "terminate the session" in some error 
cases. Then the following section 3.2 talks about Session Termination. I 
*guess* that in the error cases you mean that local session state is 
dropped without sending a SESSION-TERMINATED request. It would be good 
to be clear about that.

* Section 3.2:

This says that upon receiving a SESSION-TERMINATED from a partner one 
must also send a SESSION-TERMINATED. Why? That could lead to an infinite 
exchange of messages. Also, once the PA SA is removed, any subsequently 
received PA message won’t match any session and so will presumably be 
dropped. More clarity here would help.

* Section 3.3:

    A session partner may select to perform EAP re-authentication if it
    would like to update the PCP SA without initiating a new PA session.
    A re-authentication procedure could be triggered for the following
    reasons:

    o  The session lifetime needs to be extended.

    o  The sequence number is going to reach the maximum value.
       Specifically, when the sequence number reaches 2**32 - 2**16, the
       session partner MUST trigger re-authentication.

(nit): IMO this would read better if you replace "select" with "elect".

Also, it is unclear (to me) whether the stated reasons are intended to 
be exhaustive, or are simply examples. I *guess* that these are 
examples, and that re-authentication can also be requested at whim. 
Would be good to be more explicit about this.

Also, what happens if re-authentication is not attempted and the session 
lifetime is reached? Is there a requirement to silently terminate the 
session?

Do the PA messages containing RE-AUTHENTICATION contain EAP messages? Or 
do they trigger sending EAP messages within PA messages having some 
other reason code?

What if "glare" occurs? (Both client and server decide to send 
RE-AUTHENTICATION at the same time?)

* Section 4:

    At the beginning of a PA session, a session SHOULD generate a PA SA
    to maintain its state information during the session.  The parameters
    of a PA SA are listed as follows:

Why SHOULD rather than MUST? How can any of this work if the SHOULD is 
violated? ISTM the only case would be if the PA-Server refuses the 
initial PA request.

And I'll restate something I mentioned above: is it permissible for a 
PCP session to have more than one concurrently active PA sessions? If 
so, each will have its own state.

Included in the listed state are *four* sequence numbers. But there is 
little mention of these elsewhere in the document. Many places simply 
say things like: "A sequence number needs to be incremented", "The 
sequence number of the last received PCP message". (Sections 6.4 and 6.5 
are more explicit.) It would be helpful for the places that are vague 
about this to be more explicit.

* Section 5.1:

I'll restate that it seems confusing to define "response codes" that are 
not identifying *responses*.

I guess that because the PA messages containing EAP messages invert the 
roles of client and server (wrt Common PCP messages), it might make 
sense to have response codes in PCP request messages that happen to be 
EAP response messages. But then it is wrong terminology for PCP response 
messages that contain EAP requests.

I'm not quite sure how to recommend clarifying this. The whole situation 
is set up for confusion of roles, so it does require considerable care 
to make clear. (*I* would be inclined to use separate fields for PCP 
response codes, EAP response codes, and for "sub-opcodes" of PCP 
opcodes. But you have thought about it more than I have.)

* Section 5.2:

The section title is inappropriate – the section is not about the 
Authentication Opcode, it is about information that is specific and 
common to all messages (requests and responses) containing the 
Authentication opcode. I suggest retitling this "Opcode-specific 
information of PCP Auth Messages".

Also, it says:

       ... A
       sequence number needs to be incremented on every new (non-
       retransmission) outgoing message in order to provide an ordering
       guarantee for PCP messages.

This could be clearer – stating that the sequence number to use is 
different for requests and responses. (And name which ones.)

* Section 5.3:

Is the following intended to be normative?

    If the PA-Server message does not carry the correct nonce,
    the message will be discarded silently.

If so it should use 2119 language. Perhaps this section isn’t the proper 
place for that normative language, but I don’t find it anywhere else.

* Section 5.4:

As in section 5.2, this would benefit from describing which sequence 
numbers are to be used.

* Section 5.11:

       ... The two indicator
       strings are to be considered equivalent by the client if they are
       an exact octet-for-octet match.

Is this intended to mean "if *and only if* they are an exact match"? Or 
is the client being given the option of accepting a looser match if it 
wishes? Right now that seems ambiguous, so it would be good to tighten 
up the language to be explicit about what you mean.

* Section 6.1:

In this section there are separate bullets for PA messages and Common 
PCP messages. Both talk about appending an "Authentication Tag Option". 
But there are different options for PA messages and common messages. 
ISTM the text here should reiterate that.

* Section 6.6:

This says that EAP lower layers indicate to EAP methods the MTU of the 
lower layer. But doesn’t *this* protocol provide the "lower layer" that 
EAP is running over? So isn’t this protocol responsible for determining 
the MTU of the layer *it* runs over, adjust that for the overhead it 
adds, and providing that to the EAP layer?

* Section 7:

All opcodes, result codes, and options currently registered with IANA 
have names formatted as upper case tokens that are legal C-language 
identifiers. (Using underscore rather than dash.) For consistency the 
new ones defined in this document ought to follow that form. The new 
response codes are close, but use dashes. The others are currently 
phrases. The new option names read more like descriptions. (Those may be 
appropriate in the Purpose field.) If this change is made, then the 
mnemonics can be used throughout the text. That would be clearer.

All of the options registered here are to be allocated in the 
mandatory-to-process range. Yet I *guess* that everything in this draft 
is optional for PCP. I guess that there is a backward compatibility 
requirement – that implementations PCP with this extension interoperate 
with basic 6887 implementations. Doesn’t that mean that all the things 
registered here should be optional to process by PCP (though still 
mandatory to process for those implementing this specification.)

It would be good to have numbered subsections for each distinct 
registry, and maybe for each distinct value, so that they show up in the 
TOC. It would also be nice to format these so that it is clear where one 
element ends and the next one starts. (Right now it is very hard to read.)