IPsec Working Group                                          R. Housley
Internet Draft                                         RSA Laboratories
expires in six months                                         July 2002


                 Using AES Counter Mode With IPsec ESP
                 <draft-ietf-ipsec-ciph-aes-ctr-00.txt>

Status of this Memo

  This document is an Internet-Draft and is in full conformance with all
  provisions of Section 10 of RFC2026.

  Internet-Drafts are working documents of the Internet Engineering Task
  Force (IETF), its areas, and its working groups.  Note that other
  groups may also distribute working documents as Internet-Drafts.

  Internet-Drafts are draft documents valid for a maximum of six months
  and may be updated, replaced, or obsoleted by other documents at any
  time.  It is inappropriate to use Internet-Drafts as reference
  material or to cite them other than as "work in progress."

  The list of current Internet-Drafts can be accessed at
  http://www.ietf.org/ietf/1id-abstracts.txt

  The list of Internet-Draft Shadow Directories can be accessed at
  http://www.ietf.org/shadow.html.

  This document is a submission to the IETF Internet Protocol Security
  (IPsec) Working Group. Please send comments on this document to the
  working group mailing list (ipsec@lists.tislabs.com).

  Distribution of this memo is unlimited.

Abstract

  This document describes the use of AES Counter Mode, with an explicit
  initialization vector, as an IPsec Encapsulating Security Payload
  confidentiality mechanism.












Housley                                                         [Page 1]


INTERNET DRAFT                                                 July 2002


1. Introduction

  The National Institute of Standards and Technology (NIST) recently
  selected the Advanced Encryption Standard (AES) [AES], also known as
  Rijndael.  The AES is a block cipher, and it can be used in many
  different modes.  This document describes the use of AES Counter Mode
  (AES-CTR), with an explicit initialization vector (IV), as an IPsec
  Encapsulating Security Payload (ESP) [ESP] confidentiality mechanism.

  This document does not provide an overview of IPsec.  However,
  information about the various components of IPsec and the way in
  which they collectively provide security services is available in
  [ARCH] and [ROAD].

1.1. Conventions Used In This Document

  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
  "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
  document are to be interpreted as described in [STDWORDS].

2. AES Block Cipher

  This section contains a brief description of the relevant
  characteristics of the AES block cipher.  Implementation requirements
  are also discussed.

2.1. Counter Mode

  NIST has defined five modes of operation for AES and other FIPS-
  approved block ciphers [MODES].  Each of these modes has different
  characteristics.  The five modes are: ECB (Electronic Code Book), CBC
  (Cipher Block Chaining), CFB (Cipher FeedBack), OFB (Output FeedBack),
  and CTR (Counter).

  In this specification, only AES-CTR is discussed.  This mode requires
  the encryptor to generate a unique per-packet value, and communicate
  this value to the decryptor.  This specification calls this per-packet
  value an initialization vector (IV).  The same IV and key combination
  MUST NOT be used more than once.  The encryptor can generate the IV in
  any manner that ensures uniqueness.  Common approaches to IV
  generation include incrementing a counter for each packet and linear
  feedback shift registers (LFSRs).

  AES Counter mode (AES-CTR) has many properties that make it an
  attractive encryption algorithm for high-speed networking.  AES-CTR
  uses the AES block cipher to create a stream cipher.  It is easy to
  implement, and it is parallelizable.  It can take advantage of
  pipelining.  Further, it uses only the AES encrypt operation (for
  both encryption and decryption), making AES-CTR implementations
  smaller than implementations of many other AES modes.  When used
  correctly, AES-CTR provides a high level of confidentiality.

  Unfortunately, AES-CTR is easy to use incorrectly.  Because AES-CTR
  is a stream cipher, any reuse of the per-packet value, called the IV,
  with the same key is catastrophic.  An IV collision immediately leaks
  information about the plaintext in both packets.  For this reason, it
  is inappropriate to use this mode of operation with statically
  configured keys.  Extraordinary measures would be needed to prevent
  reuse of an IV value with the static key across power cycles.  To be
  safe, implementations MUST use fresh keys with AES-CTR.  The Internet
  Key Exchange (IKE) [IKE] protocol can be used to establish fresh keys.

  With AES-CTR, it is trivial to use a valid ciphertext to forge other
  (valid to the decryptor) ciphertexts.  Thus, it is equally
  catastrophic to use AES-CTR without a companion authentication
  function.  To be safe, implementations MUST use AES-CTR in conjunction
  with an authentication function, such as HMAC-SHA-1-96 [HMAC-SHA].

  To encrypt a payload with AES-CTR, the encryptor partitions the
  plaintext, PT, into 128-bit blocks.  The final block need not be a
  full 128 bits.

      PT = PT[1] PT[2] ... PT[n]

   Each block of PT is then XORed with a block of the key stream to
   generate the ciphertext, CT.  The AES encryption of each counter
   block results in 128 bits of key stream.  Part of the 128-bit counter
   block is set to the per-packet IV value, and the least significant 32
   bits of the counter block are initially set to zero.  This counter
   value is incremented by one to generate subsequent counter blocks,
   each resulting in another 128 bits of key stream.  The encryption of
   n plaintext blocks can be summarized as:

      CTRBLK := IV || ZERO
      FOR i := 1 to n-1 DO
        CT[i] := PT[i] XOR AES(CTRBLK)
        CTRBLK := CTRBLK + 1
      END
      CT[n] := PT[n] XOR TRUNC(AES(CTRBLK))

   The TRUNC() function truncates the output of the AES encrypt
   operation to the same length as the final plaintext block, returning
   the most significant bits.
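
   The loop above can be sketched in Python as follows.  This is an
   illustration only, under stated assumptions: the names ctr_xor and
   placeholder_block_encrypt are hypothetical, and the placeholder
   function is NOT AES.  It is a hash-based stand-in used so that the
   sketch runs with the standard library alone; a real implementation
   would pass in the AES encrypt operation.  The sketch increments only
   the least significant 32 bits of the counter block.

```python
import hashlib

BLOCK_OCTETS = 16  # AES block size: 128 bits


def placeholder_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for the AES encrypt operation.  NOT AES and NOT secure;
    it only maps a 16-octet block to 16 pseudorandom octets."""
    assert len(block) == BLOCK_OCTETS
    return hashlib.sha256(key + block).digest()[:BLOCK_OCTETS]


def ctr_xor(key: bytes, initial_ctrblk: bytes, data: bytes,
            block_encrypt=placeholder_block_encrypt) -> bytes:
    """XOR data with the key stream made from successive counter blocks."""
    if len(initial_ctrblk) != BLOCK_OCTETS:
        raise ValueError("counter block must be 16 octets")
    prefix = initial_ctrblk[:12]
    counter = int.from_bytes(initial_ctrblk[12:], "big")
    out = bytearray()
    for offset in range(0, len(data), BLOCK_OCTETS):
        if counter > 0xFFFFFFFF:
            raise OverflowError("32-bit block counter exhausted")
        chunk = data[offset:offset + BLOCK_OCTETS]
        keystream = block_encrypt(key, prefix + counter.to_bytes(4, "big"))
        # zip() stops at the shorter input, which plays the role of
        # TRUNC() for a final block shorter than 128 bits.
        out.extend(p ^ k for p, k in zip(chunk, keystream))
        counter += 1  # CTRBLK := CTRBLK + 1
    return bytes(out)


key = bytes(16)                                  # example key, all zero
ctrblk = bytes(4) + bytes(range(8)) + bytes(4)   # example counter block
pt = b"a payload that is not a multiple of 16 octets"
ct = ctr_xor(key, ctrblk, pt)
```

   Because the operation is an XOR with the key stream, applying
   ctr_xor to the ciphertext with the same key and counter block
   recovers the plaintext, and the ciphertext has the same length as
   the plaintext.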

   Decryption is similar.  The decryption of n ciphertext blocks can be
   summarized as:

      CTRBLK := IV || ZERO
      FOR i := 1 to n-1 DO
        PT[i] := CT[i] XOR AES(CTRBLK)
        CTRBLK := CTRBLK + 1
      END
      PT[n] := CT[n] XOR TRUNC(AES(CTRBLK))

2.2. Key Size and Rounds

   AES supports three key sizes: 128 bits, 192 bits, and 256 bits.  The
   default key size is 128 bits, and all implementations MUST support
   this key size.  Implementations MAY also support key sizes of 192
   bits and 256 bits.

   AES uses a different number of rounds for each of the defined key
   sizes.  When a 128-bit key is used, implementations MUST use 10
   rounds.  When a 192-bit key is used, implementations MUST use 12
   rounds.  When a 256-bit key is used, implementations MUST use 14
   rounds.
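
   The key size and round requirements above amount to a small lookup
   table.  The following sketch shows one way to express it; the names
   AES_ROUNDS_BY_KEY_BITS and rounds_for_key are hypothetical and are
   used only for illustration.

```python
# Number of AES rounds required for each supported key size (in bits).
AES_ROUNDS_BY_KEY_BITS = {128: 10, 192: 12, 256: 14}


def rounds_for_key(key: bytes) -> int:
    """Return the number of rounds for a key, rejecting other sizes."""
    key_bits = len(key) * 8
    if key_bits not in AES_ROUNDS_BY_KEY_BITS:
        raise ValueError(f"unsupported AES key size: {key_bits} bits")
    return AES_ROUNDS_BY_KEY_BITS[key_bits]
```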

2.3. Block Size

   The AES has a block size of 128 bits (16 octets).  As such, when
   using AES-CTR, each AES encrypt operation generates 128 bits of key
   stream.  AES-CTR encryption is the XOR of the key stream with the
   plaintext.  AES-CTR decryption is the XOR of the key stream with the
   ciphertext.  If the generated key stream is longer than the plaintext
   or ciphertext, the extra key stream bits are simply discarded.  For
   this reason, AES-CTR does not require the plaintext to be padded to a
   multiple of the block size.  However, to provide limited traffic flow
   confidentiality, padding MAY be included, as specified in [ESP].

3. ESP Payload

   The ESP payload consists of the IV followed by the ciphertext.
   The payload field, as defined in [ESP], is structured as shown in
   Figure 1.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     Initialization Vector                     |
      |                            (8 octets)                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                  Encrypted Payload (variable)                 ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                 Authentication Data (variable)                ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 1.  ESP Payload Encrypted with AES-CTR

3.1. Initialization Vector

   The AES-CTR IV field MUST be eight octets.  The IV MUST be chosen by
   the encryptor in a manner that ensures that the same IV value is used
   only once for a given key.  The encryptor can generate the IV in any
   manner that ensures uniqueness.  Common approaches to IV generation
   include incrementing a counter for each packet and linear feedback
   shift registers (LFSRs).

   Including the IV in each packet ensures that the decryptor can
   generate the key stream needed for decryption, even when some
   datagrams are lost or reordered.
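
   As an illustration of the counter approach to IV generation, the
   following sketch hands out each 64-bit value at most once.  The
   class name CounterIVGenerator is hypothetical.  The sketch assumes
   that one generator instance is created per key and discarded with
   it; it does not address persistence across power cycles.

```python
import struct


class CounterIVGenerator:
    """Hands out each 64-bit IV value at most once for one key."""

    MAX_IV = 2**64 - 1

    def __init__(self) -> None:
        self._next = 0

    def next_iv(self) -> bytes:
        if self._next > self.MAX_IV:
            # Every 64-bit value has been used; a fresh key is needed
            # before another packet can be encrypted.
            raise RuntimeError("IV space exhausted; rekey required")
        iv = struct.pack("!Q", self._next)  # 8 octets, network order
        self._next += 1
        return iv
```

   Raising an error on exhaustion, rather than wrapping around, is the
   design choice that keeps the uniqueness requirement intact.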

3.2. Encrypted Payload

   The encrypted payload contains the ciphertext.

   AES-CTR mode does not require plaintext padding.  However, ESP does
   require padding to 32-bit word-align the authentication data.  The
   padding, Pad Length, and the Next Header MUST be concatenated with
   the plaintext before performing encryption, as described in [ESP].
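
   A minimal sketch of building the data to be encrypted follows.  The
   function name add_esp_trailer is hypothetical.  The sketch assumes
   that the encrypted data begins on a 32-bit boundary (the 8-octet ESP
   header and the 8-octet IV precede it), adds only the minimum
   padding, and fills the padding with the default 1, 2, 3, ...
   sequence from [ESP].

```python
def add_esp_trailer(payload: bytes, next_header: int) -> bytes:
    """Append Padding, Pad Length, and Next Header so that the result
    is a multiple of 4 octets, ready to be encrypted."""
    if not 0 <= next_header <= 255:
        raise ValueError("Next Header is a single octet")
    pad_len = -(len(payload) + 2) % 4        # 0 to 3 octets of padding
    padding = bytes(range(1, pad_len + 1))   # 1, 2, 3, ...
    return payload + padding + bytes([pad_len, next_header])


# A 5-octet payload needs one padding octet: 5 + 1 + 2 = 8 octets.
to_encrypt = add_esp_trailer(b"hello", 6)
```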

3.3. Authentication Data

   Since it is trivial to construct a forged AES-CTR ciphertext from a
   valid AES-CTR ciphertext, AES-CTR implementations MUST employ a non-
   NULL ESP authentication method.  HMAC-SHA-1-96 [HMAC-SHA] is a likely
   choice.

4. Counter Block Format

   Each packet conveys the IV that is necessary to construct the
   sequence of counter blocks used to generate the key stream necessary
   to decrypt the payload.  The AES counter block is 128 bits.  Figure 2
   shows the format of the counter block.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     Flags     |             Truncated SPI                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     Initialization Vector                     |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         Block Counter                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Figure 2.  Counter Block Format

   The components of the counter block are as follows:

      Flags
         The Flags field is 8 bits.  It MUST be set to zero.  The Flags
         field provides compatibility with CCM mode [CCM].

      Truncated SPI
         The truncated SPI field is 24 bits.  As the name implies, it
         contains the least significant 24 bits of the ESP SPI.

      Initialization Vector
         The IV field is 64 bits.  As described in section 3, the IV
         MUST be chosen by the encryptor in a manner that ensures that
         the same IV value is used only once for a given key.

      Block Counter
         The block counter field is the least significant 32 bits of the
         counter block.  The block counter begins with the value of
         zero, and it is incremented to generate subsequent portions of
         the key stream.  The block counter is a 32-bit big-endian
         integer value.

   The first 128-bit block of the packet plaintext is encrypted by
   XORing the plaintext block with the AES encryption of the counter
   block (with the block counter set to zero), the second is encrypted
   by XORing the second block of plaintext with AES encryption of the
   incremented counter block (with the block counter set to one), and so
   on.
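
   The counter block layout of Figure 2 can be sketched with the
   Python struct module.  The function name build_counter_block is
   hypothetical; the field sizes and ordering are taken from the
   figure.

```python
import struct


def build_counter_block(spi: int, iv: bytes, block_counter: int = 0) -> bytes:
    """Assemble Flags | Truncated SPI | IV | Block Counter (16 octets)."""
    if len(iv) != 8:
        raise ValueError("IV must be exactly 8 octets")
    if not 0 <= spi <= 0xFFFFFFFF:
        raise ValueError("SPI must fit in 32 bits")
    if not 0 <= block_counter <= 0xFFFFFFFF:
        raise ValueError("block counter must fit in 32 bits")
    flags = 0                              # MUST be zero
    truncated_spi = spi & 0x00FFFFFF       # least significant 24 bits
    first_word = (flags << 24) | truncated_spi
    # "!" selects network (big-endian) byte order with no padding.
    return struct.pack("!I8sI", first_word, iv, block_counter)


# Counter block for the second 128-bit block of a packet.
ctrblk = build_counter_block(0x12345678, bytes(range(8)), 1)
```

   In this example the most significant octet of the SPI (0x12) is
   dropped, and the resulting block is 00345678 0001020304050607
   00000001 in hexadecimal.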

   This construction permits each packet to consist of up to:

         2^32 blocks  =  4,294,967,296 blocks
                      = 68,719,476,736 octets

   This construction provides more key stream for each packet than is
   needed to handle any IPv6 Jumbogram.
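
   The arithmetic behind these figures can be checked directly.  The
   sketch below assumes that the largest Jumbogram payload is 2^32 - 1
   octets, since the Jumbo Payload Length is a 32-bit field; the
   variable names are illustrative.

```python
AES_BLOCK_OCTETS = 16

max_blocks = 2**32                            # 32-bit block counter
max_octets = max_blocks * AES_BLOCK_OCTETS    # key stream per packet
max_jumbogram_octets = 2**32 - 1              # 32-bit Jumbo Payload Length

# The available key stream is 2^36 octets, well above the largest
# Jumbogram payload.
assert max_octets == 2**36
assert max_octets > max_jumbogram_octets
```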


5. Test Vectors

   To be supplied.

6. Security Considerations

   When used properly, AES-CTR mode provides strong confidentiality.
   Bellare, Desai, Jokipii, and Rogaway show in [BDJR] that the privacy
   guarantees provided by counter mode are at least as strong as those
   for CBC mode when using the same block cipher.

   Unfortunately, it is very easy to misuse this counter mode.  If a
   counter value is ever used for more than one packet with the same
   key, then the same key stream will be used to encrypt both packets,
   and the confidentiality guarantees are voided.

   What happens if the encryptor XORs the same key stream with two
   different plaintexts?  Suppose two plaintext byte sequences P1, P2,
   P3 and Q1, Q2, Q3 are both encrypted with key stream K1, K2, K3.  The
   two corresponding ciphertexts are:

      (P1 XOR K1), (P2 XOR K2), (P3 XOR K3)

      (Q1 XOR K1), (Q2 XOR K2), (Q3 XOR K3)

   If both of these two ciphertext streams are exposed to an attacker,
   then a catastrophic failure of confidentiality results, since:

      (P1 XOR K1) XOR (Q1 XOR K1) = P1 XOR Q1
      (P2 XOR K2) XOR (Q2 XOR K2) = P2 XOR Q2
      (P3 XOR K3) XOR (Q3 XOR K3) = P3 XOR Q3

   Once the attacker obtains the two plaintexts XORed together, it is
   relatively straightforward to separate them.  Thus, using any stream
   cipher, including AES-CTR, to encrypt two plaintexts under the same
   key stream leaks the plaintext.
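
   The leak can be reproduced in a few lines.  In the sketch below the
   key stream octets and both plaintexts are arbitrary example values,
   and xor_bytes is a hypothetical helper.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length octet strings."""
    return bytes(x ^ y for x, y in zip(a, b))


keystream = bytes.fromhex("8f3a1c")   # K1, K2, K3 (reused by mistake)
p = b"YES"                            # P1, P2, P3
q = b"NO!"                            # Q1, Q2, Q3

c1 = xor_bytes(p, keystream)
c2 = xor_bytes(q, keystream)

# The key stream cancels out: the attacker learns P XOR Q without
# knowing the key.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(p, q)

# If either plaintext is known or guessed, the other follows at once.
assert xor_bytes(leak, p) == q
```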

   Therefore, stream ciphers, including AES-CTR, should not be used with
   statically configured keys.  Extraordinary measures would be needed
   to prevent reuse of a counter block value with the static key across
   power cycles.  To be safe, implementations MUST use fresh keys with
   AES-CTR.  The Internet Key Exchange (IKE) [IKE] protocol can be used
   to establish fresh keys.

   When IKE is used to establish fresh keys between two peer entities,
   separate keys are established for the two traffic flows.  If a
   different mechanism is used to establish fresh keys, one that
   establishes only a single key to encrypt packets, then there is a
   high probability that the peers will select the same IV values for
   some packets.  Thus, to avoid counter block collisions, ESP
   implementations that permit use of the same key for encrypting and
   decrypting packets with the same peer MUST ensure that the two peers
   assign different SPI values to the security association (SA).
   Further, since the counter block only contains the least significant
   24 bits of the SPI, such implementations MUST ensure that the two SPI
   values differ in the least significant 24 bits.

   Data forgery is trivial with CTR mode.  The demonstration of this
   attack is very similar to the discussion above.  If a known plaintext
   byte sequence P1, P2, P3 is encrypted with key stream K1, K2, K3,
   then the attacker can replace the plaintext with one of his own
   choosing.  The ciphertext is:

      (P1 XOR K1), (P2 XOR K2), (P3 XOR K3)

   The attacker simply XORs a selected sequence Q1, Q2, Q3 with the
   ciphertext to obtain:

      (Q1 XOR (P1 XOR K1)), (Q2 XOR (P2 XOR K2)), (Q3 XOR (P3 XOR K3))

   Which is the same as:

      ((Q1 XOR P1) XOR K1), ((Q2 XOR P2) XOR K2), ((Q3 XOR P3) XOR K3)

   Decryption of the attacker-generated ciphertext will yield exactly
   what the attacker intended:

      (Q1 XOR P1), (Q2 XOR P2), (Q3 XOR P3)

   Accordingly, ESP implementations MUST NOT allow the use of AES-CTR
   without ESP authentication.
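
   The forgery can be demonstrated concretely.  In the sketch below the
   key stream and the messages are arbitrary example values, and
   xor_bytes is a hypothetical helper.  The attacker never uses the key
   stream; it appears only so that the example can encrypt and decrypt.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length octet strings."""
    return bytes(x ^ y for x, y in zip(a, b))


keystream = bytes.fromhex("5ad1037be4c29f")   # unknown to the attacker
known_plaintext = b"PAY 100"                  # P, known to the attacker
ciphertext = xor_bytes(known_plaintext, keystream)

# The attacker picks Q = P XOR (desired plaintext) and XORs it into
# the ciphertext.
desired = b"PAY 999"
q = xor_bytes(known_plaintext, desired)
forged = xor_bytes(ciphertext, q)

# The decryptor XORs with the key stream and obtains Q XOR P, which is
# exactly the attacker's desired plaintext.
assert xor_bytes(forged, keystream) == desired
```

   Without an authentication function, the decryptor has no means to
   distinguish the forged ciphertext from a genuine one.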

   Additionally, AES with a 128-bit key is vulnerable to the birthday
   attack after 2^64 blocks are encrypted with a single key, regardless
   of the mode used.  Since ESP with Extended Sequence Numbers allows
   for up to 2^64 packets in a single security association (SA), there
   is real potential for more than 2^64 blocks to be encrypted with one
   key.  Implementations SHOULD generate a fresh key before 2^64 blocks
   are encrypted with the same key, or implementations SHOULD make use
   of the longer AES key sizes.  Note that ESP with 32-bit Sequence
   Numbers will not exceed 2^64 blocks even if all of the packets are
   maximum-length Jumbograms.
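
   The block counts behind this observation can be worked through as
   follows.  The sketch assumes a maximum Jumbogram payload of 2^32 - 1
   octets and ignores the few octets of ESP overhead per packet; the
   variable names are illustrative.

```python
AES_BLOCK_OCTETS = 16
BIRTHDAY_LIMIT_BLOCKS = 2**64

max_jumbogram_octets = 2**32 - 1
# Round up to whole 128-bit blocks: 2^28 blocks per packet.
blocks_per_packet = (max_jumbogram_octets + AES_BLOCK_OCTETS - 1) // AES_BLOCK_OCTETS

# 32-bit sequence numbers: at most 2^32 packets, hence 2^60 blocks.
worst_case_32bit = 2**32 * blocks_per_packet
assert worst_case_32bit == 2**60 < BIRTHDAY_LIMIT_BLOCKS

# 64-bit sequence numbers: 2^64 packets of only one block each
# already reach the limit.
smallest_case_64bit = 2**64 * 1
assert smallest_case_64bit >= BIRTHDAY_LIMIT_BLOCKS
```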

7. Design Rationale

   In the development of this specification, the use of the ESP sequence
   number field instead of an explicit IV field was considered.  This
   section documents the rationale for the selection of an explicit IV.
   This selection is not a cryptographic security issue, as either
   approach will prevent counter block collisions.

   The use of the explicit IV does not dictate the manner that the
   encryptor uses to assign the per-packet value in the counter block.
   This is desirable for several reasons.

      1.  Only the encryptor can ensure that the value is not used for
      more than one packet, so there is no advantage to selecting a
      mechanism that allows the decryptor to determine whether counter
      block values collide.  Damage from the collision is done, whether
      the decryptor detects it or not.

      2.  An explicit IV allows adders, LFSRs, and any other technique
      that meets the time budget of the encryptor, so long as the
      technique results in a unique value for each packet.  Adders are
      simple and straightforward to implement, but due to carries, they
      do not execute in constant time.  LFSRs offer an alternative that
      executes in constant time.

      3.  Complexity is in control of the implementer.  Further, the
      decision made by the implementer of the encryptor does not make
      the decryptor more (or less) complex.

      4.  The assignment of the per-packet counter block value needs to
      be inside the assurance boundary.  Some implementations assign the
      sequence number inside the assurance boundary, but others do not.
      A sequence number collision does not have dire consequences, but,
      as described in section 6, a collision in counter block values
      has disastrous consequences.

      5.  Coupling with the sequence number is possible in those
      architectures where the sequence number assignment is performed in
      the assurance boundary.  In this situation, the sequence number
      and the IV field will contain the same value.

      6.  Decoupling from the sequence number is possible in those
      architectures where the sequence number assignment is performed
      outside the assurance boundary.
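
   To illustrate the LFSR technique mentioned in item 2, the following
   sketch steps a Galois LFSR and confirms by enumeration that it
   visits every nonzero state before repeating.  A 16-bit register
   with the commonly published feedback mask 0xB400 is used only so
   that the full period can be checked quickly; it is an assumption of
   this example, not part of this specification.  A 64-bit IV would
   need a 64-bit register with a maximal-length feedback mask, which is
   not shown here.  The function names are hypothetical.

```python
def galois_lfsr_step(state: int, taps: int) -> int:
    """Advance a right-shifting Galois LFSR by one step."""
    lsb = state & 1
    # -lsb is 0 or -1, so the mask is applied only when the bit
    # shifted out was 1; no carry chain is involved.
    return (state >> 1) ^ (-lsb & taps)


def lfsr_period(seed: int, taps: int) -> int:
    """Count steps until the register returns to its nonzero seed."""
    if seed == 0:
        raise ValueError("the all-zero state never changes")
    state, steps = galois_lfsr_step(seed, taps), 1
    while state != seed:
        state = galois_lfsr_step(state, taps)
        steps += 1
    return steps


TAPS_16 = 0xB400
period = lfsr_period(0xACE1, TAPS_16)
```

   The period equals 2^16 - 1, so every nonzero 16-bit value is
   produced exactly once before the sequence repeats.  The all-zero
   state must be avoided as a seed because the register would never
   leave it.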

   The use of an explicit IV field directly follows from the decoupling
   of the sequence number and the per-packet counter block value.  The
   additional overhead (64 bits for the IV field) is acceptable.  This
   overhead is significantly less than the overhead associated with
   Cipher Block Chaining (CBC) mode.  As normally employed, CBC requires
   a full block for the IV and, on average, half of a block for padding.
   AES-CTR with an explicit IV has about one-third of the overhead of
   AES-CBC, and the overhead is constant for each packet.
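
   The one-third figure follows from simple arithmetic on the numbers
   given above.  The variable names are illustrative, and the small
   alignment padding that ESP requires in either mode is ignored.

```python
AES_BLOCK_OCTETS = 16

# AES-CBC: a full-block IV plus, on average, half a block of padding.
cbc_overhead = AES_BLOCK_OCTETS + AES_BLOCK_OCTETS // 2   # 24 octets

# AES-CTR with an explicit IV: the 64-bit IV field only.
ctr_overhead = 8                                          # 8 octets

assert ctr_overhead * 3 == cbc_overhead
```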

8. IANA Considerations

   IANA has assigned three ESP transform numbers for use with AES
   Counter Mode with an explicit IV, one for each AES key size:

      <TBD1> for AES-CTR with a 128 bit key;
      <TBD2> for AES-CTR with a 192 bit key; and
      <TBD3> for AES-CTR with a 256 bit key.

9. Acknowledgements

   This document is the result of extensive discussions and compromises.
   While not all of the participants are completely satisfied with the
   outcome, the document is better for their contributions.  The author
   thanks the members of the IPsec working group, with special mention
   of the efforts of (in alphabetical order): Steve Bellovin, Niels
   Ferguson, Steve Kent, David McGrew, Robert Moskowitz, Jesse Walker,
   and Doug Whiting.

10. References

   This section provides normative and informative references.

10.1. Normative References

   [AES]       NIST, FIPS PUB 197, "Advanced Encryption Standard
               (AES)," November 2001.

   [ESP]       Kent, S. and R. Atkinson, "IP Encapsulating Security
               Payload (ESP)," RFC 2406, November 1998.

   [MODES]     Dworkin, M., "Recommendation for Block Cipher Modes
               of Operation: Methods and Techniques," NIST Special
               Publication 800-38A, December 2001.

   [STDWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels," RFC 2119, March 1997.

10.2. Informative References

   [ARCH]      Kent, S. and R. Atkinson, "Security Architecture for
               the Internet Protocol," RFC 2401, November 1998.

   [BDJR]      Bellare, M., Desai, A., Jokipii, E., and P. Rogaway,
               "A Concrete Security Treatment of Symmetric Encryption:
               Analysis of the DES Modes of Operation," Proceedings
               38th Annual Symposium on Foundations of Computer
               Science, 1997.

   [CCM]       Whiting, D., Housley, R. and N. Ferguson, "AES
               Encryption & Authentication Using CTR Mode & CBC-MAC,"
               IEEE P802.11 doc 02/001r2, May 2002.

   [HMAC-SHA]  Madson, C. and R. Glenn, "The Use of HMAC-SHA-1-96
               within ESP and AH," RFC 2404, November 1998.

   [IKE]       Harkins, D. and D. Carrel, "The Internet Key Exchange
               (IKE)," RFC 2409, November 1998.

   [ROAD]      Thayer, R., N. Doraswamy and R. Glenn, "IP Security
               Document Roadmap," RFC 2411, November 1998.

11. Author's Address

   Russell Housley
   RSA Laboratories
   918 Spring Knoll Drive
   Herndon, VA 20170
   USA
   rhousley@rsasecurity.com

12. Full Copyright Statement

   Copyright (C) The Internet Society 2002.  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
