
A Remote Direct Memory Access Protocol Specification
draft-ietf-rddp-rdmap-07

The information below is for an old version of the document that is already published as an RFC.
Document Type
This is an older version of an Internet-Draft that was ultimately published as RFC 5040.
Authors Renato J. Recio, Paul R. Culley, Dave Garcia, Bernard Metzler, Jeff Hilland
Last updated 2015-10-14 (Latest revision 2006-09-10)
RFC stream Internet Engineering Task Force (IETF)
Intended RFC status Proposed Standard
Stream WG state (None)
Document shepherd (None)
IESG IESG state Became RFC 5040 (Proposed Standard)
Action Holders
(None)
Consensus boilerplate Unknown
Telechat date (None)
Responsible AD Lars Eggert
Send notices to hemal.shah@intel.com, jpink@microsoft.com
   Remote Direct Data Placement Work Group     R. Recio
   INTERNET DRAFT                              IBM Corporation
   draft-ietf-rddp-rdmap-07.txt                P. Culley
                                               Hewlett-Packard Company
                                               D. Garcia
                                               Hewlett-Packard Company
                                               J. Hilland
                                               Hewlett-Packard Company
                                               B. Metzler
                                               IBM Corporation

   Expires: February, 2007                     September 8, 2006
        

      A Remote Direct Memory Access Protocol Specification  

      Status of this Memo 

      By submitting this Internet-Draft, each author represents that any 
      applicable patent or other IPR claims of which he or she is aware 
      have been or will be disclosed, and any of which he or she becomes 
      aware will be disclosed, in accordance with Section 6 of BCP 79. 

      Internet-Drafts are working documents of the Internet Engineering 
      Task Force (IETF), its areas, and its working groups.  Note that 
      other groups may also distribute working documents as Internet-
      Drafts. 

      Internet-Drafts are draft documents valid for a maximum of six 
      months and may be updated, replaced, or obsoleted by other 
      documents at any time.  It is inappropriate to use Internet-Drafts 
      as reference material or to cite them other than as "work in 
      progress." 

      The list of current Internet-Drafts can be accessed at 
      http://www.ietf.org/1id-abstracts.html. The list of Internet-Draft 
      Shadow Directories can be accessed at 
      http://www.ietf.org/shadow.html. 

    
    
                          Expires February, 2007               [Page 1] 


   Internet-Draft        RDMA Protocol Specification     September 2006 
    
      Abstract 

      This document defines a Remote Direct Memory Access Protocol 
      (RDMAP) that operates over the Direct Data Placement Protocol (DDP 
      protocol).  RDMAP provides read and write services directly to 
      applications and enables data to be transferred directly into 
      Upper Layer Protocol (ULP) buffers without intermediate data 
      copies. It also enables a kernel bypass implementation. 

    
    
      Table of Contents 

      1    Introduction
      1.1  Architectural Goals
      1.2  Protocol Overview
      1.3  RDMAP Layering
      1.4  Specification Changes from the Last Version
      2    Glossary
      2.1  General
      2.2  LLP
      2.3  Direct Data Placement (DDP)
      2.4  Remote Direct Memory Access (RDMA)
      3    ULP and Transport Attributes
      3.1  Transport Requirements & Assumptions
      3.2  RDMAP Interactions with the ULP
      4    Header Format
      4.1  RDMAP Control and Invalidate STag Field
      4.2  RDMA Message Definitions
      4.3  RDMA Write Header
      4.4  RDMA Read Request Header
      4.5  RDMA Read Response Header
      4.6  Send Header and Send with Solicited Event Header
      4.7  Send with Invalidate Header and Send with SE and Invalidate
           Header
      4.8  Terminate Header
      5    Data Transfer
      5.1  RDMA Write Message
      5.2  RDMA Read Operation
      5.2.1  RDMA Read Request Message
      5.2.2  RDMA Read Response Message
      5.3  Send Message Type
      5.4  Terminate Message
      5.5  Ordering and Completions
      6    RDMAP Stream Management
      6.1  Stream Initialization
      6.2  Stream Teardown
      6.2.1  RDMAP Abortive Termination
      7    RDMAP Error Management
      7.1  RDMAP Error Surfacing
      7.2  Errors Detected at the Remote Peer on Incoming RDMA Messages
      8    Security Considerations
      8.1  Summary of RDMAP specific Security Requirements
      8.1.1  RDMAP (RNIC) Requirements
      8.1.2  Privileged Resource Manager Requirements
      8.2  Security Services for RDMAP
      8.2.1  Available Security Services
      8.2.2  Requirements for IPsec Services for RDMAP
      9    IANA
      10   References
      10.1   Normative References
      10.2   Informative References
      11   Appendix
      11.1   DDP Segment Formats for RDMA Messages
      11.1.1  DDP Segment for RDMA Write
      11.1.2  DDP Segment for RDMA Read Request
      11.1.3  DDP Segment for RDMA Read Response
      11.1.4  DDP Segment for Send and Send with Solicited Event
      11.1.5  DDP Segment for Send with Invalidate and Send with SE and
              Invalidate
      11.1.6  DDP Segment for Terminate
      11.2   Ordering and Completion Table
      12   Author's Address
      13   Contributors
      14   Intellectual Property Statement
      15   Full Copyright Statement
       

      Table of Figures 

      Figure 1 RDMAP Layering
      Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP
      Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields
      Figure 4 RDMA Usage of DDP Fields
      Figure 5 RDMA Message Definitions
      Figure 6 RDMA Read Request Header Format
      Figure 7 Terminate Header Format
      Figure 8 Terminate Control Field
      Figure 9 Terminate Control Field Values
      Figure 10 Error Type to RDMA Message Mapping
      Figure 11 RDMA Write, DDP Segment format
      Figure 12 RDMA Read Request, DDP Segment format
      Figure 13 RDMA Read Response, DDP Segment format
      Figure 14 Send and Send with Solicited Event, DDP Segment format
      Figure 15 Send with Invalidate and Send with SE and Invalidate,
                DDP Segment
      Figure 16 Terminate, DDP Segment format
      Figure 17 Operation Ordering
          

    
    
    1  Introduction 

      Today, communications over TCP/IP typically require copy 
      operations, which add latency and consume significant CPU and 
      memory resources.  The Remote Direct Memory Access Protocol 
      (RDMAP) enables removal of data copy operations and enables 
      reduction in latencies by allowing a local application to read or 
      write data on a remote computer's memory with minimal demands on 
      memory bus bandwidth and CPU processing overhead, while preserving 
      memory protection semantics.  

      RDMAP is layered on top of Direct Data Placement (DDP) and uses 
      the two Buffer Models available from DDP. DDP-related terminology 
      is discussed in Section 2.3. As RDMAP builds on DDP, the reader is 
      advised to become familiar with [DDP]. 

      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL 
      NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" 
      in this document are to be interpreted as described in RFC 2119. 

    
       

   1.1  Architectural Goals 

      RDMAP has been designed with the following high-level 
      architectural goals: 

      *  Provide a data transfer operation that allows a Local Peer to 
         transfer up to 2^32 - 1 octets directly into a previously 
         advertised buffer (i.e., Tagged buffer) located at a Remote 
         Peer without requiring a copy operation. This is referred to as 
         the RDMA Write data transfer operation. 

      *  Provide a data transfer operation that allows a Local Peer to 
         retrieve up to 2^32 - 1 octets directly from a previously 
         advertised buffer (i.e., Tagged buffer) located at a Remote 
         Peer without requiring a copy operation. This is referred to as 
         the RDMA Read data transfer operation. 

      *  Provide a data transfer operation that allows a Local Peer to 
         send up to 2^32 - 1 octets directly into a buffer located at a 
         Remote Peer that has not been explicitly advertised. This is 
         referred to as the Send (Send with Invalidate, Send with 
         Solicited Event, and Send with Solicited Event and Invalidate) 
         data transfer operation. 

      *  Enable the local ULP to use the Send Operation Type (includes 
         Send, Send with Invalidate, Send with Solicited Event, and Send 
         with Solicited Event and Invalidate) to signal to the remote 
         ULP the Completion of all previous Messages initiated by the 
         local ULP. 

      *  Provide for all Operations on a single RDMAP Stream to be 
         reliably transmitted in the order that they were submitted.  

      *  Provide RDMAP capabilities independently for each Stream when 
         the LLP supports multiple data Streams within an LLP 
         connection. 

   1.2  Protocol Overview 

      RDMAP provides seven data transfer operations. Except for the RDMA 
      Read operation, each operation generates exactly one RDMA Message. 
      Following is a brief overview of the RDMA Operations and RDMA 
      Messages: 

      1.  Send - A Send operation uses a Send Message to transfer data 
          from the Data Source into a buffer that has not been 
          explicitly Advertised by the Data Sink. The Send Message uses 
          the DDP Untagged Buffer Model to transfer the ULP Message into 
          the Data Sink's Untagged Buffer. 

      2.  Send with Invalidate - A Send with Invalidate operation uses a 
          Send with Invalidate Message to transfer data from the Data 
          Source into a buffer that has not been explicitly Advertised 
          by the Data Sink. The Send with Invalidate Message includes 
          all functionality of the Send Message, with one addition: an 
          STag field is included in the Send with Invalidate Message and 
          after the message has been Placed and Delivered at the Data 
          Sink the remote peer's buffer identified by the STag can no 
          longer be accessed remotely until the remote peer's ULP re-
          enables access and Advertises the buffer. 

      3.  Send with Solicited Event (Send with SE) - A Send with 
          Solicited Event operation uses a Send with Solicited Event 
          Message to transfer data from the Data Source into an Untagged 
          Buffer at the Data Sink. The Send with Solicited Event Message 
          is similar to the Send Message, with one addition: when the 
          Send with Solicited Event Message has been Placed and 
          Delivered, an Event may be generated at the recipient, if the 
          recipient is configured to generate such an Event. 

      4.  Send with Solicited Event and Invalidate (Send with SE and 
          Invalidate) - A Send with Solicited Event and Invalidate 
          operation uses a Send with Solicited Event and Invalidate 
          Message to transfer data from the Data Source into a buffer 
          that has not been explicitly Advertised by the Data Sink. The 
          Send with Solicited Event and Invalidate Message is similar to 
          the Send with Invalidate Message, with one addition: when the 
          Send with Solicited Event and Invalidate Message has been 
          Placed and Delivered, an Event may be generated at the 
          recipient, if the recipient is configured to generate such an 
          Event. 

      5.  Remote Direct Memory Access Write - An RDMA Write operation 
          uses an RDMA Write Message to transfer data from the Data 
          Source to a previously advertised buffer at the Data Sink.  

          The ULP at the Remote Peer, which in this case is the Data 
          Sink, enables the Data Sink Tagged Buffer for access and 
          Advertises the buffer's size (length), location (Tagged 
          Offset), and Steering Tag (STag) to the Data Source through a 
          ULP specific mechanism. The ULP at the Local Peer, which in 
          this case is the Data Source, initiates the RDMA Write 
          operation. The RDMA Write Message uses the DDP Tagged Buffer 
          Model to transfer the ULP Message into the Data Sink's Tagged 
          Buffer. Note: the STag associated with the Tagged Buffer 
          remains valid until the ULP at the Remote Peer invalidates it 
          or the ULP at the Local Peer invalidates it through a Send 
          with Invalidate or Send with Solicited Event and Invalidate. 

      6.  Remote Direct Memory Access Read - The RDMA Read operation 
          transfers data to a Tagged Buffer at the Local Peer, which in 
          this case is the Data Sink, from a Tagged Buffer at the Remote 
          Peer, which in this case is the Data Source. The ULP at the 
          Data Source enables the Data Source Tagged Buffer for access 
          and Advertises the buffer's size (length), location (Tagged 
          Offset), and Steering Tag (STag) to the Data Sink through a 
          ULP specific mechanism. The ULP at the Data Sink enables the 
          Data Sink Tagged Buffer for access and initiates the RDMA Read 
          operation. The RDMA Read operation consists of a single RDMA 
          Read Request Message and a single RDMA Read Response Message, 
          and the latter may be segmented into multiple DDP Segments.  

          The RDMA Read Request Message uses the DDP Untagged Buffer 
          Model to Deliver the STag, starting Tagged Offset and length 
          for both the Data Source and Data Sink Tagged Buffers to the 
          remote peer's RDMA Read Request Queue.  

          The RDMA Read Response Message uses the DDP Tagged Buffer 
          Model to Deliver the Data Source's Tagged Buffer to the Data 
          Sink, without any involvement from the ULP at the Data Source. 

          Note: the Data Source STag associated with the Tagged Buffer 
          remains valid until the ULP at the Data Source invalidates it 
          or the ULP at the Data Sink invalidates it through a Send with 
          Invalidate or Send with Solicited Event and Invalidate. The 
          Data Sink STag associated with the Tagged Buffer remains valid 
          until the ULP at the Data Sink invalidates it. 

      7.  Terminate - A Terminate operation uses a Terminate Message to 
          transfer to the Remote Peer information associated with an 
          error that occurred at the Local Peer. The Terminate Message 
          uses the DDP Untagged Buffer Model to transfer the Message 
          into the Data Sink's Untagged Buffer. 
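
      The overview above can be condensed into a small lookup table.
      The following is an illustrative sketch only, not part of the
      protocol: the Python names (BufferModel, MESSAGE_BUFFER_MODEL,
      OPERATION_MESSAGES, buffer_models) are hypothetical, and the
      table records nothing beyond which RDMA Messages each operation
      generates and which DDP Buffer Model each Message uses.

```python
from enum import Enum


class BufferModel(Enum):
    UNTAGGED = "untagged"
    TAGGED = "tagged"


# RDMA Message -> DDP Buffer Model, as described in the overview text.
MESSAGE_BUFFER_MODEL = {
    "Send": BufferModel.UNTAGGED,
    "Send with Invalidate": BufferModel.UNTAGGED,
    "Send with Solicited Event": BufferModel.UNTAGGED,
    "Send with Solicited Event and Invalidate": BufferModel.UNTAGGED,
    "RDMA Write": BufferModel.TAGGED,
    "RDMA Read Request": BufferModel.UNTAGGED,
    "RDMA Read Response": BufferModel.TAGGED,
    "Terminate": BufferModel.UNTAGGED,
}

# RDMA Operation -> RDMA Messages it generates, in order.
# Only the RDMA Read operation generates more than one Message.
OPERATION_MESSAGES = {
    "Send": ["Send"],
    "Send with Invalidate": ["Send with Invalidate"],
    "Send with Solicited Event": ["Send with Solicited Event"],
    "Send with Solicited Event and Invalidate": [
        "Send with Solicited Event and Invalidate"
    ],
    "RDMA Write": ["RDMA Write"],
    "RDMA Read": ["RDMA Read Request", "RDMA Read Response"],
    "Terminate": ["Terminate"],
}


def buffer_models(operation):
    """Return the DDP Buffer Model used by each Message of an operation."""
    return [MESSAGE_BUFFER_MODEL[m] for m in OPERATION_MESSAGES[operation]]
```

      For example, buffer_models("RDMA Read") yields the Untagged model
      for the Read Request followed by the Tagged model for the Read
      Response.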

       

       

       

       

       

       

       

    
    
   1.3  RDMAP Layering 

      RDMAP is dependent on DDP, subject to the requirements defined in 
      Section 3.1, "Transport Requirements & Assumptions".  Figure 1 
      depicts the relationship between Upper Layer Protocols (ULPs), 
      RDMAP, the DDP protocol, the framing layer, and the transport. 
      For the protocol definition of each LLP, see [MPA], [TCP], and 
      [SCTP].  

                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                    |                                     | 
                    |     Upper Layer Protocol (ULP)      | 
                    |                                     | 
                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                    |                                     | 
                    |              RDMAP                  | 
                    |                                     | 
                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                    |                                     | 
                    |           DDP protocol              | 
                    |                                     | 
                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                    |                 |                   | 
                    |       MPA       |                   | 
                    |                 |                   | 
                    +-+-+-+-+-+-+-+-+-+       SCTP        | 
                    |                 |                   | 
                    |       TCP       |                   | 
                    |                 |                   | 
                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 1 RDMAP Layering 

      If RDMAP is layered over DDP/MPA/TCP, then the respective headers 
      and ULP Payload are arranged as follows (Note: For clarity, MPA 
      header and CRC fields are included but MPA markers are not shown): 

    
    
        0                   1                   2                   3 
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                                                               | 
       //                           TCP Header                        // 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |         MPA Header            |                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               + 
       |                                                               | 
       //                        DDP Header                           // 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                                                               | 
       //                        RDMA Header                          // 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                                                               | 
       //                        ULP Payload                          // 
       //                 (shown with no pad bytes)                   // 
       //                                                             // 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                           MPA CRC                             | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP 
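
      As a rough illustration of the arrangement in Figure 2, the
      sketch below computes the offset and length of each region that
      follows the TCP header. It is not a definitive implementation.
      The 2-octet MPA header and 4-octet MPA CRC are read off the
      figure; the rule that pad octets bring the framed unit to a
      multiple of four octets is an assumption taken from [MPA], not
      from this section. The function name and the DDP and RDMA header
      lengths passed in are hypothetical placeholders.

```python
MPA_HEADER_LEN = 2  # 16-bit MPA header, as drawn in Figure 2
MPA_CRC_LEN = 4     # 32-bit MPA CRC, as drawn in Figure 2


def fpdu_layout(ddp_header_len, rdma_header_len, payload_len):
    """Return (name, offset, length) tuples for the regions after the TCP header.

    Header lengths are supplied by the caller; this sketch does not
    define them.
    """
    regions = [
        ("MPA Header", MPA_HEADER_LEN),
        ("DDP Header", ddp_header_len),
        ("RDMA Header", rdma_header_len),
        ("ULP Payload", payload_len),
    ]
    unpadded = sum(length for _, length in regions)
    # Assumption (from [MPA]): pad with 0-3 octets so that the unit,
    # including the 4-octet CRC, is a multiple of four octets long.
    regions.append(("Pad", (-unpadded) % 4))
    regions.append(("MPA CRC", MPA_CRC_LEN))

    layout, offset = [], 0
    for name, length in regions:
        layout.append((name, offset, length))
        offset += length
    return layout
```

      With placeholder header lengths of 14 and 0 octets and a 5-octet
      payload, the sketch reports three pad octets before the CRC;
      Figure 2 corresponds to the case in which the pad length is zero.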

   1.4  Specification Changes from the Last Version 

      This section is to be removed before RFC publication. 

      The following major changes (vs typos) were made to the -06 and 
      -07 versions: 

      *  Incorporated comments from Transport Area Directors and the 
         Remote Direct Data Placement Working Group chair. 

       

      The following major changes (vs typos) were made to the -05 
      version: 

    
    
      *  To pass the IETF checklist tool, modified heading of Security 
         Section 8 to "Security" and added "Security Considerations" 
         below it. 

      *  Added IANA Section 9 and to pass the IETF checklist tool added 
         "IANA Considerations" line below Section 9 header. 

      *  Added Intellectual Property Statement Section 14 and IPR 
         Disclosure Acknowledgement Section 15. 

      *  Added Disclaimer Section 16. 

      *  Section 6.8 - Acknowledged that the Reserved field size for the 
         Terminate Message is 13 bits. The fix was made to the -04 
         version, but was not listed in this section. 

      *  Rewrite of the "Security" section to refer to Security document 
         rather than summarize. 

      *  Update to the "Contributors" section. 

      *  Changed boilerplate reference from 3667 to 3979. 

      *  Removed references to company names in the disclaimer section. 

      *  Added "Key Words" Disclaimer to the Introduction. 

       

      The following major changes (vs typos) were made to the -04 
      version: 

      *  Section 10 - Expanded IPsec requirements sentence in section 
         10.3.2 to say what is required in addition to cross-referencing 
         RFC 3723. 

      *  Section 6.8 - Fixed text after Figure 9 to reflect the correct 
         size (13 bits) of the Reserved field in the Terminate Message. 

      The following major changes (vs typos) were made to the -03 
      version: 

    
    
      *  Section 6.1 - Added normative text describing downward 
         compatibility with version 0. 

      *  Section 6.8 - Changed the description of the reserved field 
         size to match the size in the figure, which is 13 bits. 

      *  Section 10 - Aligned security section closely to [RDMASEC] and 
         added normative text for security requirements. 

      The following major changes (vs typos) were made to the -02 
      version: 

      *  Section 6.8 - Explicitly defined the bit numbers for the three 
         header control bits. 

      *  Section 8.1 - Stated the typical Stream initialization to be: 
         RDMA mode is entered some time after the LLP Stream is 
         initialized. 

      *  Section 10 - Update reference to security document. 

      *  Section 10 - Fixed Send with Solicited Event and Invalidate 
         reference. 

      *  Section 12.1 - MPA and DDP references were changed to reflect 
         the released specifications and accurate titles. 

      *  Section 12.1 - Reference for RDMA Protocol Verbs was changed to 
         reflect the released specification and accurate title. 

    
    
    2  Glossary 

   2.1 General 

      Advertisement (Advertised, Advertise, Advertisements, Advertises) 
          - the act of informing a Remote Peer that a local RDMA Buffer 
          is available to it. A Node makes available an RDMA Buffer for 
          incoming RDMA Read or RDMA Write access by informing its 
          RDMA/DDP peer of the Tagged Buffer identifiers (STag, base 
          address, and buffer length). This Advertisement of Tagged 
          Buffer information is not defined by RDMA/DDP and is left to 
          the ULP. A typical method would be for the Local Peer to embed 
          the Tagged Buffer's Steering Tag, base address, and length in 
          a Send Message destined for the Remote Peer. 
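
      Because the Advertisement format is left to the ULP, any
      concrete encoding is ULP-specific. The sketch below shows one
      hypothetical encoding that a ULP might carry in a Send Message
      payload: a 32-bit STag, a 64-bit base address, and a 32-bit
      length in network byte order. The field order, the field widths
      chosen here, and the function names are assumptions made for
      illustration only.

```python
import struct

# Hypothetical ULP-defined layout: STag (32 bits), base address
# (64 bits), buffer length (32 bits), network byte order, no padding.
ADVERTISEMENT = struct.Struct("!IQI")


def pack_advertisement(stag, base_address, length):
    """Encode Tagged Buffer identifiers for the payload of a Send Message."""
    return ADVERTISEMENT.pack(stag, base_address, length)


def unpack_advertisement(payload):
    """Decode an Advertisement received in a Send Message payload."""
    stag, base_address, length = ADVERTISEMENT.unpack(payload)
    return {"stag": stag, "base_address": base_address, "length": length}
```

      The Remote Peer would decode the payload with
      unpack_advertisement() and use the resulting identifiers as the
      target of a later RDMA Read or RDMA Write.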

      Completion - Refer to "RDMA Completion" in Section 2.4. 

      Completed - See "RDMA Completion" in Section 2.4. 

      Complete - See "RDMA Completion" in Section 2.4. 

      Completes - See "RDMA Completion" in Section 2.4. 

      Data Sink - The peer receiving a data payload. Note that the Data 
          Sink can be required to both send and receive RDMA/DDP 
          Messages to transfer a data payload. 

      Data Source - The peer sending a data payload. Note that the Data 
          Source can be required to both send and receive RDMA/DDP 
          Messages to transfer a data payload. 

      Data Delivery (Delivery, Delivered, Delivers) - Delivery is 
          defined as the process of informing the ULP or consumer that a 
          particular Message is available for use.  This is specifically 
          different from "Placement", which may generally occur in any 
          order, while the order of "Delivery" is strictly defined. See 
          "Data Placement" in Section 2.3. 
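
      The distinction between Placement and Delivery can be sketched
      as follows. This is a simplified model under stated assumptions,
      not an implementation of DDP: the Receiver class, its method
      names, and the use of a simple per-Message number are
      hypothetical, and duplicate or overlapping segments are not
      handled.

```python
class Receiver:
    """Places segments in any order; Delivers Messages strictly in order."""

    def __init__(self):
        self.buffers = {}         # message number -> buffer being filled
        self.remaining = {}       # message number -> octets not yet Placed
        self.next_to_deliver = 0  # lowest message number not yet Delivered
        self.delivered = []       # message numbers, in Delivery order

    def expect(self, msg_no, total_len):
        """Register a buffer for an incoming Message of known length."""
        self.buffers[msg_no] = bytearray(total_len)
        self.remaining[msg_no] = total_len

    def place(self, msg_no, offset, data):
        """Placement: write the payload where the segment says, in any order."""
        self.buffers[msg_no][offset:offset + len(data)] = data
        self.remaining[msg_no] -= len(data)
        self._deliver_ready()

    def _deliver_ready(self):
        """Delivery: inform the ULP only once all earlier Messages are Delivered."""
        while self.remaining.get(self.next_to_deliver) == 0:
            self.delivered.append(self.next_to_deliver)
            self.next_to_deliver += 1
```

      In this model, a fully Placed Message 1 is not Delivered until
      Message 0 has also been fully Placed and Delivered.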

      Delivery - See Data Delivery in Section 2.1. 

      Delivered - See Data Delivery in Section 2.1. 

      Delivers - See Data Delivery in Section 2.1. 

      Fabric - The collection of links, switches, and routers that 
          connect a set of Nodes with RDMA/DDP protocol implementations. 

      Fence (Fenced, Fences) - To block the current RDMA Operation from 
          executing until prior RDMA Operations have Completed.  
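
      One way to picture a Fence is as a flag on a queued operation
      that holds it back while any earlier operation is still
      outstanding. The sketch below is a minimal model under that
      assumption; the SendQueue class and its methods are hypothetical
      and are not defined by this specification.

```python
from collections import deque


class SendQueue:
    """Starts operations in submission order, honoring Fenced operations."""

    def __init__(self):
        self.pending = deque()  # (name, fenced) not yet started
        self.in_flight = set()  # started but not yet Completed

    def post(self, name, fenced=False):
        """Submit an operation; fenced=True marks it as Fenced."""
        self.pending.append((name, fenced))

    def start_ready(self):
        """Start operations in order, stopping at a Fenced operation
        while any prior operation has not yet Completed."""
        started = []
        while self.pending:
            name, fenced = self.pending[0]
            if fenced and self.in_flight:
                break
            self.pending.popleft()
            self.in_flight.add(name)
            started.append(name)
        return started

    def complete(self, name):
        """Record that a previously started operation has Completed."""
        self.in_flight.remove(name)
```

      Here a Fenced operation posted after an RDMA Read does not start
      until the Read has Completed, and operations queued behind the
      Fenced operation wait with it.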

      iWARP - A suite of wire protocols comprised of RDMAP, DDP, and 
          MPA. The iWARP protocol suite may be layered above TCP, SCTP, 
          or other transport protocols.  

      Local Peer - The RDMA/DDP protocol implementation on the local end 
          of the connection. Used to refer to the local entity when 
          describing a protocol exchange or other interaction between 
          two Nodes. 

      Node - A computing device attached to one or more links of a 
          Fabric (network). A Node in this context does not refer to a 
          specific application or protocol instantiation running on the 
          computer. A Node may consist of one or more RNICs installed in 
          a host computer. 

      Placement - See "Data Placement" in Section 2.3. 

      Placed - See "Data Placement" in Section 2.3. 

      Places - See "Data Placement" in Section 2.3. 

      Remote Peer - The RDMA/DDP protocol implementation on the opposite 
          end of the connection. Used to refer to the remote entity when 
          describing protocol exchanges or other interactions between 
          two Nodes. 

      RNIC - RDMA Network Interface Controller. In this context, this 
          would be a network I/O adapter or embedded controller with 
          iWARP and Verbs functionality. 

      RNIC Interface (RI) - The presentation of the RNIC to the Verbs 
          Consumer as implemented through the combination of the RNIC 
          and the RNIC driver. 

      Termination - See "RDMAP Abortive Termination" in Section 2.4. 

      Terminated - See "RDMAP Abortive Termination" in Section 2.4. 
    
    

      Terminate - See "RDMAP Abortive Termination" in Section 2.4. 

      Terminates - See "RDMAP Abortive Termination" in Section 2.4. 

      ULP - Upper Layer Protocol. The protocol layer above the protocol 
          layer currently being referenced. The ULP for RDMA/DDP is 
          expected to be an OS, Application, adaptation layer, or 
          proprietary device.  The RDMA/DDP documents do not specify a 
          ULP - they provide a set of semantics that allow a ULP to be 
          designed to utilize RDMA/DDP. 

      ULP Payload - The ULP data that is contained within a single 
          protocol segment or packet (e.g., a DDP Segment). 

      Verbs - An abstract description of the functionality of an RNIC 
          Interface. The OS may expose some or all of this functionality 
          via one or more APIs to applications. The OS will also use 
          some of the functionality to manage the RNIC Interface. 

   2.2 LLP 

      LLP - Lower Layer Protocol. The protocol layer beneath the 
          protocol layer currently being referenced. For example, for 
          DDP the LLP is SCTP, MPA, or other transport protocols. For 
          RDMA, the LLP is DDP. 

      LLP Connection - Corresponds to an LLP transport-level connection 
          between the peer LLP layers on two Nodes.

      LLP Stream - Corresponds to a single LLP transport-level Stream 
          between the peer LLP layers on two Nodes. One or more LLP 
          Streams may map to a single transport-level LLP connection. 
          For transport protocols that support multiple Streams per 
          connection (e.g., SCTP), an LLP Stream corresponds to one 
          transport-level Stream. 

      MULPDU - Maximum ULPDU. The current maximum size of the record 
          that is acceptable for DDP to pass to the LLP for 
          transmission.
    
      ULPDU - Upper Layer Protocol Data Unit.  The data record defined 
          by the layer above MPA. 

   2.3 Direct Data Placement (DDP) 

      Data Placement (Placement, Placed, Places) - For DDP, this term is 
          specifically used to indicate the process of writing to a data 
          buffer by a DDP implementation.  DDP Segments carry Placement 
          information, which may be used by the receiving DDP 
          implementation to perform Data Placement of the DDP Segment 
          ULP Payload. See "Data Delivery". 

      DDP Abortive Teardown - The act of closing a DDP Stream without 
          attempting to Complete in-progress and pending DDP Messages. 

      DDP Graceful Teardown - The act of closing a DDP Stream such that 
          all in-progress and pending DDP Messages are allowed to 
          Complete successfully. 

      DDP Control Field - A fixed 16-bit field in the DDP Header. The 
          DDP Control Field contains an 8-bit field whose contents are 
          reserved for use by the ULP.  

      DDP Header - The header present in all DDP segments. The DDP 
          Header contains control and Placement fields that are used to 
          define the final Placement location for the ULP payload 
          carried in a DDP Segment. 

      DDP Message - A ULP defined unit of data interchange, which is 
          subdivided into one or more DDP segments. This segmentation 
          may occur for a variety of reasons, including segmentation to 
          respect the maximum segment size of the underlying transport 
          protocol. 

      DDP Segment - The smallest unit of data transfer for the DDP 
          protocol. It includes a DDP Header and ULP Payload (if 
          present). A DDP Segment should be sized to fit within the 
          underlying transport protocol MULPDU. 
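
      As a non-normative illustration of the segmentation described 
      above, the sketch below splits a DDP Message payload into chunks 
      that, together with a DDP Header, fit within the MULPDU. The 
      function name, the header_len parameter, and the sample sizes are 
      assumptions made for this example and are not defined by this 
      specification.

```python
def segment_message(payload: bytes, mulpdu: int, header_len: int):
    """Split a DDP Message payload into (offset, chunk) pairs.

    Each chunk plus header_len octets of DDP Header fits within mulpdu.
    header_len is supplied by the caller; it is an assumption of this
    sketch, not a value taken from the text above.
    """
    max_payload = mulpdu - header_len
    if max_payload <= 0:
        raise ValueError("MULPDU too small to carry any ULP Payload")
    if not payload:
        # A DDP Message is subdivided into one or more DDP Segments, so a
        # Message without ULP Payload still produces a single segment.
        return [(0, b"")]
    return [
        (offset, payload[offset:offset + max_payload])
        for offset in range(0, len(payload), max_payload)
    ]


# Hypothetical sizes chosen only to make the arithmetic easy to follow.
segments = segment_message(b"x" * 2500, mulpdu=1024, header_len=24)
```

      With these sample values each segment carries at most 1000 octets 
      of ULP Payload, so the 2500-octet Message yields three segments.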

      DDP Stream - A sequence of DDP Messages whose ordering is defined 
          by the LLP. For SCTP, a DDP Stream maps directly to an SCTP 
          Stream. For MPA, a DDP Stream maps directly to a TCP 
          connection and a single DDP Stream is supported.  Note that 
          DDP has no ordering guarantees between DDP Streams. 

      Direct Data Placement  - A mechanism whereby ULP data contained 
          within DDP Segments may be Placed directly into its final 
          destination in memory without processing of the ULP. This may 
          occur even when the DDP Segments arrive out of order. Out of 
          order Placement support may require the Data Sink to implement 
          the LLP and DDP as one functional block. 

      Direct Data Placement Protocol (DDP) - A wire protocol that 
          supports Direct Data Placement by associating explicit memory 
          buffer placement information with the LLP payload units. 

      Message Offset (MO) - For the DDP Untagged Buffer Model, specifies 
          the offset, in bytes, from the start of a DDP Message. 

      Message Sequence Number (MSN) - For the DDP Untagged Buffer Model, 
          specifies a sequence number that increases with each DDP 
          Message. 

      Queue Number (QN) - For the DDP Untagged Buffer Model, identifies 
          a destination Data Sink queue for a DDP Segment. 

      Steering Tag - An identifier of a Tagged Buffer on a Node, valid 
          as defined within a protocol specification. 

      STag - Steering Tag 

      Tagged Buffer - A buffer that is explicitly Advertised to the 
          Remote Peer through exchange of an STag, Tagged Offset, and 
          length.  

      Tagged Buffer Model - A DDP data transfer model used to transfer 
          Tagged Buffers from the Local Peer to the Remote Peer. 

      Tagged DDP Message - A DDP Message that targets a Tagged Buffer. 

      Tagged Offset (TO) - The offset within a Tagged Buffer on a Node.  

      Untagged Buffer - A buffer that is not explicitly Advertised to 
          the Remote Peer. Untagged Buffers support one of the two 
          available data transfer mechanisms, called the Untagged Buffer 
          Model. An Untagged Buffer is used to send asynchronous control 
          messages to the Remote Peer for RDMA Read, Send, and Terminate 
          requests. Untagged Buffers handle Untagged DDP Messages. 

      Untagged Buffer Model - A DDP data transfer model used to transfer 
          Untagged Buffers from the Local Peer to the Remote Peer. 

      Untagged DDP Message - A DDP Message that targets an Untagged 
          Buffer. 

   2.4 Remote Direct Memory Access (RDMA)  

      Event - An indication provided by the RDMAP Layer to the ULP to 
          indicate a Completion or other condition requiring immediate 
          attention. 

      Invalidate STag - A mechanism used to prevent the Remote Peer from 
          reusing a previous explicitly Advertised STag, until the Local 
          Peer makes it available through a subsequent explicit 
          Advertisement. The STag cannot be accessed remotely until it 
          is explicitly Advertised again. 

      RDMA Completion (Completion, Completed, Complete, Completes) - For 
          RDMA, Completion is defined as the process of informing the 
          ULP that a particular RDMA Operation has performed all 
          functions specified for that RDMA Operation, including 
          Placement and Delivery.  The Completion semantic of each RDMA 
          Operation is distinctly defined. 

      RDMA Message - A data transfer mechanism used to fulfill an RDMA 
          Operation. 

      RDMA Operation - A sequence of RDMA Messages, including control 
          Messages, to transfer data from a Data Source to a Data Sink. 
          The following RDMA Operations are defined: RDMA Write, RDMA 
          Read, Send, Send with Invalidate, Send with Solicited Event, 
          Send with Solicited Event and Invalidate, and Terminate. 

      RDMA Protocol (RDMAP) - A wire protocol that supports RDMA 
          Operations to transfer ULP data between a Local Peer and the 
          Remote Peer.
    
      RDMAP Abortive Termination (Termination, Terminated, Terminate, 
          Terminates) - The act of closing an RDMAP Stream without 
          attempting to Complete in-progress and pending RDMA 
          Operations. 

      RDMAP Graceful Termination - The act of closing an RDMAP Stream 
          such that all in-progress and pending RDMA Operations are 
          allowed to Complete successfully. 

      RDMA Read - An RDMA Operation used by the Data Sink to transfer 
          the contents of a source RDMA buffer from the Remote Peer to 
          the Local Peer. An RDMA Read operation consists of a single 
          RDMA Read Request Message and a single RDMA Read Response 
          Message. 

      RDMA Read Request - An RDMA Message used by the Data Sink to 
          request the Data Source to transfer the contents of an RDMA 
          buffer. The RDMA Read Request Message describes both the Data 
          Source and Data Sink RDMA buffers. 

      RDMA Read Request Queue - The queue used for processing RDMA Read 
          Requests. The RDMA Read Request Queue has a DDP Queue Number 
          of 1. 

      RDMA Read Response - An RDMA Message used by the Data Source to 
          transfer the contents of an RDMA buffer to the Data Sink, in 
          response to an RDMA Read Request. The RDMA Read Response 
          Message only describes the Data Sink RDMA buffer. 

      RDMAP Stream - An association between a pair of RDMAP 
          implementations, possibly on different Nodes, which transfer 
          ULP data using RDMA Operations. There may be multiple RDMAP 
          Streams on a single Node. An RDMAP Stream maps directly to a 
          single DDP Stream. 

      RDMA Write - An RDMA Operation that transfers the contents of a 
          source RDMA Buffer from the Local Peer to a destination RDMA 
          Buffer at the Remote Peer using RDMA. The RDMA Write Message 
          only describes the Data Sink RDMA buffer. 

      Remote Direct Memory Access (RDMA) - A method of accessing memory 
          on a remote system in which the local system specifies the 
          remote location of the data to be transferred. Employing an 
          RNIC in the remote system allows the access to take place 
          without interrupting the processing of the CPU(s) on the 
          system. 

      Send - An RDMA Operation that transfers the contents of a ULP 
          Buffer from the Local Peer to an Untagged Buffer at the Remote 
          Peer. 

      Send Message Type - A Send Message, Send with Invalidate Message, 
          Send with Solicited Event Message, or Send with Solicited 
          Event and Invalidate Message. 

      Send Operation Type - A Send Operation, Send with Invalidate 
          Operation, Send with Solicited Event Operation, or Send with 
          Solicited Event and Invalidate Operation. 

      Solicited Event (SE) - A facility by which an RDMA Operation 
          sender may cause an Event to be generated at the recipient, if 
          the recipient is configured to generate such an Event, when a 
          Send with Solicited Event or Send with Solicited Event and 
          Invalidate Message is received.  Note: The Local Peer's ULP 
          can use the Solicited Event mechanism to ensure that Messages 
          designated as important to the ULP are handled in an 
          expeditious manner by the Remote Peer's ULP. The ULP at the 
          Local Peer can indicate a given Send Message Type is important 
          by using the Send with Solicited Event Message or Send with 
          Solicited Event and Invalidate Message. The ULP at the Remote 
          Peer can choose to only be notified when valid Send with 
          Solicited Event Messages and/or Send with Solicited Event and 
          Invalidate Messages arrive and handle other valid incoming 
          Send Messages or Send with Invalidate Messages at its leisure. 
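
      The note above can be illustrated with a small, non-normative 
      sketch of the receiving side. The solicited_only flag is a 
      hypothetical receiver setting invented for this example; the 
      opcode values are those listed in Figure 4.

```python
# Opcode values for the four Send Message Types, as listed in Figure 4.
SEND = 0b0011
SEND_INVALIDATE = 0b0100
SEND_SE = 0b0101
SEND_SE_INVALIDATE = 0b0110

SEND_MESSAGE_TYPES = {SEND, SEND_INVALIDATE, SEND_SE, SEND_SE_INVALIDATE}
SOLICITED_TYPES = {SEND_SE, SEND_SE_INVALIDATE}


def raises_event(opcode: int, solicited_only: bool) -> bool:
    """Decide whether a valid incoming Send Message Type raises an Event.

    solicited_only models a receiver that asked to be notified only for
    Messages carrying the Solicited Event indication (an assumption of
    this sketch, not an interface defined by the specification).
    """
    if opcode not in SEND_MESSAGE_TYPES:
        raise ValueError("not a Send Message Type")
    return opcode in SOLICITED_TYPES or not solicited_only
```

      A receiver created with solicited_only set would be notified for 
      SEND_SE and SEND_SE_INVALIDATE and could process plain Send 
      Messages later.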

      Terminate - An RDMA Message used by a Node to pass an error 
          indication to the peer Node on an RDMAP Stream. This operation 
          is for RDMAP use only. 

      ULP Buffer - A buffer owned above the RDMAP Layer and advertised 
          to the RDMAP Layer either as a Tagged Buffer or an Untagged 
          ULP Buffer. 

      ULP Message - The ULP data that is handed to a specific protocol 
          layer for transmission. Data boundaries are preserved as they 
          are transmitted through iWARP.
    
    3  ULP and Transport Attributes 

   3.1  Transport Requirements & Assumptions 

      RDMAP MUST be layered on top of the Direct Data Placement Protocol 
      [DDP]. 

      RDMAP requires the following DDP support: 

      *  RDMAP uses three queues for Untagged Buffers: 

          *   Queue Number 0 (used by RDMAP for Send, Send with 
              Invalidate, Send with Solicited Event, and Send with 
              Solicited Event and Invalidate operations). 

          *   Queue Number 1 (used by RDMAP for RDMA Read operations). 

          *   Queue Number 2 (used by RDMAP for Terminate operations). 

      *  DDP maps a single RDMA Message to a single DDP Message. 

      *  DDP uses the STag and Tagged Offset provided by the RDMAP for 
         Tagged Buffer Messages (i.e., RDMA Write and RDMA Read 
         Response). 

      *  When the DDP layer Delivers an Untagged DDP Message to the 
         RDMAP layer, DDP provides the length of the DDP Message. This 
         ensures that RDMAP does not have to carry a length field in its 
         header. 

      *  When the RDMAP layer provides an RDMA Message to the DDP Layer, 
         DDP must insert the RsvdULP field value provided by the RDMAP 
         Layer into the associated DDP Message. 

      *  When the DDP layer Delivers a DDP Message to the RDMAP layer, 
         DDP provides the RsvdULP field. 

      *  The RsvdULP field must be 1 octet for DDP Tagged Messages and 5 
         octets for DDP Untagged Messages. 

      *  DDP propagates to RDMAP all operation or protection errors 
         (used by RDMAP Terminate) and, when appropriate, the DDP Header 
         fields of the DDP Segment that encountered the error.
    
      *  If an RDMA Operation is aborted by DDP or a lower layer, the 
         contents of the Data Sink buffers associated with the operation 
         are considered indeterminate. 

      *  DDP, in conjunction with the lower layers, provides reliable, 
         in-order Delivery.

   3.2  RDMAP Interactions with the ULP 

      RDMAP provides the ULP with access to the following RDMA 
      Operations as defined in this specification: 

      *  Send 

      *  Send with Solicited Event 

      *  Send with Invalidate 

      *  Send with Solicited Event and Invalidate 

      *  RDMA Write 

      *  RDMA Read 

      For Send Operation Types, the following are the interactions 
      between the RDMAP Layer and the ULP: 

      *  At the Data Source: 

          *   The ULP passes to the RDMAP Layer the following: 

              *   ULP Message Length 

              *   ULP Message 

              *   An indication of the Send Operation Type, where the 
                  valid types are: Send, Send with Solicited Event, Send 
                  with Invalidate, or Send with Solicited Event and 
                  Invalidate. 

              *   An Invalidate STag, if the Send Operation Type was 
                  Send with Invalidate or Send with Solicited Event and 
                  Invalidate.
    
          *   When the Send Operation Type Completes, an indication of 
              the Completion results.  

      *  At the Data Sink: 

          *   If the Send Operation Type Completed successfully, the 
              RDMAP Layer passes the following information to the ULP 
              Layer: 

              *   ULP Message Length 

              *   ULP Message 

              *   An Event, if the Data Sink is configured to generate 
                  an Event. 

              *   An Invalidated STag, if the Send Operation Type was 
                  Send with Invalidate or Send with Solicited Event and 
                  Invalidate. 

          *   If the Send Operation Type Completed in error, the Data 
              Sink RDMAP Layer will pass up the corresponding error 
              information to the Data Sink ULP and send a Terminate 
              Message to the Data Source RDMAP Layer. The Data Source 
              RDMAP Layer will then pass up the Terminate Message to the 
              ULP. 
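
      The Data Source inputs listed above for Send Operation Types can 
      be sketched, non-normatively, as a small record with a 
      consistency check. The class and field names are hypothetical and 
      chosen for this example only; the rule enforced is the one stated 
      above, namely that an Invalidate STag accompanies the two 
      invalidating Send Operation Types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SendType(Enum):
    SEND = "Send"
    SEND_SE = "Send with Solicited Event"
    SEND_INV = "Send with Invalidate"
    SEND_SE_INV = "Send with Solicited Event and Invalidate"


INVALIDATING = {SendType.SEND_INV, SendType.SEND_SE_INV}


@dataclass(frozen=True)
class SendRequest:
    """What the ULP hands to the RDMAP Layer at the Data Source."""
    message: bytes                         # ULP Message
    send_type: SendType = SendType.SEND    # Send Operation Type
    invalidate_stag: Optional[int] = None  # only for invalidating types

    def __post_init__(self):
        needs_stag = self.send_type in INVALIDATING
        if needs_stag and self.invalidate_stag is None:
            raise ValueError("this Send Operation Type needs an STag")
        if not needs_stag and self.invalidate_stag is not None:
            raise ValueError("this Send Operation Type takes no STag")
        if needs_stag and not 0 <= self.invalidate_stag <= 0xFFFFFFFF:
            raise ValueError("STag must fit in 32 bits")

    @property
    def length(self) -> int:
        """ULP Message Length, derived from the message itself."""
        return len(self.message)


request = SendRequest(b"hello", SendType.SEND_INV, invalidate_stag=0x1234)
```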

      For RDMA Write Operations, the following are the interactions 
      between the RDMAP Layer and the ULP: 

      *  At the Data Source: 

          *   The ULP passes to the RDMAP Layer the following: 

              *   ULP Message Length 

              *   ULP Message 

              *   Data Sink STag 

              *   Data Sink Tagged Offset
    
          *   When the RDMA Write Operation Completes, an indication of 
              the Completion results. 

      *  At the Data Sink: 

          *   If the RDMA Write completed successfully, the RDMAP Layer 
              does not Deliver the RDMA Write to the ULP. It does Place 
              the ULP Message transferred through the RDMA Write Message 
              into the ULP Buffer. 

          *   If the RDMA Write completed in error, the Data Sink RDMAP 
              Layer will pass up the corresponding error information to 
              the Data Sink ULP and send a Terminate Message to the Data 
              Source RDMAP Layer. The Data Source RDMAP Layer will then 
              pass up the Terminate Message to the ULP. 

      For RDMA Read Operations, the following are the interactions 
      between the RDMAP Layer and the ULP: 

      *  At the Data Sink: 

          *   The ULP passes to the RDMAP Layer the following: 

              *   ULP Message Length 

              *   Data Source STag 

              *   Data Sink STag 

              *   Data Source Tagged Offset 

              *   Data Sink Tagged Offset 

          *   When the RDMA Read Operation Completes, an indication of 
              the Completion results. 

      *  At the Data Source: 

          *   If no error occurred while processing the RDMA Read 
              Request, the Data Source will not pass up any information 
              to the ULP.
    
          *   If an error occurred while processing the RDMA Read 
              Request, the Data Source RDMAP Layer will pass up the 
              corresponding error information to the Data Source ULP and 
              send a Terminate Message to the Data Sink RDMAP Layer. The 
              Data Sink RDMAP Layer will then pass up the Terminate 
              Message to the ULP. 

      For STags made available to the RDMAP Layer, the following are the 
      interactions between the RDMAP Layer and the ULP:

      *  If the ULP enables an STag, the ULP passes to the RDMAP Layer 
         the: 

          *   STag; 

          *   range of Tagged Offsets that are associated with a given 
              STag; 

          *   remote access rights (read, write, or read and write) 
              associated with a given, valid STag; and 

          *   association between a given STag and a given RDMAP Stream. 

      *  If the ULP disables an STag, the ULP passes to the RDMAP Layer 
         the STag. 
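
      One way to picture the information exchanged when an STag is 
      enabled or disabled is the non-normative sketch below. The table 
      class, its method names, the integer stream handle, and the 
      permits() check are all assumptions made for illustration; this 
      specification does not define such an interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StagEntry:
    to_start: int       # first valid Tagged Offset
    to_end: int         # one past the last valid Tagged Offset
    remote_read: bool   # remote read access right
    remote_write: bool  # remote write access right
    stream_id: int      # hypothetical handle for the RDMAP Stream


class StagTable:
    def __init__(self):
        self._entries = {}

    def enable(self, stag: int, entry: StagEntry) -> None:
        """Record the attributes the ULP supplies when enabling an STag."""
        self._entries[stag] = entry

    def disable(self, stag: int) -> None:
        """Disabling only requires the STag itself."""
        self._entries.pop(stag, None)

    def permits(self, stag, stream_id, to, length, write) -> bool:
        """Check a remote access against the recorded attributes."""
        entry = self._entries.get(stag)
        if entry is None or entry.stream_id != stream_id:
            return False
        if write and not entry.remote_write:
            return False
        if not write and not entry.remote_read:
            return False
        return entry.to_start <= to and to + length <= entry.to_end


table = StagTable()
table.enable(7, StagEntry(0, 4096, remote_read=True,
                          remote_write=False, stream_id=1))
```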

      If an error occurs at the RDMAP Layer, the RDMAP Layer may pass 
      back error information (e.g., the content of a Terminate Message) 
      to the ULP.
    
    4  Header Format 

      The control information of RDMA Messages is included in DDP 
      protocol defined header fields, with the following exceptions: 

      *  The first octet reserved for ULP usage on all DDP Messages in 
         the DDP Protocol (i.e., the RsvdULP Field) is used by RDMAP to 
         carry the RDMA Message Opcode and the RDMAP version. This octet 
         is known as the RDMAP Control Field in this specification. For 
         Send with Invalidate and Send with Solicited Event and 
         Invalidate, RDMAP uses the second through fifth octets provided 
         by DDP on Untagged DDP Messages to carry the STag that will be 
         Invalidated. 

      *  The RDMA Message length is passed by the RDMAP layer to the DDP 
         layer on all outbound transfers. 

      *  For RDMA Read Request Messages, the RDMA Read Message Size is 
         included in the RDMA Read Request Header. 

      *  The RDMA Message length is passed to the RDMAP Layer by the DDP 
         layer on inbound Untagged Buffer transfers. 

      *  Two RDMA Messages carry additional RDMAP headers. The RDMA Read 
         Request carries the Data Sink and Data Source buffer 
         descriptions, including buffer length. The Terminate carries 
         additional information associated with the error that caused 
         the Terminate. 

   4.1  RDMAP Control and Invalidate STag Field 

      The version of RDMAP defined by this specification uses all 8 bits 
      of the RDMAP Control Field. The first octet reserved for ULP use 
      in the DDP Protocol MUST be used by RDMAP to carry the RDMAP 
      Control Field. The ordering of the bits in the first octet MUST be 
      as defined in Figure 3. For Send with Invalidate and Send with 
      Solicited Event and Invalidate, the second through fifth octets of 
      the DDP RsvdULP field MUST be used by RDMAP to carry the 
      Invalidate STag. Figure 3 depicts the format of the DDP Control 
      and RDMAP Control fields. (Note: In Figure 3, the DDP Header is 
      offset by 16 bits to accommodate the MPA header defined in [MPA]. 
      The MPA header is only present if DDP is layered on top of MPA.) 

    
       0                   1                   2                   3 
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
                                      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                                      |T|L| Resrv | DV| RV|Rsv| Opcode|  
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      |                     Invalidate STag                           | 
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields  

      All RDMA Messages handed by the RDMAP Layer to the DDP layer MUST 
      define the value of the Tagged flag in the DDP Header. Figure 4 
      RDMA Usage of DDP Fields MUST be used to define the value of the 
      Tagged flag that is handed to the DDP Layer for each RDMA Message. 

      Figure 4 RDMA Usage of DDP Fields defines the value of the RDMA 
      Opcode field that MUST be used for each RDMA Message. 

      Figure 4 RDMA Usage of DDP Fields defines when the STag, Queue 
      Number, and Tagged Offset fields MUST be provided for each RDMA 
      Message. 

      For this version of the RDMAP, all RDMA Messages MUST have: 

      *  Bits 24-25; RDMA Version field: 01b for an RNIC that complies 
         with this RDMA protocol specification. 00b for an RNIC that 
         complies with the RDMA Consortium's RDMA protocol 
         specification.  Both version numbers are valid. 
         Interoperability is dependent on MPA protocol version 
         negotiation (e.g., MPA marker and MPA CRC). 

      *  Bits 26-27; Reserved. MUST be set to zero by sender, ignored by 
         the receiver. 

      *  Bits 28-31; OpCode field: see Figure 4 RDMA Usage of DDP 
         Fields. 

      *  Bits 32-63; Invalidate STag. However, this field is only valid 
         for Send with Invalidate and Send with Solicited Event and 
         Invalidate Messages (see Figure 4 RDMA Usage of DDP Fields).  
         For Send, Send with Solicited Event, RDMA Read Request, and 
         Terminate, the Invalidate STag field MUST be set to zero on 
         transmit and ignored by the receiver. 
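
      The bit positions listed above can be sketched, non-normatively, 
      as follows. The function names are hypothetical. Placing the 
      32-bit Invalidate STag in big-endian (network) byte order is an 
      assumption of this example.

```python
import struct

RDMAP_VERSION_IETF = 0b01   # RNIC complying with this specification
RDMAP_VERSION_RDMAC = 0b00  # RNIC complying with the RDMA Consortium spec


def pack_rdmap_control(opcode: int, version: int = RDMAP_VERSION_IETF) -> int:
    """Return the RDMAP Control Field octet (bits 24-31)."""
    if not 0 <= opcode <= 0b1111:
        raise ValueError("opcode must fit in 4 bits")
    if version not in (RDMAP_VERSION_IETF, RDMAP_VERSION_RDMAC):
        raise ValueError("unknown RDMAP version")
    # The version occupies the two high-order bits and the opcode the
    # four low-order bits; the two reserved bits in between stay zero.
    return (version << 6) | opcode


def unpack_rdmap_control(octet: int):
    """Return (version, opcode); the reserved bits are ignored."""
    return (octet >> 6) & 0b11, octet & 0b1111


def untagged_rsvd_ulp(opcode: int, invalidate_stag: int = 0) -> bytes:
    """Build the 5-octet RsvdULP value for an Untagged DDP Message."""
    # Assumption: the Invalidate STag is carried big-endian. It stays
    # zero for Message types that do not invalidate an STag.
    return struct.pack(">BI", pack_rdmap_control(opcode), invalidate_stag)
```

      For example, a Send with Invalidate (opcode 0100b) under version 
      01b packs to the control octet 0x44, followed by the four octets 
      of the STag.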

   -------+-----------+-------+------+-------+-----------+-------------- 
   RDMA   | Message   | Tagged| STag | Queue | Invalidate| Message 
   Message| Type      | Flag  | and  | Number| STag      | Length 
   OpCode |           |       | TO   |       |           | Communicated 
          |           |       |      |       |           | between DDP 
          |           |       |      |       |           | and RDMAP 
   -------+-----------+-------+------+-------+-----------+-------------- 
   0000b  | RDMA Write| 1     | Valid| N/A   | N/A       | Yes 
          |           |       |      |       |           |  
   -------+-----------+-------+------+-------+-----------+-------------- 
   0001b  | RDMA Read | 0     | N/A  | 1     | N/A       | Yes 
          | Request   |       |      |       |           |  
   -------+-----------+-------+------+-------+-----------+-------------- 
   0010b  | RDMA Read | 1     | Valid| N/A   | N/A       | Yes 
          | Response  |       |      |       |           |  
   -------+-----------+-------+------+-------+-----------+-------------- 
   0011b  | Send      | 0     | N/A  | 0     | N/A       | Yes 
          |           |       |      |       |           |  
   -------+-----------+-------+------+-------+-----------+-------------- 
   0100b  | Send with | 0     | N/A  | 0     | Valid     | Yes 
          | Invalidate|       |      |       |           |  
   -------+-----------+-------+------+-------+-----------+-------------- 
   0101b  | Send with | 0     | N/A  | 0     | N/A       | Yes 
          | SE        |       |      |       |           |  
   -------+-----------+-------+------+-------+-----------+-------------- 
   0110b  | Send with | 0     | N/A  | 0     | Valid     | Yes 
          | SE and    |       |      |       |           |  
          | Invalidate|       |      |       |           | 
   -------+-----------+-------+------+-------+-----------+-------------- 
   0111b  | Terminate | 0     | N/A  | 2     | N/A       | Yes 
          |           |       |      |       |           |  
   -------+-----------+-------+------+-------+-----------+-------------- 
   1000b  |           |  
   to     | Reserved  |               Not Specified 
   1111b  |           |  
   -------+-----------+------------------------------------------------- 
      Figure 4 RDMA Usage of DDP Fields  

      Note:  N/A means Not Applicable. 
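
      For implementers who prefer a table in code, the rows of Figure 4 
      can be transcribed as in the non-normative sketch below. The type 
      and function names are hypothetical; None stands for an N/A entry 
      and the reserved opcodes are rejected.

```python
from typing import NamedTuple, Optional


class DdpUsage(NamedTuple):
    name: str
    tagged: bool                  # Tagged Flag; STag and TO valid if True
    queue_number: Optional[int]   # None where Figure 4 says N/A
    invalidate_stag: bool         # True where Figure 4 says Valid


RDMA_OPCODES = {
    0b0000: DdpUsage("RDMA Write", True, None, False),
    0b0001: DdpUsage("RDMA Read Request", False, 1, False),
    0b0010: DdpUsage("RDMA Read Response", True, None, False),
    0b0011: DdpUsage("Send", False, 0, False),
    0b0100: DdpUsage("Send with Invalidate", False, 0, True),
    0b0101: DdpUsage("Send with SE", False, 0, False),
    0b0110: DdpUsage("Send with SE and Invalidate", False, 0, True),
    0b0111: DdpUsage("Terminate", False, 2, False),
}


def lookup(opcode: int) -> DdpUsage:
    """Return the Figure 4 row for an opcode, rejecting 1000b-1111b."""
    try:
        return RDMA_OPCODES[opcode]
    except KeyError:
        raise ValueError(f"reserved or invalid opcode: {opcode}") from None
```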
    
    
    
   4.2  RDMA Message Definitions 

      The following figure defines which RDMA Headers MUST be used on 
      each RDMA Message and which RDMA Messages are allowed to carry ULP 
      payload:
    
   -------+-----------+-------------------+------------------------- 
   RDMA   | Message   | RDMA Header Used  | ULP Message allowed in  
   Message| Type      |                   | the RDMA Message 
   OpCode |           |                   |  
          |           |                   | 
   -------+-----------+-------------------+------------------------- 
   0000b  | RDMA Write| None              | Yes 
          |           |                   |  
   -------+-----------+-------------------+------------------------- 
   0001b  | RDMA Read | RDMA Read Request | No 
          | Request   | Header            | 
   -------+-----------+-------------------+------------------------- 
   0010b  | RDMA Read | None              | Yes 
          | Response  |                   | 
   -------+-----------+-------------------+------------------------- 
   0011b  | Send      | None              | Yes 
          |           |                   |  
   -------+-----------+-------------------+------------------------- 
   0100b  | Send with | None              | Yes 
          | Invalidate|                   |  
   -------+-----------+-------------------+------------------------- 
   0101b  | Send with | None              | Yes 
          | SE        |                   |  
   -------+-----------+-------------------+------------------------- 
   0110b  | Send with | None              | Yes 
          | SE and    |                   |  
          | Invalidate|                   | 
   -------+-----------+-------------------+------------------------- 
   0111b  | Terminate | Terminate Header  | No 
          |           |                   |  
   -------+-----------+-------------------+------------------------- 
   1000b  |           |             
   to     | Reserved  |            Not Specified 
   1111b  |           |             
   -------+-----------+-------------------+------------------------- 
      Figure 5 RDMA Message Definitions 

   4.3  RDMA Write Header 

      The RDMA Write Message does not include an RDMAP header. The RDMAP 
      layer passes to the DDP layer an RDMAP Control Field. The RDMA 
      Write Message is fully described by the DDP Headers of the DDP 
      Segments associated with the Message.
    
      See section 11 Appendix for a description of the DDP Segment 
      format associated with RDMA Write Messages. 

   4.4  RDMA Read Request Header 

      The RDMA Read Request Message carries an RDMA Read Request Header 
      that describes the Data Sink and Data Source Buffers used by the 
      RDMA Read operation. The RDMA Read Request Header immediately 
      follows the DDP header. The RDMAP layer passes to the DDP layer an 
      RDMAP Control Field. The following figure depicts the RDMA Read 
      Request Header that MUST be used for all RDMA Read Request 
      Messages: 

        0                   1                   2                   3    
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                     Data Sink STag (SinkSTag)                 | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                                                               | 
       +                  Data Sink Tagged Offset (SinkTO)             + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                  RDMA Read Message Size (RDMARDSZ)            | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                     Data Source STag (SrcSTag)                | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                                                               | 
       +                 Data Source Tagged Offset (SrcTO)             + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 6 RDMA Read Request Header Format 

      Data Sink Steering Tag: 32 bits. 

           The Data Sink Steering Tag identifies the Data Sink's Tagged 
           Buffer. This field MUST be copied, without interpretation, 
           from the RDMA Read Request into the corresponding RDMA Read 
           Response and allows the Data Sink to place the returning 
           data. The STag is associated with the RDMAP Stream through a 
           mechanism that is outside the scope of the RDMAP 
           specification. 

      Data Sink Tagged Offset: 64 bits. 
    
    
    
           The Data Sink Tagged Offset specifies the starting offset, in 
           octets, from the base of the Data Sink's Tagged Buffer, where 
           the data is to be written by the Data Source. This field is 
           copied from the RDMA Read Request into the corresponding RDMA 
           Read Response and allows the Data Sink to place the returning 
           data. The Data Sink Tagged Offset MAY start at an arbitrary 
           offset.  

           The Data Sink STag and Data Sink Tagged Offset fields 
           describe the buffer to which the RDMA Read data is written. 

           Note: the DDP Layer protects against a wrap of the Data Sink 
           Tagged Offset.  

      RDMA Read Message Size: 32 bits. 

           The RDMA Read Message Size is the amount of data, in octets, 
           read from the Data Source. A single RDMA Read Request Message 
           can retrieve from 0 to 2^32-1 data octets from the Data 
           Source. 

      Data Source Steering Tag: 32 bits. 

           The Data Source Steering Tag identifies the Data Source's 
           Tagged Buffer. The STag is associated with the RDMAP Stream 
           through a mechanism that is outside the scope of the RDMAP 
           specification. 

      Data Source Tagged Offset: 64 bits. 

           The Tagged Offset specifies the starting offset, in octets, 
           that is to be read from the Data Source's Tagged Buffer. The 
           Data Source Tagged Offset MAY start at an arbitrary offset.  

           The Data Source STag and Data Source Tagged Offset fields 
           describe the buffer from which the RDMA Read data is read. 

      See Section 7.2 Errors Detected at the Remote Peer on Incoming 
      RDMA Messages for a description of error checking required upon 
      processing of an RDMA Read Request at the Data Source. 
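
      As an informal illustration of the layout in Figure 6, the
      following Python sketch packs and parses the five header fields.
      It assumes network byte order (big-endian) with no padding
      between fields. The names ReadRequestHeader, pack, and unpack are
      hypothetical and are not defined by this specification.

```python
import struct
from typing import NamedTuple

# Field widths from Figure 6: SinkSTag (32), SinkTO (64), RDMARDSZ (32),
# SrcSTag (32), SrcTO (64). Big-endian without padding is an assumption.
_READ_REQ = struct.Struct(">IQIIQ")


class ReadRequestHeader(NamedTuple):
    """Hypothetical in-memory form of the RDMA Read Request Header."""
    sink_stag: int
    sink_to: int
    rdmardsz: int
    src_stag: int
    src_to: int

    def pack(self) -> bytes:
        """Serialize the five fields into 28 octets (224 bits)."""
        return _READ_REQ.pack(*self)

    @classmethod
    def unpack(cls, data: bytes) -> "ReadRequestHeader":
        """Parse the first 28 octets of data as a Read Request Header."""
        return cls(*_READ_REQ.unpack_from(data))
```

      The 28-octet size matches the 224 bits given for the Terminated
      RDMA Header in section 4.8.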

    
    
    
   4.5  RDMA Read Response Header 

      The RDMA Read Response Message does not include an RDMAP header. 
      The RDMAP layer passes to the DDP layer an RDMAP Control Field. 
      The RDMA Read Response Message is fully described by the DDP 
      Headers of the DDP Segments associated with the Message. 

      See Section 11 Appendix for a description of the DDP Segment 
      format associated with RDMA Read Response Messages. 

   4.6  Send Header and Send with Solicited Event Header 

      The Send and Send with Solicited Event Messages do not include an 
      RDMAP header. The RDMAP layer passes to the DDP layer an RDMAP 
      Control Field. The Send and Send with Solicited Event Messages are 
      fully described by the DDP Headers of the DDP Segments associated 
      with the Message. 

      See Section 11 Appendix for a description of the DDP Segment 
      format associated with Send and Send with Solicited Event 
      Messages. 

   4.7  Send with Invalidate Header and Send with SE and Invalidate 
        Header 

      The Send with Invalidate and Send with Solicited Event and 
      Invalidate Messages do not include an RDMAP header. The RDMAP 
      layer passes to the DDP layer an RDMAP Control Field and the 
      Invalidate STag field (see section 4.1 RDMAP Control and 
      Invalidate STag Field). The Send with Invalidate and Send with 
      Solicited Event and Invalidate Messages are fully described by the 
      DDP Headers of the DDP Segments associated with the Message. 

      See Section 11 Appendix for a description of the DDP Segment 
      format associated with Send with Invalidate and Send with 
      Solicited Event and Invalidate Messages. 

   4.8  Terminate Header 

      The Terminate Message carries a Terminate Header that contains 
      additional information associated with the cause of the Terminate. 
      The Terminate Header immediately follows the DDP header. The RDMAP 
      layer passes to the DDP layer an RDMAP Control Field. The 
      following figure depicts a Terminate Header that MUST be used for 
      the Terminate Message: 

        0                   1                   2                   3    
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |       Terminate Control             |      Reserved           | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |  DDP Segment Length  (if any) |                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               + 
       |                                                               | 
       //                                                             // 
       |                  Terminated DDP Header (if any)               | 
       +                                                               + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                                                               | 
       //                                                             // 
       |                 Terminated RDMA Header (if any)               | 
       +                                                               + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 7 Terminate Header Format 

       

      Terminate Control: 19 bits. 

          The Terminate Control field MUST have the format defined in 
          Figure 8 Terminate Control Field. 

    
        0                   1                   2                   3 
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       | Layer | EType |   Error Code  |HdrCt|  
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 8 Terminate Control Field  

          *   Figure 9 Terminate Control Field Values defines the valid 
              values that MUST be used for this field. 

              *   Layer: 4 bits. 
    
    
    
                  Identifies the layer that encountered the error. 

              *   EType (RDMA Error Type): 4 bits. 

                  Identifies the type of error that caused the 
                  Terminate. When the error is detected at the RDMAP 
                  Layer, the RDMAP Layer inserts the Error Type into 
                  this field. When the error is detected at a LLP layer, 
                  a LLP layer creates the Error Type and the DDP layer 
                  passes it up to the RDMAP Layer, and the RDMAP Layer 
                  inserts it into this field. 

              *   Error Code: 8 bits. 

                  This field identifies the specific error that caused 
                  the Terminate. When the error is detected at the RDMAP 
                  Layer, the RDMAP Layer creates the Error Code. When 
                  the error is detected at a LLP layer, a LLP layer 
                  creates the Error Code and the DDP layer passes it up 
                  to the RDMAP Layer, and the RDMAP Layer inserts it 
                  into this field. 

              *   HdrCt: 3 bits. 

                  Header control bits: 

                  *   M: bit 16. DDP Segment Length valid. See Figure 10 
                      for when this bit SHOULD be set. 

                  *   D: bit 17. DDP Header Included. See Figure 10 for 
                      when this bit SHOULD be set. 

                  *   R: bit 18. RDMAP Header Included. See Figure 10 
                      for when this bit SHOULD be set. 
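
      The bit positions above can be sketched in Python as follows.
      This is an illustration only: it assumes that bit 0 in the
      figures is the most significant bit of a 32-bit word sent in
      network byte order, and the function names are hypothetical.

```python
import struct


def pack_terminate_control(layer: int, etype: int, error_code: int,
                           m: bool = False, d: bool = False,
                           r: bool = False) -> bytes:
    """Build the first 32 bits of a Terminate Header."""
    if not (0 <= layer <= 0xF and 0 <= etype <= 0xF
            and 0 <= error_code <= 0xFF):
        raise ValueError("field value out of range")
    # Bit 0 in the figures is taken as the most significant bit, so
    # bit n of the 32-bit word sits at shift (31 - n).
    word = (layer << 28) | (etype << 24) | (error_code << 16)
    word |= (int(m) << 15) | (int(d) << 14) | (int(r) << 13)
    # The low 13 bits are the Reserved field and stay zero on transmit.
    return struct.pack(">I", word)


def unpack_terminate_control(data: bytes) -> dict:
    """Split the first 32 bits of a Terminate Header into its fields."""
    (word,) = struct.unpack_from(">I", data)
    return {
        "layer": word >> 28,
        "etype": (word >> 24) & 0xF,
        "error_code": (word >> 16) & 0xFF,
        "m": bool(word & (1 << 15)),
        "d": bool(word & (1 << 14)),
        "r": bool(word & (1 << 13)),
    }
```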

    
    
    
   -------+----------+-------+-------------+------+-------------------- 
   Layer  | Layer    | Error | Error Type  | Error| Error Code Name 
          | Name     | Type  | Name        | Code | 
   -------+----------+-------+-------------+------+-------------------- 
          |          | 0000b | Local       | None | None - This error 
          |          |       | Catastrophic|      | type does not have  
          |          |       | Error       |      | an error code. Any 
          |          |       |             |      | value in this field 
          |          |       |             |      | is acceptable. 
          |          +-------+-------------+------+-------------------- 
          |          |       |             | 00X  | Invalid STag 
          |          |       |             +------+-------------------- 
          |          |       |             | 01X  | Base or bounds 
          |          |       |             |      | violation 
          |          |       | Remote      +------+-------------------- 
          |          | 0001b | Protection  | 02X  | Access rights 
          |          |       | Error       |      | violation 
          |          |       |             +------+-------------------- 
   0000b  | RDMA     |       |             | 03X  | STag not associated 
          |          |       |             |      | with RDMAP Stream 
          |          |       |             +------+-------------------- 
          |          |       |             | 04X  | TO wrap 
          |          |       |             +------+-------------------- 
          |          |       |             | 09X  | STag cannot be 
          |          |       |             |      | Invalidated 
          |          |       |             +------+-------------------- 
          |          |       |             | FFX  | Unspecified Error 
          |          +-------+-------------+------+-------------------- 
          |          |       |             | 05X  | Invalid RDMAP 
          |          |       |             |      | version 
          |          |       |             +------+-------------------- 
          |          |       |             | 06X  | Unexpected OpCode 
          |          |       | Remote      +------+-------------------- 
          |          | 0010b | Operation   | 07X  | Catastrophic error, 
          |          |       | Error       |      | localized to RDMAP 
          |          |       |             |      | Stream  
          |          |       |             +------+-------------------- 
          |          |       |             | 08X  | Catastrophic error, 
          |          |       |             |      | global  
          |          |       |             +------+-------------------- 
          |          |       |             | 09X  | STag cannot be 
          |          |       |             |      | Invalidated 
          |          |       |             +------+-------------------- 
          |          |       |             | FFX  | Unspecified Error 
   -------+----------+-------+-------------+------+-------------------- 
   0001b  | DDP      | See DDP Specification [DDP] for a description of 
          |          | the values and names. 
   -------+----------+-------+----------------------------------------- 
   0010b  | LLP      | For MPA, see MPA Specification [MPA] for a  
          | (eg MPA) | description of the values and names. 
   -------+----------+-------+----------------------------------------- 
      Figure 9 Terminate Control Field Values 
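
      For readers who want to decode a received Terminate Control
      field, the RDMA layer rows of Figure 9 can be transcribed into a
      lookup table. The Python sketch below is illustrative; the names
      RDMAP_ERROR_NAMES and describe_rdmap_error are hypothetical, and
      the DDP and LLP layers are omitted because their values are
      defined in other specifications.

```python
# (Error Type, Error Code) -> Error Code Name, for Layer 0000b (RDMA).
RDMAP_ERROR_NAMES = {
    (0x1, 0x00): "Invalid STag",
    (0x1, 0x01): "Base or bounds violation",
    (0x1, 0x02): "Access rights violation",
    (0x1, 0x03): "STag not associated with RDMAP Stream",
    (0x1, 0x04): "TO wrap",
    (0x1, 0x09): "STag cannot be Invalidated",
    (0x1, 0xFF): "Unspecified Error",
    (0x2, 0x05): "Invalid RDMAP version",
    (0x2, 0x06): "Unexpected OpCode",
    (0x2, 0x07): "Catastrophic error, localized to RDMAP Stream",
    (0x2, 0x08): "Catastrophic error, global",
    (0x2, 0x09): "STag cannot be Invalidated",
    (0x2, 0xFF): "Unspecified Error",
}


def describe_rdmap_error(etype: int, error_code: int) -> str:
    """Return a readable name for an RDMA layer error."""
    if etype == 0x0:
        # Local Catastrophic Error carries no error code; any value in
        # the Error Code field is acceptable.
        return "Local Catastrophic Error"
    return RDMAP_ERROR_NAMES.get((etype, error_code), "Unknown")
```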

      Reserved: 13 bits. This field MUST be set to zero on transmit, 
      ignored on receive. 

      DDP Segment Length: 16 bits. 

           The length handed up by the DDP Layer when the error was 
           detected. It MUST be valid if the M bit is set. It MUST be 
           present when the D bit is set.  

      Terminated DDP Header: 112 bits for Tagged Messages and 144 bits 
      for Untagged Messages. 

           The DDP Header of the incoming Message that is associated 
           with the Terminate. The DDP Header is not present if the 
           Terminate Error Type is a Local Catastrophic Error. It MUST 
           be present if the D bit is set. 

      Terminated RDMA Header: 224 bits. 

           The Terminated RDMA Header is only sent back if the terminate 
           is associated with an RDMA Read Request Message. It MUST be 
           present if the R bit is set. 

           If the terminate occurs before the first RDMA Read Request 
           byte is processed, the original RDMA Read Request Header is 
           sent back. 

           If the terminate occurs after the first RDMA Read Request 
           byte is processed, the RDMA Read Request Header is updated to 
           reflect the current location of the RDMA Read operation that 
           is in process: 

    
    
    
              *   Data Sink STag = Data Sink STag originally sent in the 
                  RDMA Read Request. 

              *   Data Sink Tagged Offset = Current offset into the Data 
                  Sink Tagged Buffer. For example if the RDMA Read 
                  Request was terminated after 2048 octets were sent, 
                  then the Data Sink Tagged Offset = the original Data 
                  Sink Tagged Offset + 2048.  

              *   RDMA Read Message Size = Number of octets left to 
                  transfer. 

              *   Data Source STag = Data Source STag in the RDMA Read 
                  Request. 

              *   Data Source Tagged Offset = Current offset into the 
                  Data Source Tagged Buffer. For example if the RDMA 
                  Read Request was terminated after 2048 octets were 
                  sent, then the Data Source Tagged Offset = the 
                  original Data Source Tagged Offset + 2048. 
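
      The update rules above amount to simple arithmetic on the
      original header fields. A minimal Python sketch follows; the
      function name and the tuple representation are hypothetical.

```python
def updated_read_request(sink_stag: int, sink_to: int, rdmardsz: int,
                         src_stag: int, src_to: int,
                         octets_sent: int) -> tuple:
    """Return the Read Request fields as they would appear in a
    Terminated RDMA Header after octets_sent octets were transferred."""
    if not 0 <= octets_sent <= rdmardsz:
        raise ValueError("octets_sent exceeds the RDMA Read Message Size")
    return (
        sink_stag,                 # unchanged
        sink_to + octets_sent,     # current offset into the sink buffer
        rdmardsz - octets_sent,    # octets left to transfer
        src_stag,                  # unchanged
        src_to + octets_sent,      # current offset into the source buffer
    )
```

      With octets_sent set to zero the function returns the original
      header, which corresponds to a terminate that occurs before the
      first octet is processed.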

      Note: if a given LLP does not define any termination codes for the 
      RDMAP Termination message to use, then none would be used for that 
      LLP. 

      Figure 10 Error Type to RDMA Message Mapping maps layer name and 
      error types to each RDMA Message type: 

    
    
    
   ---------+-------------+------------+------------+----------------- 
   Layer    | Error Type  | Terminate  | Terminate  | What type of 
   Name     | Name        | Includes   | Includes   | RDMA Message can 
            |             | DDP Header | RDMA Header| cause the error 
            |             | and DDP    |            | 
            |             | Segment    |            | 
            |             | Length     |            | 
   ---------+-------------+------------+------------+----------------- 
            | Local       | No         | No         | Any 
            | Catastrophic|            |            |  
            | Error       |            |            | 
            +-------------+------------+------------+----------------- 
            | Remote      | Yes, if    | Yes        | Only RDMA Read 
   RDMA     | Protection  | possible   |            | Request, Send 
            | Error       |            |            | with Invalidate, 
            |             |            |            | and Send with SE 
            |             |            |            | and Invalidate 
            +-------------+------------+------------+----------------- 
            | Remote      | Yes, if    | No         | Any 
            | Operation   | possible   |            | 
            | Error       |            |            | 
   ---------+-------------+------------+------------+----------------- 
   DDP      | See DDP Spec| Yes        | No         | Any 
            | [DDP]       |            |            | 
   ---------+-------------+------------+------------+----------------- 
   LLP      | See LLP Spec| No         | No         | Any 
            | [e.g., MPA] |            |            |  
   ---------+-------------+------------+------------+----------------- 
      Figure 10 Error Type to RDMA Message Mapping 
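
      One possible reading of Figure 10, expressed as a Python sketch
      that derives the M, D, and R header control bits. Two points are
      assumptions of this sketch rather than statements of the
      specification: the M and D bits are set together, because the
      figure gives them a single column, and the R bit is set only for
      an RDMA Read Request, following the Terminated RDMA Header
      description. All names are hypothetical.

```python
# (layer, error type) -> (DDP Header and Segment Length, RDMA Header),
# transcribed from Figure 10. DDP and LLP rows use None as error type.
TERMINATE_CONTENTS = {
    ("RDMA", "Local Catastrophic Error"): ("no", "no"),
    ("RDMA", "Remote Protection Error"): ("if possible", "yes"),
    ("RDMA", "Remote Operation Error"): ("if possible", "no"),
    ("DDP", None): ("yes", "no"),
    ("LLP", None): ("no", "no"),
}


def hdrct_bits(layer: str, error_type: str = None,
               ddp_header_available: bool = True,
               is_read_request: bool = False) -> tuple:
    """Return (M, D, R) for a Terminate Message under the assumptions
    stated in the surrounding text."""
    key = (layer, error_type if layer == "RDMA" else None)
    ddp, rdma = TERMINATE_CONTENTS[key]
    include_ddp = ddp == "yes" or (
        ddp == "if possible" and ddp_header_available)
    include_rdma = rdma == "yes" and is_read_request
    return (include_ddp, include_ddp, include_rdma)
```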

    
    
    
   5  Data Transfer 

   5.1  RDMA Write Message 

      An RDMA Write is used by the Data Source to transfer data to a 
      previously Advertised Tagged Buffer at the Data Sink. The RDMA 
      Write Message has the following semantics: 

      *  An RDMA Write Message MUST reference a Tagged Buffer. That is, 
         the Data Source RDMAP Layer MUST request that the DDP layer 
         mark the Message as Tagged.  

      *  A valid RDMA Write Message MUST NOT be delivered to the Data 
         Sink's ULP (i.e., it is placed by the DDP layer).  

      *  At the Remote Peer, when an invalid RDMA Write Message is 
         delivered to the Remote Peer's RDMAP Layer, an error is 
         surfaced (see section 7.1 RDMAP Error Surfacing). 

      *  The Tagged Offset of a Tagged Buffer MAY start at a non-zero 
         value. 

      *  An RDMA Write Message MAY target all or part of a previously 
         Advertised buffer.  

      *  The RDMAP does not define how the buffer(s) used by an outbound 
         RDMA Write are defined or addressed. For example, an 
         implementation of RDMA may choose to allow a gather-list of 
         non-contiguous data blocks to be the source of an RDMA Write. 
         In this case, the data blocks would be combined by the Data 
         Source and sent as a single RDMA Write Message to the Data 
         Sink. 

      *  The Data Source RDMAP Layer MUST issue RDMA Write Messages to 
         the DDP layer in the order they were submitted by the ULP. 

      *  At the Data Source, a subsequent Send (Send with Invalidate, 
         Send with Solicited Event, or Send with Solicited Event and 
         Invalidate) Message MAY be used to signal Delivery of previous 
         RDMA Write Messages to the Data Sink, if desired by the ULP. 

      *  If the Local Peer wishes to write to multiple Tagged Buffers on 
         the Remote Peer, the Local Peer MUST use multiple RDMA Write 
         Messages. That is, a single RDMA Write Message can only write 
         to one remote Tagged Buffer.  

      *  The Data Source MAY issue a zero length RDMA Write Message. 
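
      The gather-list example and the one-buffer-per-Message rule can
      be pictured with the following Python sketch. The dictionary
      layout and the name build_rdma_write are hypothetical; an
      implementation is free to represent outbound buffers differently.

```python
from typing import Iterable


def build_rdma_write(sink_stag: int, sink_to: int,
                     gather_list: Iterable[bytes]) -> dict:
    """Combine non-contiguous source blocks into one RDMA Write Message
    that targets a single Tagged Buffer at the Data Sink."""
    payload = b"".join(gather_list)
    return {
        "tagged": True,      # RDMA Write always uses the Tagged model
        "stag": sink_stag,   # exactly one remote Tagged Buffer
        "to": sink_to,       # may be a non-zero starting offset
        "payload": payload,  # may be empty (zero length RDMA Write)
    }
```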

    

   5.2  RDMA Read Operation 

      The RDMA Read operation MUST consist of a single RDMA Read Request 
      Message and a single RDMA Read Response Message. 

   5.2.1  RDMA Read Request Message 

      An RDMA Read Request is used by the Data Sink to transfer data 
      from a previously Advertised Tagged Buffer at the Data Source to a 
      Tagged Buffer at the Data Sink. The RDMA Read Request Message has 
      the following semantics: 

      *  An RDMA Read Request Message MUST reference an Untagged Buffer. 
         That is, the Local Peer's RDMAP Layer MUST request that the DDP 
         layer mark the Message as Untagged.  

      *  One RDMA Read Request Message MUST consume one Untagged Buffer. 

      *  The Remote Peer's RDMAP Layer MUST process an RDMA Read Request 
         Message. A valid RDMA Read Request Message MUST NOT be 
         delivered to the Data Source's ULP (i.e., it is processed by 
         the RDMAP layer). 

      *  At the Remote Peer, when an invalid RDMA Read Request Message 
         is delivered to the Remote Peer's RDMAP Layer, an error is 
         surfaced (see section 7.1 RDMAP Error Surfacing). 

      *  An RDMA Read Request Message MUST reference the RDMA Read 
         Request Queue. That is, the Local Peer's RDMAP Layer MUST 
         request that the DDP layer set the Queue Number field to one. 

      *  The Local Peer MUST pass to the DDP Layer RDMA Read Request 
         Messages in the order they were submitted by the ULP. 

      *  The Remote Peer MUST process the RDMA Read Request Messages in 
         the order they were sent. 
    
    
    
      *  If the Local Peer wishes to read from multiple Tagged Buffers 
         on the Remote Peer, the Local Peer MUST use multiple RDMA Read 
         Request Messages. That is, a single RDMA Read Request Message 
         MUST only read from one remote Tagged Buffer. 

      *  An RDMA Read Request Message MAY target all or part of a 
         previously Advertised buffer.  

      *  If the Data Source receives a valid RDMA Read Request Message 
         it MUST respond with a valid RDMA Read Response Message. 

      *  The Data Sink MAY issue a zero length RDMA Read Request 
         Message, by setting the RDMA Read Message Size field to zero in 
         the RDMA Read Request Header. 

      *  If the Data Source receives a non-zero length RDMA Read Message 
         Size, the Data Source RDMAP MUST validate the Data Source STag 
         and Data Source Tagged Offset contained in the RDMA Read 
         Request Header. 

      *  If the Data Source receives an RDMA Read Request Header with 
         the RDMA Read Message Size set to zero, the Data Source RDMAP: 

          *   MUST NOT validate the Data Source STag and Data Source 
              Tagged Offset contained in the RDMA Read Request Header, 
              and 

          *   MUST respond with a zero length RDMA Read Response 
              Message. 
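
      The zero length and non-zero length rules above can be sketched
      in Python as follows. This is a simplified illustration: the
      mapping from STag to a bytes object, the treatment of the Tagged
      Offset as an index starting at zero, and the exception types are
      all assumptions of the sketch.

```python
def handle_read_request(buffers: dict, src_stag: int, src_to: int,
                        size: int) -> bytes:
    """Return the RDMA Read Response payload for one Read Request.

    buffers is a hypothetical mapping from STag to the bytes of the
    Tagged Buffer associated with this RDMAP Stream.
    """
    if size == 0:
        # Zero length: the STag and Tagged Offset are not validated,
        # and a zero length Read Response is returned.
        return b""
    buf = buffers.get(src_stag)
    if buf is None:
        raise LookupError("Invalid STag")
    if src_to < 0 or src_to + size > len(buf):
        raise ValueError("Base or bounds violation")
    return buf[src_to:src_to + size]
```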

   5.2.2  RDMA Read Response Message 

      The RDMA Read Response Message uses the DDP Tagged Buffer Model to 
      Deliver the contents of a previously requested Data Source Tagged 
      Buffer to the Data Sink, without any involvement from the ULP at 
      the Remote Peer. The RDMA Read Response Message has the following 
      semantics: 

      *  The RDMA Read Response Message for the associated RDMA Read 
         Request Message travels in the opposite direction. 

    
    
    
      *  An RDMA Read Response Message MUST reference a Tagged Buffer. 
         That is, the Data Source RDMAP Layer MUST request that the DDP 
         layer mark the Message as Tagged.  

      *  The Data Source MUST ensure that a sufficient number of 
         Untagged Buffers are available on the RDMA Read Request Queue 
         (Queue with DDP Queue Number 1) to support the maximum number 
         of RDMA Read Requests negotiated by the ULP. 

      *  The RDMAP Layer MUST Deliver the RDMA Read Response Message to 
         the ULP. 

      *  At the Remote Peer, when an invalid RDMA Read Response Message 
         is delivered to the Remote Peer's RDMAP Layer, an error is 
         surfaced (see section 7.1 RDMAP Error Surfacing). 

      *  The Tagged Offset of a Tagged Buffer MAY start at a non-zero 
         value. 

      *  The Data Source RDMAP Layer MUST pass RDMA Read Response 
         Messages to the DDP layer in the order that the RDMA Read 
         Request Messages were received by the RDMAP Layer at the Data 
         Source. 

      *  The Data Sink MAY validate that the STag, Tagged Offset, and 
         length of the RDMA Read Response Message are the same as the 
         STag, Tagged Offset, and length included in the corresponding 
         RDMA Read Request Message. 

      *  A single RDMA Read Response Message MUST write to one remote 
         Tagged Buffer. If the Data Sink wishes to Read multiple Tagged 
         Buffers, the Data Sink can use multiple RDMA Read Request 
         Messages. 
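
      The optional validation at the Data Sink can be sketched in
      Python as a first-in, first-out queue of outstanding requests,
      relying on the rule that Read Responses are passed to the DDP
      layer in the order the Read Requests were received. The class
      and method names are hypothetical.

```python
from collections import deque


class ReadTracker:
    """Tracks outstanding RDMA Read Requests at the Data Sink."""

    def __init__(self) -> None:
        self._outstanding = deque()

    def issued(self, sink_stag: int, sink_to: int, size: int) -> None:
        """Record a Read Request in the order it was sent."""
        self._outstanding.append((sink_stag, sink_to, size))

    def response_matches(self, stag: int, to: int, length: int) -> bool:
        """Compare a completed Read Response with the oldest request.

        Raises IndexError if no request is outstanding.
        """
        expected = self._outstanding.popleft()
        return expected == (stag, to, length)
```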

   5.3  Send Message Type 

      The Send Message Type uses the DDP Untagged Buffer Model to 
      transfer data from the Data Source into an Untagged Buffer at the 
      Data Sink.  

      *  A Send Message Type MUST reference an Untagged Buffer. That is, 
         the Local Peer's RDMAP Layer MUST request that the DDP layer 
         mark the Message as Untagged.  
    
    
    
      *  One Send Message Type MUST consume one Untagged Buffer. 

          *   The ULP Message sent using a Send Message Type MAY be less 
              than or equal to the size of the consumed Untagged Buffer. 
              The RDMAP Layer communicates to the ULP the size of the 
              data written into the Untagged Buffer.  

          *   If the ULP Message sent via Send Message Type is larger 
              than the Data Sink's Untagged Buffer, it is an error (see 
              section 7.1 RDMAP Error Surfacing). 

      *  At the Remote Peer, Send Message Type Messages MUST be 
         Delivered to the Remote Peer's ULP in the order they were sent. 

      *  After the Send with Solicited Event or Send with Solicited 
         Event and Invalidate Message is Delivered to the ULP, the RDMAP 
         MAY generate an Event, if the Data Sink is configured to 
         generate such an Event. 

      *  At the Remote Peer, when an invalid Send Message Type is 
         Delivered to the Remote Peer's RDMAP Layer, an error is 
         surfaced (see section 7.1 RDMAP Error Surfacing). 

      *  The RDMAP does not define how the buffer(s) used by an outbound 
         Send Message Type are defined or addressed. For 
         example, an implementation of RDMA may choose to allow a 
         gather-list of non-contiguous data blocks to be the source of a 
         Send Message Type. In this case, the data blocks would be 
         combined by the Data Source and sent as a single Send Message 
         Type to the Data Sink. 

      *  For a Send Message Type, the Local Peer's RDMAP Layer MUST 
         request that the DDP layer set the Queue Number field to zero. 

      *  The Local Peer MUST issue Send Message Type Messages in the 
         order they were submitted by the ULP. 

      *  The Data Source MAY pass a zero length Send Message Type. A 
         zero length Send Message Type MUST consume an Untagged Buffer 
         at the Data Sink. 

      *  A Send with Invalidate or Send with Solicited Event and 
         Invalidate Message MUST reference an STag. That is, the Local 
         Peer's RDMAP Layer MUST pass the RDMA control field and the 
         STag that will be Invalidated to the DDP layer. 
    
    
    
      *  When a Send with Invalidate or Send with Solicited Event and 
         Invalidate Message is Delivered to the Remote Peer's RDMAP 
         Layer, the RDMAP Layer MUST: 

          *   Verify that the STag is associated with the RDMAP Stream; 
              and 

          *   Invalidate the STag if it is associated with the RDMAP 
              Stream; or issue a Terminate Message with the STag Cannot 
              be Invalidated Terminate Error Code, if the STag is not 
              associated with the RDMAP Stream. 
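
      The invalidate handling described above can be sketched in
      Python as follows. The set of STags and the function name are
      hypothetical; the error code value 09X is taken from Figure 9.

```python
STAG_CANNOT_BE_INVALIDATED = 0x09  # Error Code from Figure 9


def process_invalidate(stream_stags: set, stag: int):
    """Handle the Invalidate STag of a Delivered Send with Invalidate.

    stream_stags is a hypothetical set of STags associated with this
    RDMAP Stream. Returns None when the STag was invalidated, or the
    Terminate Error Code that the RDMAP Layer would report.
    """
    if stag in stream_stags:
        stream_stags.discard(stag)  # the STag is now invalid
        return None
    return STAG_CANNOT_BE_INVALIDATED
```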

   5.4  Terminate Message 

      The Terminate Message uses the DDP Untagged Buffer Model to 
      transfer error-related information from the Data Source into an 
      Untagged Buffer at the Data Sink and then ceases all further 
      communications on the underlying DDP Stream. The Terminate Message 
      has the following semantics: 

      *  A Terminate Message MUST reference an Untagged Buffer. That is, 
         the Local Peer's RDMAP Layer MUST request that the DDP layer 
         mark the Message as Untagged.  

      *  A Terminate Message references the Terminate Queue. That is, 
         the Local Peer's RDMAP Layer MUST request that the DDP layer 
         set the Queue Number field to two. 

      *  One Terminate Message MUST consume one Untagged Buffer.  

      *  On a single RDMAP Stream, the RDMAP layer MUST guarantee 
         placement of a single Terminate Message. 

      *  A Terminate Message MUST be Delivered to the Remote Peer's 
         RDMAP Layer. The RDMAP Layer MUST Deliver the Terminate Message 
         to the ULP.  

      *  At the Remote Peer, when an invalid Terminate Message is 
         delivered to the Remote Peer's RDMAP Layer, an error is 
         surfaced (see section 7.1 RDMAP Error Surfacing). 

      *  The RDMAP Layer Completes in error all ULP Operations that have 
         not been provided to the DDP layer. 
    
    
    
      *  After sending a Terminate Message on an RDMAP Stream, the Local 
         Peer MUST NOT send any more Messages on that specific RDMAP 
         Stream. 

      *  After receiving a Terminate Message on an RDMAP Stream, the 
         Remote Peer MAY stop sending Messages on that specific RDMAP 
         Stream. 
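
      As a summary of sections 5.1 through 5.4, the Python sketch below
      records the DDP buffer model and Queue Number requested for each
      Message type and enforces the rule that nothing is sent after a
      Terminate Message. The names DDP_MODEL and StreamSender are
      hypothetical, and "Send" stands for all four Send Message Types.

```python
# Message type -> (Tagged?, DDP Queue Number; None for Tagged Messages)
DDP_MODEL = {
    "RDMA Write": (True, None),
    "RDMA Read Request": (False, 1),
    "RDMA Read Response": (True, None),
    "Send": (False, 0),
    "Terminate": (False, 2),
}


class StreamSender:
    """Minimal model of the sending side of one RDMAP Stream."""

    def __init__(self) -> None:
        self.terminated = False
        self.sent = []

    def send(self, message_type: str) -> None:
        """Record a Message handed to the DDP layer."""
        if self.terminated:
            # No Messages may follow a Terminate on this Stream.
            raise RuntimeError("Terminate already sent on this Stream")
        tagged, queue_number = DDP_MODEL[message_type]
        self.sent.append((message_type, tagged, queue_number))
        if message_type == "Terminate":
            self.terminated = True
```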

   5.5  Ordering and Completions 

      It is important to understand the difference between Placement and 
      Delivery ordering since RDMAP provides quite different semantics 
      for the two. 

      Note that many current protocols, both as used in the Internet and 
      elsewhere, assume that data is both Placed and Delivered in order.  
      This assumption allows applications to take a variety of 
      shortcuts.  For RDMAP, many of these shortcuts are no longer safe 
      to use and could cause application failure. 

      The following rules apply to implementations of the RDMAP 
      protocol. Note, in these rules Send includes Send, Send with 
      Invalidate, Send with Solicited Event, and Send with Solicited 
      Event and Invalidate: 

      1.  RDMAP does not provide ordering among Messages on different 
          RDMAP Streams. 

      2.  RDMAP does not provide ordering between operations that are 
          generated from the two ends of an RDMAP Stream. 

      3.  RDMA Messages that use Tagged and Untagged Buffers MAY be 
          Placed in any order.  If an application uses overlapping 
          buffers (points different Messages or portions of a single 
          Message at the same buffer), then it is possible that the last 
          incoming write to the Data Sink buffer will not be the last 
          outgoing data sent from the Data Source. 

      4.  For a Send operation, the contents of an Untagged Buffer at 
          the Data Sink MAY be indeterminate until the Send is Delivered 
          to the ULP at the Data Sink. 

    
    
    
      5.  For an RDMA Write operation, the contents of the Tagged Buffer 
          at the Data Sink MAY be indeterminate until a subsequent Send 
          is Delivered to the ULP at the Data Sink. 

      6.  For an RDMA Read operation, the contents of the Tagged Buffer 
          at the Data Sink MAY be indeterminate until the RDMA Read 
          Response Message has been Delivered at the Local Peer. 

           Statements 4, 5, and 6 imply "no peeking" at the data to see 
           if it is done.  It is possible for some data to arrive before 
           logically earlier data does, and peeking may cause 
           unpredictable application failure. 

      7.  If the ULP or Application modifies the contents of Tagged or 
          Untagged Buffers being modified by an RDMA Operation while the 
          RDMAP is processing the RDMA Operation, the state of the 
          Buffers is indeterminate. 

      8.  If the ULP or Application modifies the contents of Tagged or 
          Untagged Buffers read by an RDMA Operation while the RDMAP is 
          processing the RDMA Operation, the results of the read are 
          indeterminate. 

      9.  The Completion of an RDMA Write or Send Operation at the Local 
          Peer does not guarantee that the ULP Message has yet reached 
          the Remote Peer ULP Buffer or been examined by the Remote ULP. 

      10. Send Messages MUST be Delivered to the ULP at the Remote Peer 
          after they are Delivered to RDMAP by DDP and in the order that 
          they were Delivered to RDMAP.  

          Note that DDP ordering rules ensure that this will be the same 
          order that they were submitted at the Local Peer and that any 
          prior RDMA Writes have been submitted for ordered Placement at 
          the Remote Peer. This means that when the ULP sees the 
          Delivery of the Send, the memory buffers targeted by any 
          preceding RDMA Writes and Sends are available to be accessed 
          locally or remotely as authorized. If the ULP overlaps its 
          buffers for different operations, the data from the RDMA Write 
          or Send may be overwritten by subsequent RDMA Operations 
          before the ULP receives and processes the Delivery. 

    
    
    
      11. RDMA Read Response Messages MUST be Delivered to the ULP at 
          the Remote Peer after they are Delivered to RDMAP by DDP and 
          in the order that they were Delivered to RDMAP.  

          DDP ordering rules ensure that this will be the same order 
          that they were submitted at the Local Peer. This means that 
          when the ULP sees the Delivery of the RDMA Read Response, the 
          memory buffers targeted by the RDMA Read Response are 
          available to be accessed locally or remotely as authorized. If 
          the ULP overlaps its buffers for different operations, the 
          data from the RDMA Read Response may be overwritten by 
          subsequent RDMA Operations before the ULP receives and 
          processes the Delivery. 

      12. RDMA Read Request Messages, including zero-length RDMA Read 
          Requests, MUST NOT start processing at the Remote Peer until 
          they have been Delivered to RDMAP by DDP.  

          Note: the ULP is assured that data written can be read back. 
          For example, if an RDMA Read Request is issued by the local 
          peer, targeting the same ULP Buffer as a preceding Send or 
          RDMA Write (in the same direction as the RDMA Read Request), 
          and there are no other sources of update for the ULP Buffer, 
          then the remote peer will send back the data written by the 
          Send or RDMA Write. That is, for this example the ULP Buffer: 
          is Advertised for use on a series of RDMA Messages, is only 
          valid on the RDMAP Stream for which it is advertised, and is 
          not locally updated while the series of RDMAP Messages are 
          performed. For this example, order rule (12) assures that 
          subsequent local or remote accesses to the ULP Buffer contain 
          the data written by the Send or RDMA Write.  

          RDMA Read Response Messages MAY be generated at the Remote 
          Peer after subsequent RDMA Write Messages or Send Messages 
          have been Placed or Delivered. Therefore, when an application 
          does an RDMA Read Request followed by an RDMA Write (or Send) 
          to the same buffer, it may get the data from the later RDMA 
          Write (or Send) in the RDMA Read Response Message, even though 
          the operations completed in order at the Local Peer.  If this 
          behavior is not desired, the Local Peer ULP must Fence the 
          later RDMA Write (or Send) by withholding the RDMA Write (or 
          Send) Message until all outstanding RDMA Read Responses have 
          been Delivered. 
    
    
    
      13. The RDMAP Layer MUST submit RDMA Messages to the DDP layer in 
          the order the RDMA Operations are submitted to the RDMAP Layer 
          by the ULP. 

      14. A Send or RDMA Write Message MUST NOT be considered Complete 
          at the Local Peer (Data Source) until it has been successfully 
          completed at the DDP layer. 

      15. RDMA Operations MUST be Completed at the Local Peer in the 
          order that they were submitted by the ULP. 

      16. At the Data Sink, an incoming Send Message MUST be Delivered 
          to the ULP only after the DDP Message has been Delivered to 
          the RDMAP Layer by the DDP layer. 

      17. RDMA Read Response Message processing at the Remote Peer 
          (reading the specified Tagged Buffer) MUST be started only 
          after the RDMA Read Request Message has been Delivered by the 
          DDP layer (thus all previous RDMA Messages have been properly 
          submitted for ordered Placement). 

      18. Send Messages MAY be Completed at the Remote Peer (Data Sink) 
          before prior incoming RDMA Read Request Messages have 
          completed their response processing. 

      19. An RDMA Read operation MUST NOT be Completed at the Local Peer 
          until the DDP layer Delivers the associated incoming RDMA Read 
          Response Message. 

      20. If more than one outstanding RDMA Read Request Message is 
          supported by both peers, the RDMA Read Response Messages MUST 
          be submitted to the DDP layer on the Remote Peer in the order 
          the RDMA Read Request Messages were Delivered by DDP, but the 
          actual read of the buffer contents MAY take place in any order 
          at the Remote Peer.  

           This simplifies Local Peer Completion processing for RDMA 
           Reads in that a Delivered RDMA Read Response MUST be 
           sufficient to Complete the RDMA Read Operation. 
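
      Rules 14, 15, and 19 together imply that the Local Peer may learn
      that operations have finished in a different order than it
      surfaces their Completions. The following is a minimal sketch of
      one way to keep Completions in submission order. The
      CompletionQueue class and its method names are hypothetical and
      are not part of this specification.

```python
from collections import OrderedDict


class CompletionQueue:
    """Hypothetical Local Peer bookkeeping for rules 14, 15 and 19."""

    def __init__(self):
        self._ops = OrderedDict()  # op_id -> finished flag, in submission order
        self.completed = []        # op_ids surfaced to the ULP, in order

    def submit(self, op_id):
        """Record an RDMA Operation in the order the ULP submitted it."""
        self._ops[op_id] = False

    def mark_done(self, op_id):
        """Note that DDP finished a Send or RDMA Write, or that DDP
        Delivered the RDMA Read Response for an RDMA Read."""
        if op_id not in self._ops:
            raise KeyError(op_id)
        self._ops[op_id] = True
        # Surface Completions only from the head of the queue, so the ULP
        # sees them in submission order even if events arrive out of order.
        while self._ops:
            head, done = next(iter(self._ops.items()))
            if not done:
                break
            self._ops.popitem(last=False)
            self.completed.append(head)
```

      For example, if an RDMA Read is submitted before an RDMA Write,
      the Write may finish at the DDP layer first, but in this sketch
      its Completion is held back until the RDMA Read Response has been
      Delivered.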

    
    
    
    6  RDMAP Stream Management 

      RDMAP Stream management consists of RDMAP Stream Initialization 
      and RDMAP Stream Termination. 

   6.1  Stream Initialization 

      RDMAP Stream initialization occurs after the LLP Stream has been 
      created (e.g., for DDP/MPA over TCP the first TCP Segment after 
      the SYN, SYN/ACK exchange). The ULP is responsible for 
      transitioning the LLP Stream into RDMA enabled mode. The switch to 
      RDMA mode typically occurs sometime after LLP Stream setup. Once 
      in RDMA enabled mode, an implementation MUST send only RDMA 
      Messages across the transport Stream until the RDMAP Stream is 
      torn down.  

      For each direction of an RDMAP Stream: 

      *  For a given RDMAP Stream, the number of outstanding RDMA Read 
         Requests is limited per RDMAP Stream direction. 

      *  It is the ULP's responsibility to set the maximum number of 
         outstanding, inbound RDMA Read Requests per RDMAP Stream 
         direction.  

      *  The RDMAP Layer MUST provide the maximum number of outstanding, 
         inbound RDMA Read Requests per RDMAP Stream direction that were 
         negotiated between the ULP and the Local Peer's RDMAP Layer. 
         The negotiation mechanism is outside the scope of this 
         specification. 

      *  It is the ULP's responsibility to set the maximum number of 
         outstanding, outbound RDMA Read Requests per RDMAP Stream 
         direction. 

      *  The RDMAP Layer MUST provide the maximum number of outstanding, 
         outbound RDMA Read Requests for the RDMAP Stream direction that 
         were negotiated between the ULP and the Local Peer's RDMAP 
         Layer. The negotiation mechanism is outside the scope of this 
         specification. 

      *  The Local Peer's ULP is responsible for negotiating with the 
         Remote Peer's ULP the maximum number of outstanding RDMA Read 
         Requests for the RDMAP Stream direction. It is recommended that 
         the ULP set the maximum number of outstanding, inbound RDMA 
         Read Requests equal to the maximum number of outstanding, 
         outbound RDMA Read Requests for a given RDMAP Stream direction. 

      *  For outbound RDMA Read Requests, the RDMAP Layer MUST NOT 
         exceed the maximum number of outstanding, outbound RDMA Read 
         Requests that were negotiated between the ULP and the Local 
         Peer's RDMAP Layer. 

      *  For inbound RDMA Read Requests, the RDMAP Layer MUST NOT exceed 
         the maximum number of outstanding, inbound RDMA Read Requests 
         that were negotiated between the ULP and the Local Peer's RDMAP 
         Layer. 
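
      The limit on outstanding RDMA Read Requests can be pictured as a
      simple counter per RDMAP Stream direction. The sketch below
      assumes the limit has already been agreed between the ULP and the
      RDMAP Layer; the ReadRequestLimiter class and its method names
      are hypothetical.

```python
class ReadRequestLimiter:
    """Hypothetical per-direction cap on outstanding RDMA Read Requests."""

    def __init__(self, max_outstanding: int):
        if max_outstanding < 0:
            raise ValueError("limit must be non-negative")
        self.max_outstanding = max_outstanding
        self.outstanding = 0

    def try_issue(self) -> bool:
        """Account for one new RDMA Read Request if the limit allows it."""
        if self.outstanding >= self.max_outstanding:
            return False  # caller must wait for an earlier Read to finish
        self.outstanding += 1
        return True

    def response_delivered(self) -> None:
        """Release one slot when an RDMA Read Response has been Delivered."""
        if self.outstanding == 0:
            raise RuntimeError("no outstanding RDMA Read Request")
        self.outstanding -= 1
```

      A queue of deferred requests, which a real implementation would
      likely need, is left out to keep the sketch short.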

       

   6.2  Stream Teardown 

      There are three methods for terminating an RDMAP Stream: ULP 
      Graceful Termination, RDMAP Abortive Termination, and LLP Abortive 
      Termination.  

      The ULP is responsible for performing ULP Graceful Termination. 
      After a ULP Graceful Termination, either side of the Stream can 
      initiate LLP Graceful Termination, using the graceful termination 
      mechanism provided by the LLP. 

      RDMAP Abortive Termination allows the RDMAP to issue a Terminate 
      Message describing the reason the RDMAP Stream was terminated. The 
      next section (6.2.1 RDMAP Abortive Termination) describes the 
      RDMAP Abortive Termination in detail. 

      LLP Abortive Termination results from an LLP error and causes the 
      RDMAP Stream to be torn down midstream, without an RDMAP Terminate 
      Message.  While this last method is highly undesirable, it is 
      possible and the ULP should take this into consideration.  

   6.2.1  RDMAP Abortive Termination 

      RDMAP defines a Terminate operation that SHOULD be invoked when 
      either an RDMAP error is encountered or an LLP error is surfaced to 
      the RDMAP layer by the LLP.  
    
    
    
      It is not always possible to send the Terminate Message. For 
      example, certain LLP errors may occur that cause the LLP Stream to 
      be torn down a) before RDMAP is aware of the error, b) before 
      RDMAP is able to send the Terminate Message, or c) after RDMAP has 
      posted the Terminate Message to the LLP, but before the LLP has 
      transmitted it.  

      Note that an RDMAP Abortive Termination may entail loss of data. 
      In general, when a Terminate Message is received it is impossible 
      to tell for sure what unacknowledged RDMA Messages were Completed 
      successfully at the Remote Peer. Thus the state of all outstanding 
      RDMA Messages is indeterminate and the Messages SHOULD be 
      considered Completed in error. 

      When a peer sends or receives a Terminate Message, it MAY 
      immediately tear down the LLP Stream. The peer SHOULD perform a 
      graceful LLP teardown to ensure the Terminate Message is 
      successfully Delivered. 

      See section 4.8 Terminate Header for a description of the 
      Terminate Message and its contents. See section 5.4 Terminate 
      Message for a description of the Terminate Message semantics. 

    
    
    
    7  RDMAP Error Management 

      The RDMAP protocol does not have RDMAP or DDP layer error recovery 
      operations built in.  If everything is working, the LLP guarantees 
      ensure that Messages arrive at the destination. 

      If errors are detected at the RDMAP or DDP layer, then the RDMAP, 
      DDP and LLP Streams are Abortively Terminated (see section 4.8 
      Terminate Header on page 34). 

      In general, poor implementations or improper ULP programming cause 
      the errors detected at the RDMAP and DDP layers.  In these cases, 
      returning a diagnostic termination error Message and closing the 
      RDMAP Stream is far simpler than attempting to maintain the RDMAP 
      Stream, particularly when the cause of the error is not known.  

      If an LLP does not support teardown of a Stream independent of 
      other Streams and an RDMAP error results in the Termination of a 
      specific Stream, then the LLP MUST label the Stream as an 
      erroneous Stream and MUST NOT allow any further data transfer on 
      that Stream after RDMAP requests the Stream to be torn down. 

      For a specific LLP connection, when all Streams are either 
      gracefully torn down or are labeled as erroneous Streams, the LLP 
      connection MUST be torn down. 

      Since errors are detected at the Remote Peer (possibly long) after 
      RDMA Messages are passed to DDP and the LLP at the Local Peer and 
      Completed, the sender cannot easily determine which of its 
      Messages have been received. (RDMA Reads are an exception to this 
      rule). 

      For a list of errors returned to the Remote Peer as a result of an 
      Abortive Termination, see section 4.8 Terminate Header on page 34. 

   7.1  RDMAP Error Surfacing 

      If an error occurs at the Local Peer, the RDMAP layer MUST attempt 
      to inform the local ULP that the error has occurred. 

      The Local Peer MUST send a Terminate Message for each of the 
      following cases: 

    
    
    
      1.  For errors detected while creating RDMA Write, Send, Send with 
          Invalidate, Send with Solicited Event, Send with Solicited 
          Event and Invalidate, or RDMA Read Requests, or other reasons 
          not directly associated with an incoming Message, the 
          Terminate Message and Error code are sent instead of the 
          request.  In this case, the Error Type and Error Code fields 
          are included in the Terminate Message, but the Terminated DDP 
          Header and Terminated RDMA Header fields are set to zero. 

      2.  For errors detected on an incoming RDMA Write, Send, Send with 
          Invalidate, Send with Solicited Event, Send with Solicited 
          Event and Invalidate, or Read Response Message (after the 
          Message has been Delivered by DDP), the Terminate Message is 
          sent at the earliest possible opportunity, preferably in the 
          next outgoing RDMA Message. In this case, the Error Type, 
          Error Code, ULP PDU Length, and Terminated DDP Header fields 
          are included in the Terminate Message, but the Terminated RDMA 
          Header field is set to zero. 

      3.  For errors detected on an incoming RDMA Read Request Message 
          (after the Message has been Delivered by DDP), the Terminate 
          Message is sent at the earliest possible opportunity, 
          preferably in the next outgoing RDMA Message. In this case, 
          the Error Type, Error Code, ULP PDU Length, Terminated DDP 
          Header, and Terminated RDMA Header fields are included in the 
          Terminate Message. 

      4.  If more than one error is detected on incoming RDMA Messages, 
          before the Terminate Message can be sent, then the first RDMA 
          Message (and its associated DDP Segment) that experienced an 
          error MUST be captured by the Terminate Message in accordance 
          with rules 2 and 3 above. 
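
      Cases 1 through 3 differ only in which Terminate Message fields
      carry information. The following minimal sketch restates that
      mapping. The ErrorCase names and the terminate_fields function
      are illustrative labels chosen here, not identifiers defined by
      this specification.

```python
from enum import Enum


class ErrorCase(Enum):
    # Illustrative labels for cases 1-3; not defined by the draft.
    LOCAL_REQUEST = 1          # error while creating an outgoing request
    INCOMING_MESSAGE = 2       # incoming Write, Send variant, or Read Response
    INCOMING_READ_REQUEST = 3  # incoming RDMA Read Request


def terminate_fields(case: ErrorCase) -> list:
    """Return the Terminate Message fields that are included for a case."""
    fields = ["Error Type", "Error Code"]
    if case in (ErrorCase.INCOMING_MESSAGE, ErrorCase.INCOMING_READ_REQUEST):
        fields += ["ULP PDU Length", "Terminated DDP Header"]
    if case is ErrorCase.INCOMING_READ_REQUEST:
        fields.append("Terminated RDMA Header")
    return fields
```

      Fields that are not returned for a given case correspond to the
      fields that the rules above describe as set to zero or do not
      mention.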

   7.2  Errors Detected at the Remote Peer on Incoming RDMA Messages 

      On incoming RDMA Writes, RDMA Read Response, Sends, Send with 
      Invalidate, Send with Solicited Event, Send with Solicited Event 
      and Invalidate, and Terminate Messages, the following must be 
      validated: 

      1.  The DDP Layer MUST validate all DDP Segment fields. 

      2.  The RDMA OpCode MUST be valid. 
    
    
    
      3.  The RDMA Version MUST be valid. 

      Additionally, on incoming Send with Invalidate and Send with 
      Solicited Event and Invalidate Messages, the following must 
      also be validated: 

      4.  The Invalidate STag MUST be valid. 

      5.  The STag MUST be associated with this RDMAP Stream. 

      On incoming RDMA Read Request Messages, the following must be 
      validated: 

      1.  The DDP Layer MUST validate all Untagged DDP Segment fields. 

      2.  The RDMA OpCode MUST be valid. 

      3.  The RDMA Version MUST be valid. 

      4.  For non-zero length RDMA Read Request Messages: 

          a.  The Data Source STag MUST be valid. 

          b.  The Data Source STag MUST be associated with this RDMAP 
              Stream. 

          c.  The Data Source Tagged Offset MUST fall in the range of 
              legal offsets associated with the Data Source STag. 

          d.  The sum of the Data Source Tagged Offset and the RDMA Read 
              Message Size MUST fall in the range of legal offsets 
              associated with the Data Source STag. 

          e.  The sum of the Data Source Tagged Offset and the RDMA Read 
              Message Size MUST NOT cause the Data Source Tagged Offset 
              to wrap. 
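
      Checks 4a through 4e for non-zero length RDMA Read Request
      Messages can be sketched as follows. The Region record, the
      validate_read_request function, and the reason strings are
      hypothetical. The sketch assumes that Tagged Offsets are 64-bit
      unsigned values, treats a sum that does not fit in 64 bits as a
      wrap, and allows the sum of offset and size to equal the end of
      the registered range; an implementation may define these
      boundaries differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical record of the buffer registered under one STag."""
    stream_id: int  # RDMAP Stream the STag is associated with
    base_to: int    # first legal Tagged Offset
    length: int     # number of bytes covered by the STag


TO_MODULUS = 1 << 64  # assumption: Tagged Offsets are 64-bit unsigned


def validate_read_request(regions, stream_id, stag, offset, size):
    """Return None if checks 4a-4e pass, else a short reason string."""
    if size == 0:
        return None  # checks 4a-4e apply to non-zero lengths only
    region = regions.get(stag)
    if region is None:
        return "invalid STag"                     # 4a
    if region.stream_id != stream_id:
        return "STag not associated with Stream"  # 4b
    end = region.base_to + region.length
    if not (region.base_to <= offset < end):
        return "offset out of range"              # 4c
    if offset + size >= TO_MODULUS:
        return "offset wrap"                      # 4e
    if offset + size > end:
        return "offset plus size out of range"    # 4d
    return None
```

      The wrap check is made before the range check here only so that
      the sketch reports the more specific reason; Python integers do
      not overflow, so the order does not affect correctness.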

       

    
    
    
    8  Security Considerations 

      This section references the resources that discuss protocol-
      specific security considerations and implications of using RDMAP 
      with existing security services. A detailed analysis of the 
      security issues around implementation and use of the RDMAP can be 
      found in [RDMASEC]. 

      [RDMASEC] introduces the RDMA reference model and discusses how 
      the resources of this model are vulnerable to attacks and the 
      types of attack these vulnerabilities are subject to. It also 
      details the levels of Trust available in this peer-to-peer model 
      and how this defines the nature of resource sharing.  

      The IPsec requirements for RDDP are based on the version of IPsec 
      specified in RFC 2401 [RFC 2401] and related RFCs, as profiled by 
      RFC 3723 [RFC 3723], despite the existence of a newer version of 
      IPsec specified in RFC 4301 [RFC 4301] and related RFCs.  One of 
      the important early applications of the RDDP protocols is their 
      use with iSCSI [iSER]; RDDP's IPsec requirements follow those of 
      IPsec in order to facilitate that usage by allowing a common 
      profile of IPsec to be used with iSCSI and the RDDP protocols.  In 
      the future, RFC 3723 may be updated to the newer version of IPsec; 
      the IPsec security requirements of any such update should apply 
      uniformly to iSCSI and the RDDP protocols. 

       

   8.1  Summary of RDMAP specific Security Requirements 

      [RDMASEC] defines the security requirements for the implementation 
      of the components of the RDMA reference model, namely the RDMA 
      enabled NIC (RNIC) and the Privileged Resource Manager. An RDMAP 
      implementation conforming to this specification MUST conform to 
      these requirements.  

   8.1.1  RDMAP (RNIC) Requirements 

      RDMAP provides several countermeasures for all types of attacks as 
      introduced in [RDMASEC]. In the following, this specification 
      lists all security requirements which MUST be implemented by the 
      RNIC. A more detailed discussion of RNIC security requirements can 
      be found in Section 5 of [RDMASEC].  
    
    
    
       

      1.  An RNIC MUST ensure that a specific Stream in a specific 
          Protection Domain cannot access an STag in a different 
          Protection Domain.  

      2.  An RNIC MUST ensure that if an STag is limited in scope to a 
          single Stream, no other Stream can use the STag.  

      3.  An RNIC MUST ensure that a Remote Peer is not able to access 
          memory outside of the buffer specified when the STag was 
          enabled for remote access.  

      4.  An RNIC MUST provide a mechanism for the ULP to establish and 
          revoke the association of a ULP Buffer to an STag and TO 
          range.  

      5.  An RNIC MUST provide a mechanism for the ULP to establish and 
          revoke read, write, or read and write access to the ULP Buffer 
          referenced by an STag.  

    
      6.  An RNIC MUST ensure that the network interface can no longer 
          modify an advertised buffer after the ULP revokes remote 
          access rights for an STag.  

    
      7.  An RNIC MUST ensure that a Remote Peer is not able to 
          invalidate an STag enabled for remote access, if the STag is 
          shared on multiple streams.  

    
      8.  An RNIC MUST choose the value of STags in a way difficult to 
          predict. It is RECOMMENDED to sparsely populate them over the 
          full available range.   

    
      9.  An RNIC MUST NOT enable sharing a CQ across ULPs that do not 
          share partial mutual trust.  

    
      10. An RNIC MUST ensure that if a CQ overflows, any Streams that 
          do not use the CQ remain unaffected.  
    
    
    
    
      11. An RNIC implementation SHOULD provide a mechanism to cap the 
          number of outstanding RDMA Read Requests.  

    
      12. An RNIC MUST NOT enable firmware to be loaded on the RNIC 
          directly from an untrusted Local Peer or Remote Peer, unless 
          the Peer is properly authenticated (by a mechanism outside the 
          scope of this specification; the mechanism presumably entails 
          authenticating that the remote ULP has the right to perform 
          the update), and the update is done via a secure protocol, 
          such as IPsec. 
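
      Requirement 8 asks for STag values that are difficult to predict
      and sparsely spread over the available range. One possible
      approach, shown here only as a sketch, is to draw each value from
      a cryptographically strong random source and retry on collision.
      The allocate_stag function is hypothetical, and the 32-bit STag
      width is an assumption taken from the DDP header format rather
      than from this section.

```python
import secrets

STAG_BITS = 32  # assumption: STags are 32-bit values


def allocate_stag(in_use: set) -> int:
    """Pick an unused STag uniformly at random and record it in in_use."""
    if len(in_use) >= (1 << STAG_BITS):
        raise RuntimeError("STag space exhausted")
    while True:
        candidate = secrets.randbits(STAG_BITS)
        if candidate not in in_use:
            in_use.add(candidate)
            return candidate
```

      Random selection gives the sparse population that the
      requirement recommends, at the cost of tracking which values are
      in use. An RNIC that encodes an index into part of the STag
      would need a different scheme.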

    
   8.1.2  Privileged Resource Manager Requirements 

      With RDMAP, all reservations of local resources are initiated from 
      local ULPs. To protect against local attacks, including unfair 
      resource distribution and unauthorized access to RNIC resources, a 
      Privileged Resource Manager (PRM) must be implemented, which 
      manages all local resource allocation. Note that the PRM need not 
      be provided as an independent component; its functionality can 
      also be implemented as part of the privileged ULP or as part of 
      the RNIC itself. 

      A PRM implementation must meet the following security 
      requirements (a more detailed discussion of PRM security 
      requirements can be found in Section 5 of [RDMASEC]): 

      1.  All Non-Privileged ULP interactions with the RNIC Engine that 
          could affect other ULPs MUST be done using the Resource 
          Manager as a proxy.  

      2.  All ULP resource allocation requests for scarce resources MUST 
          also be done using a Privileged Resource Manager.  

      3.  The Privileged Resource Manager MUST NOT assume different ULPs 
          share Partial Mutual Trust unless there is a mechanism to 
          ensure that the ULPs do indeed share partial mutual trust.  

      4.  If Non-Privileged ULPs are supported, the Privileged Resource 
          Manager MUST verify that the Non-Privileged ULP has the right 
          to access a specific Data Buffer before allowing an STag for 
          which the ULP has access rights to be associated with a 
          specific Data Buffer.  

      5.  The Privileged Resource Manager MUST control the allocation of 
          CQ entries.  

      6.  The Privileged Resource Manager SHOULD prevent a Local Peer 
          from allocating more than its fair share of resources.  

      7.  RDMA Read Request Queue resource consumption MUST be 
          controlled by the Privileged Resource Manager such that 
          RDMAP/DDP Streams which do not share Partial Mutual Trust do 
          not share RDMA Read Request Queue resources.  

      8.  If an RNIC provides the ability to share receive buffers 
          across multiple Streams, the combination of the RNIC and the 
          Privileged Resource Manager MUST be able to detect if the 
          Remote Peer is attempting to consume more than its fair share 
          of resources so that the Local Peer can apply countermeasures 
          to detect and prevent the attack. 
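
      Requirements 5 and 6 can be pictured as a quota check made by the
      Privileged Resource Manager before it grants CQ entries. The
      sketch below is one possible policy, not the only one: the
      QuotaManager class, its method names, and the fixed per-ULP limit
      used to stand in for a "fair share" are all assumptions.

```python
class QuotaManager:
    """Hypothetical stand-in for a Privileged Resource Manager."""

    def __init__(self, total_entries: int, per_ulp_limit: int):
        self.free = total_entries           # CQ entries not yet granted
        self.per_ulp_limit = per_ulp_limit  # assumed definition of fair share
        self.granted = {}                   # ULP name -> CQ entries held

    def allocate(self, ulp: str, count: int) -> bool:
        """Grant count CQ entries to ulp if supply and quota allow it."""
        held = self.granted.get(ulp, 0)
        if count <= 0 or count > self.free:
            return False
        if held + count > self.per_ulp_limit:
            return False  # would exceed this ULP's share
        self.granted[ulp] = held + count
        self.free -= count
        return True

    def release(self, ulp: str, count: int) -> None:
        """Return count CQ entries previously granted to ulp."""
        held = self.granted.get(ulp, 0)
        if count <= 0 or count > held:
            raise ValueError("release exceeds allocation")
        self.granted[ulp] = held - count
        self.free += count
```

      A real Privileged Resource Manager would also have to decide
      which ULPs share Partial Mutual Trust, which this sketch does
      not model.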

    
   8.2  Security Services for RDMAP  

      RDMAP uses IP-based network services to control, read, and write 
      data buffers over the network. Therefore, all exchanged control 
      and data packets are vulnerable to spoofing, tampering, and 
      information disclosure attacks.  

      RDMAP Streams that are subject to impersonation attacks, or 
      Stream hijacking attacks, can be authenticated, have their 
      integrity protected, and be protected from replay attacks. 
      Furthermore, confidentiality protection can be used to protect 
      from eavesdropping.  

    
   8.2.1  Available Security Services  

    
      The IPsec protocol suite [RFC2401] defines strong countermeasures 
      to protect an IP stream from those attacks. Several levels of 
      protection can guarantee session confidentiality, per-packet 
      source authentication, per-packet integrity and correct packet 
      sequencing.  

       

      RDMAP security may also profit from SSL or TLS security services 
      provided for TCP based ULPs [RFC4346]. Used underneath RDMAP, 
      these security services also provide for stream authentication, 
      data integrity and confidentiality. As discussed in [RDMASEC], 
      limitations on the maximum packet length to be carried over the 
      network and potentially inefficient out-of-order packet processing 
      at the data sink make SSL and TLS less appropriate for RDMAP than 
      IPsec.  

      If SSL is layered on top of RDMAP, SSL does not protect the RDMAP 
      headers. Thus, a man-in-the-middle attack can still occur by 
      modifying the RDMAP header to incorrectly place the data into the 
      wrong buffer, thus effectively corrupting the data stream.  

      By remaining independent of ULP and LLP security protocols, RDMAP 
      will benefit from continuing improvements at those layers. Users 
      are provided flexibility to adapt to their specific security 
      requirements and the ability to adapt to future security 
      challenges. Given this, the vulnerabilities of RDMAP to active 
      third-party interference are no greater than those of any other 
      protocol running over an LLP such as TCP or SCTP.  

    
   8.2.2  Requirements for IPsec Services for RDMAP  

    
      Because IPsec is designed to secure arbitrary IP packet streams, 
      including streams where packets are lost, RDMAP can run on top of 
      IPsec without any change. IPsec packets are processed (e.g., 
      integrity checked and possibly decrypted) in the order they are 
      received, and an RDMAP Data Sink will process the decrypted RDMA 
      Messages contained in these packets in the same manner as RDMA 
      Messages contained in unsecured IP packets.  

    

      The IP Storage working group has defined the normative IPsec 
      requirements for IP Storage [RFC3723]. Portions of this 
      specification are applicable to the RDMAP. In particular, a 
      compliant implementation of IPsec services for RDMAP MUST meet 
      the requirements as outlined in Section 2.3 of [RFC3723]. Without 
      replicating the detailed discussion in [RFC3723], this includes 
      the following requirements: 

    
      1.  The implementation MUST support IPsec ESP [RFC2406], as well 
          as the replay protection mechanisms of IPsec. When ESP is 
          utilized, per-packet data origin authentication, integrity and 
          replay protection MUST be used.  

      2.  It MUST support ESP in tunnel mode and MAY implement ESP in 
          transport mode.  

      3.  It MUST support IKE [RFC2409] for peer authentication, 
          negotiation of security associations, and key management, 
          using the IPsec DOI [RFC2407].   

      4.  It MUST NOT interpret the receipt of an IKE Phase 2 delete 
          message as a reason for tearing down the RDMAP stream. Since 
          IPsec acceleration hardware may only be able to handle a 
          limited number of active IKE Phase 2 SAs, idle SAs may be 
          dynamically brought down and a new SA brought up again if 
          activity resumes.   

      5.  It MUST support peer authentication using a pre-shared key, 
          and MAY support certificate-based peer authentication using 
          digital signatures. Peer authentication using the public key  
          encryption methods [RFC2409] SHOULD NOT be used.  

      6.  It MUST support IKE Main Mode and SHOULD support Aggressive 
          Mode. IKE Main Mode with pre-shared key authentication SHOULD 
          NOT be used when either of the peers uses a dynamically 
          assigned IP address.  

      7.  When digital signatures are used to achieve authentication, 
          either IKE Main Mode or IKE Aggressive Mode MAY be used. In 
          these cases, an IKE negotiator SHOULD use IKE Certificate 
          Request Payload(s) to specify the certificate authority (or 
          authorities) that are trusted in accordance with its local 
          policy. IKE negotiators SHOULD check the pertinent Certificate 
          Revocation List (CRL) before accepting a PKI certificate for 
          use in IKE's authentication procedures. 
           

      8.  Access to locally stored secret information (pre-shared or 
          private key for digital signing) must be suitably restricted, 
          since compromise of the secret information nullifies the 
          security properties of the IKE/IPsec protocols.  

      9.  It MUST follow the guidelines of Section 2.3.4 of [RFC3723] on 
          the setting of IKE parameters to achieve a high level of 
          interoperability without requiring extensive configuration. 

       Furthermore, implementation and deployment of the IPsec services 
      for RDDP should follow the Security Considerations outlined in 
      Section 5 of [RFC3723].  
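
      The MUST-level items in the list above can be expressed as a 
      simple capability check. The following Python sketch is 
      illustrative only: the IpsecProfile class, its field names, and 
      check_profile are hypothetical and are not defined by this 
      specification or by [RFC3723]. It covers only the capability 
      requirements of items 1, 2, 3, 5, and 6; behavioral requirements 
      such as item 4 are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IpsecProfile:
    """Hypothetical summary of what an IPsec implementation supports."""
    esp: bool = True
    replay_protection: bool = True
    tunnel_mode: bool = True
    transport_mode: bool = False   # optional (MAY) per item 2
    ike: bool = True
    psk_auth: bool = True
    main_mode: bool = True
    aggressive_mode: bool = True   # recommended (SHOULD) per item 6


def check_profile(p: IpsecProfile) -> List[str]:
    """Return the MUST-level items that the profile fails to meet."""
    failures = []
    if not (p.esp and p.replay_protection):
        failures.append("item 1: ESP with replay protection")
    if not p.tunnel_mode:
        failures.append("item 2: ESP tunnel mode")
    if not p.ike:
        failures.append("item 3: IKE")
    if not p.psk_auth:
        failures.append("item 5: pre-shared key authentication")
    if not p.main_mode:
        failures.append("item 6: IKE Main Mode")
    return failures
```

      An empty result means that no MUST-level capability in the 
      sketch is missing; it does not by itself establish compliance 
      with Section 2.3 of [RFC3723].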

    
    
    
    9  IANA Considerations 

      This document requests no direct action from IANA.  The following 
      consideration is listed here as commentary. 

      If RDMAP were enabled a priori for a ULP by connecting to a 
      well-known port, that port would be registered for RDMAP with 
      IANA. Registration of the well-known port is the responsibility 
      of the ULP specification. 

    
    
    
    10 References 

   10.1 Normative References 

      [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate    
          Requirement Levels", BCP 14, RFC 2119, March 1997. 

      [RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security 
          Payload (ESP)", RFC 2406, November 1998. 

      [RFC2407] Piper, D., "The Internet IP Security Domain of 
          Interpretation of ISAKMP", RFC 2407, November 1998. 

      [RFC2409] Harkins, D. and D. Carrel, "The Internet Key Exchange 
          (IKE)", RFC 2409, November 1998. 

      [RFC3723] Aboba, B., et al., "Securing Block Storage Protocols 
          over IP", RFC 3723, April 2004. 

      [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 
          Internet Protocol", RFC 4301, December 2005. 

      [VERBS] J. Hilland, "RDMA Protocol Verbs Specification", 
          draft-hilland-iwarp-verbs-v1.0, RDMA Consortium, April 2003.  

      [DDP] H. Shah et al., "Direct Data Placement over Reliable 
          Transports", draft-ietf-rddp-ddp-07.txt, September 2006. 

      [MPA] P. Culley et al., "Marker PDU Aligned Framing for TCP 
          Specification", draft-ietf-rddp-mpa-06.txt, September 2006. 

      [SCTP] R. Stewart et al., "Stream Control Transmission Protocol", 
          RFC 2960, October 2000. 

      [TCP] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, 
          September 1981. 

      [RDMASEC]  J. Pinkerton et al., "DDP/RDMAP Security", draft-ietf-
          rddp-security-09.txt, March 2005. 

    
    
    
      [iSER] M. Ko et al., "iSCSI Extensions for RDMA Specification", 
          Internet-Draft, draft-ietf-ips-iser-05.txt, Work in Progress, 
          October 2005. 

   10.2 Informative References 

      [RFC2401] Kent, S. and R. Atkinson, "Security Architecture for the 
          Internet Protocol", RFC 2401, November 1998. 

      [RFC4346] Dierks, T. and E. Rescorla, "The Transport Layer 
          Security (TLS) Protocol Version 1.1", RFC 4346, April 2006. 

       

       

    
    
    
    11 Appendix 

   11.1 DDP Segment Formats for RDMA Messages 

      This appendix is for information only and is NOT part of the 
      standard. It simply depicts the DDP Segment format for the various 
      RDMA Messages. 

   11.1.1 DDP Segment for RDMA Write 

      The following figure depicts an RDMA Write, DDP Segment:  

        0                   1                   2                   3    
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
                                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                                       |   DDP Control | RDMA Control  | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                       Data Sink STag                          | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                   Data Sink Tagged Offset                     | 
       +                                                               + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                   RDMA Write ULP Payload                      | 
       //                                                             // 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 11 RDMA Write, DDP Segment format 
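
      As an informal illustration of the layout in Figure 11, the 
      following Python sketch packs and unpacks the 14-octet header 
      that precedes the RDMA Write ULP Payload. The names 
      TAGGED_HEADER, pack_tagged_header, and unpack_tagged_segment are 
      hypothetical. The two control octets are treated as opaque 
      values here; their bit layout is defined in the body of the 
      specification and is not reproduced in this sketch.

```python
import struct

# Field widths taken from the figure: two control octets, a 32-bit
# Data Sink STag and a 64-bit Data Sink Tagged Offset, all in network
# byte order ("!" also disables padding).
TAGGED_HEADER = struct.Struct("!BBIQ")


def pack_tagged_header(ddp_ctrl, rdma_ctrl, stag, tagged_offset):
    """Build the header octets for a tagged segment."""
    return TAGGED_HEADER.pack(ddp_ctrl, rdma_ctrl, stag, tagged_offset)


def unpack_tagged_segment(segment):
    """Return (ddp_ctrl, rdma_ctrl, stag, tagged_offset, payload)."""
    fields = TAGGED_HEADER.unpack_from(segment)
    payload = segment[TAGGED_HEADER.size:]
    return fields + (payload,)
```

      The same field widths appear in the RDMA Read Response segment 
      shown later in this appendix, so the sketch applies to that 
      layout as well.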

   11.1.2 DDP Segment for RDMA Read Request 

      The following figure depicts an RDMA Read Request, DDP Segment: 

    
    
    
        0                   1                   2                   3    
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
                                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                                       |  DDP Control  | RDMA Control  | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                      Reserved (Not Used)                      | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |              DDP (RDMA Read Request) Queue Number             | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |        DDP (RDMA Read Request) Message Sequence Number        | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |             DDP (RDMA Read Request) Message Offset            | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                     Data Sink STag (SinkSTag)                 | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                                                               | 
       +                  Data Sink Tagged Offset (SinkTO)             + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                  RDMA Read Message Size (RDMARDSZ)            | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                     Data Source STag (SrcSTag)                | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                                                               | 
       +                 Data Source Tagged Offset (SrcTO)             + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  
      Figure 12 RDMA Read Request, DDP Segment format 
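
      The fixed layout in Figure 12 lends itself to a single 
      fixed-size decode. The following Python sketch is a minimal 
      example under that assumption; READ_REQUEST, ReadRequest, and 
      parse_read_request are hypothetical names, and the control 
      octets are again treated as opaque values.

```python
import struct
from collections import namedtuple

# Two control octets, five 32-bit words (Reserved, Queue Number,
# Message Sequence Number, Message Offset, SinkSTag), a 64-bit SinkTO,
# two 32-bit words (RDMARDSZ, SrcSTag) and a 64-bit SrcTO.
READ_REQUEST = struct.Struct("!BBIIIIIQIIQ")

ReadRequest = namedtuple(
    "ReadRequest",
    "ddp_ctrl rdma_ctrl reserved qn msn mo "
    "sink_stag sink_to rdmardsz src_stag src_to",
)


def parse_read_request(segment):
    """Decode the fields of an RDMA Read Request segment."""
    return ReadRequest._make(READ_REQUEST.unpack_from(segment))
```

      The structure is 46 octets long: 2 control octets, 16 octets of 
      untagged DDP header fields, and 28 octets of RDMA Read Request 
      fields.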

       

       

       

       

       

       

    
    
    
   11.1.3 DDP Segment for RDMA Read Response 

      The following figure depicts an RDMA Read Response, DDP Segment:  

        0                   1                   2                   3    
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
                                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                                       |  DDP Control  | RDMA Control  | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                       Data Sink STag                          | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                   Data Sink Tagged Offset                     | 
       +                                                               + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                RDMA Read Response ULP Payload                 | 
       //                                                             // 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 13 RDMA Read Response, DDP Segment format 

   11.1.4 DDP Segment for Send and Send with Solicited Event 

      The following figure depicts a Send and Send with Solicited 
      Event, DDP Segment: 

        0                   1                   2                   3    
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
                                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                                       |  DDP Control  | RDMA Control  | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                      Reserved (Not Used)                      | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                       (Send) Queue Number                     | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                 (Send) Message Sequence Number                | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                      (Send) Message Offset                    | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                       Send ULP Payload                        | 
       //                                                             // 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
    
    
                           Expires January, 2007               [Page 69] 
    

   Internet-Draft        RDMA Protocol Specification     September 2006 
    
      Figure 14 Send and Send with Solicited Event, DDP Segment format 

   11.1.5 DDP Segment for Send with Invalidate and Send with SE and 
          Invalidate 

      The following figure depicts a Send with Invalidate and Send with 
      Solicited Event and Invalidate, DDP Segment: 

        0                   1                   2                   3    
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
                                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                                       |   DDP Control | RDMA Control  | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                         Invalidate STag                       | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                       (Send) Queue Number                     | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                 (Send) Message Sequence Number                | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                      (Send) Message Offset                    | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                       Send ULP Payload                        | 
       //                                                             // 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 15 Send with Invalidate and Send with SE and Invalidate, 
      DDP Segment format 
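
      Figures 14 and 15 share one header layout; they differ only in 
      whether the first 32-bit word is Reserved or carries the 
      Invalidate STag. The following Python sketch illustrates this. 
      UNTAGGED_HEADER and parse_send_segment are hypothetical names, 
      and the has_invalidate_stag flag is supplied by the caller as a 
      simplification; an implementation would derive it from the RDMA 
      Control octet.

```python
import struct

# Two control octets followed by four 32-bit words, network byte order.
UNTAGGED_HEADER = struct.Struct("!BBIIII")


def parse_send_segment(segment, has_invalidate_stag):
    """Split a Send-type segment into header fields and ULP payload."""
    ddp_ctrl, rdma_ctrl, word, qn, msn, mo = UNTAGGED_HEADER.unpack_from(segment)
    return {
        "ddp_ctrl": ddp_ctrl,
        "rdma_ctrl": rdma_ctrl,
        # Reserved in Figure 14, Invalidate STag in Figure 15.
        "invalidate_stag": word if has_invalidate_stag else None,
        "qn": qn,
        "msn": msn,
        "mo": mo,
        "payload": segment[UNTAGGED_HEADER.size:],
    }
```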

       

       

       

       

       

       

       

    
    
    
   11.1.6 DDP Segment for Terminate 

      The following figure depicts a Terminate, DDP Segment: 

        0                   1                   2                   3    
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1  
                                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
                                       |   DDP Control | RDMA Control  | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                      Reserved (Not Used)                      | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                   DDP (Terminate) Queue Number                | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |             DDP (Terminate) Message Sequence Number           | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                  DDP (Terminate) Message Offset               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |       Terminate Control             |      Reserved           | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |  DDP Segment Length (if any)  |                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               + 
       |                                                               | 
       +                                                               + 
       |                 Terminated DDP Header (if any)                | 
       +                                                               + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
       |                                                               | 
       //                                                             // 
       |                 Terminated RDMA Header (if any)               | 
       +                                                               + 
       |                                                               | 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
      Figure 16 Terminate, DDP Segment format 
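
      In Figure 16 the Terminate Control field occupies the upper 19 
      bits of a 32-bit word, with 13 Reserved bits below it. The 
      following Python sketch shows one way to separate them. 
      TERMINATE_FIXED and parse_terminate are hypothetical names. As a 
      simplification, the sketch treats the DDP Segment Length as 
      present whenever at least two more octets follow the fixed 
      part; it does not interpret the Terminate Control field to 
      decide which optional headers are present.

```python
import struct

# Two control octets, then Reserved, Queue Number, Message Sequence
# Number, Message Offset, and the word that holds Terminate Control.
TERMINATE_FIXED = struct.Struct("!BBIIIII")


def parse_terminate(segment):
    """Decode the fixed part of a Terminate segment."""
    (ddp_ctrl, rdma_ctrl, _reserved,
     qn, msn, mo, word) = TERMINATE_FIXED.unpack_from(segment)
    rest = segment[TERMINATE_FIXED.size:]
    seg_len = struct.unpack_from("!H", rest)[0] if len(rest) >= 2 else None
    return {
        "ddp_ctrl": ddp_ctrl,
        "rdma_ctrl": rdma_ctrl,
        "qn": qn,
        "msn": msn,
        "mo": mo,
        "terminate_control": word >> 13,   # upper 19 bits
        "ddp_segment_length": seg_len,     # None if nothing follows
        "terminated_headers": rest[2:],    # left undecoded here
    }
```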

   11.2 Ordering and Completion Table 

      The following table summarizes the ordering relationships that are 
      defined in Section 5.5, Ordering and Completions, from the 
      standpoint of the local peer issuing the two Operations. In the 
      table that follows, Send includes Send, Send with Invalidate, 
      Send with Solicited Event, and Send with Solicited Event and 
      Invalidate. 

   ------+-------+----------------+----------------+---------------- 
   First | Later | Placement      | Placement      | Ordering 
    Op   | Op    | guarantee at   | guarantee at   | guarantee at 
         |       | Remote Peer    | Local Peer     | Remote Peer 
         |       |                |                |  
   ------+-------+----------------+----------------+---------------- 
   Send  | Send  | No placement   | Not applicable | Completed in 
         |       | guarantee. If  |                | order. 
         |       | guarantee is   |                |  
         |       | necessary, see |                |  
         |       | footnote 1.    |                |  
   ------+-------+----------------+----------------+---------------- 
   Send  | RDMA  | No placement   | Not applicable | Not applicable 
         | Write | guarantee. If  |                |  
         |       | guarantee is   |                |  
         |       | necessary, see |                |  
         |       | footnote 1.    |                |  
   ------+-------+----------------+----------------+---------------- 
   Send  | RDMA  | No placement   | RDMA Read      | RDMA Read 
         | Read  | guarantee      | Response       | Response 
         |       | between Send   | Payload will   | Message will 
         |       | Payload and    | not be placed  | not be 
         |       | RDMA Read      | at the local   | generated until 
         |       | Request Header | peer until the | Send has been 
         |       |                | Send Payload is| Completed 
         |       |                | placed at the  |  
         |       |                | remote peer    |  
   ------+-------+----------------+----------------+---------------- 
   RDMA  | Send  | No placement   | Not applicable | Not applicable 
   Write |       | guarantee. If  |                |  
         |       | guarantee is   |                |  
         |       | necessary, see |                |  
         |       | footnote 1.    |                |  
   ------+-------+----------------+----------------+---------------- 
   RDMA  | RDMA  | No placement   | Not applicable | Not applicable 
   Write | Write | guarantee. If  |                |  
         |       | guarantee is   |                |  
         |       | necessary, see |                |  
         |       | footnote 1.    |                |  
   ------+-------+----------------+----------------+---------------- 
   RDMA  | RDMA  | No placement   | RDMA Read      | Not applicable 
   Write | Read  | guarantee      | Response       |  
         |       | between RDMA   | Payload will   |  
         |       | Write Payload  | not be placed  |  
         |       | and RDMA Read  | at the local   |  
         |       | Request Header | peer until the |  
         |       |                | RDMA Write     |  
         |       |                | Payload is     |  
         |       |                | placed at the  |  
         |       |                | remote peer    |  
   ------+-------+----------------+----------------+---------------- 
   RDMA  | Send  | No placement   | Send Payload   | Not applicable 
   Read  |       | guarantee      | may be placed  |  
         |       | between RDMA   | at the remote  |  
         |       | Read Request   | peer before the|  
         |       | Header and Send| RDMA Read      |  
         |       | payload        | Response is    |  
         |       |                | generated.     |  
         |       |                | If guarantee is|  
         |       |                | necessary, see |  
         |       |                | footnote 2.    |  
   ------+-------+----------------+----------------+---------------- 
   RDMA  | RDMA  | No placement   | RDMA Write     | Not applicable 
   Read  | Write | guarantee      | Payload may be |  
         |       | between RDMA   | placed at the  |  
         |       | Read Request   | remote peer    |  
         |       | Header and RDMA| before the RDMA|  
         |       | Write payload  | Read Response  |  
         |       |                | is generated.  |  
         |       |                | If guarantee is|  
         |       |                | necessary, see |  
         |       |                | footnote 2.    |  
   ------+-------+----------------+----------------+---------------- 
   RDMA  | RDMA  | No placement   | No placement   | Second RDMA 
   Read  | Read  | guarantee of   | guarantee of   | Read Response 
         |       | the two RDMA   | the two RDMA   | will not be 
         |       | Read Request   | Read Response  | generated until 
         |       | Headers        | Payloads.      | first RDMA Read 
         |       | Additionally,  |                | Response is 
         |       | there is no    |                | generated. 
         |       | guarantee that |                |  
         |       | the Tagged     |                |  
         |       | Buffers        |                |  
         |       | referenced in  |                |  
         |       | the RDMA Read  |                |  
         |       | will be read in|                |  
         |       | order          |                | 
   ------+-------+----------------+----------------+---------------- 
      Figure 17 Operation Ordering 

      Footnote 1:  If the guarantee is necessary, a ULP may insert an 
      RDMA Read Operation and wait for it to complete to act as a Fence. 

      Footnote 2:  If the guarantee is necessary, a ULP may wait for the 
      RDMA Read Operation to complete before performing the Send. 
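
      The footnote references in Figure 17 can be summarized as a 
      lookup keyed by the pair of Operations. The following Python 
      sketch is only a restatement of the table for illustration; the 
      names WORKAROUND and placement_workaround are hypothetical, and 
      the strings paraphrase the two footnotes.

```python
FENCE_WITH_READ = (
    "footnote 1: insert an RDMA Read and wait for it to complete"
)
WAIT_FOR_READ = (
    "footnote 2: wait for the RDMA Read to complete before the later op"
)

# (first operation, later operation) -> suggested ULP action, for the
# rows of the table that point at a footnote.
WORKAROUND = {
    ("Send", "Send"): FENCE_WITH_READ,
    ("Send", "RDMA Write"): FENCE_WITH_READ,
    ("RDMA Write", "Send"): FENCE_WITH_READ,
    ("RDMA Write", "RDMA Write"): FENCE_WITH_READ,
    ("RDMA Read", "Send"): WAIT_FOR_READ,
    ("RDMA Read", "RDMA Write"): WAIT_FOR_READ,
}


def placement_workaround(first, later):
    """Return the footnote action for a pair, or None if no row cites one."""
    return WORKAROUND.get((first, later))
```

      Pairs whose later Operation is an RDMA Read return None, since 
      the table cites no footnote for those rows.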

    
    
    

  
   12 Authors' Addresses  

 Paul R. Culley 
 Hewlett-Packard Company 
 20555 SH 249                  
 Houston, Tx. USA 77070-2698 
 Phone:  281-514-5543 
 Email:  paul.culley@hp.com 
     

 Dave Garcia 
 Hewlett-Packard Company 
 19333 Vallco Parkway          
 Cupertino, Ca. USA 95014 
 Phone:  408.285.6116 
 Email:  dave.garcia@hp.com 
     

 Jeff Hilland 
 Hewlett-Packard Company 
 20555 SH 249                  
 Houston, Tx. USA 77070-2698 
 Phone:  281-514-9489 
 Email:  jeff.hilland@hp.com 
     

 Bernard Metzler 
 IBM Research GmbH 
 Zurich Research Laboratory 
 Saeumerstrasse 4 
 CH-8803 Rueschlikon, Switzerland 
 Phone: +41 44 724 8605 
 Email:  bmt@zurich.ibm.com 
  

 Renato J. Recio 
 IBM Corp. 
 11501 Burnet Road 
 Austin, Tx. USA 78758 
 Phone:  512-838-3685 
 Email:  recio@us.ibm.com 

  
  
  
   13 Contributors 

    Dwight Barron 
        Hewlett-Packard Company 
        20555 SH 249 
        Houston, Tx. USA 77070-2698 
        Phone:  281-514-2769 
        Email:  dwight.barron@hp.com 

    Caitlin Bestler 
        Broadcom Corporation 
        16215 Alton Parkway 
        Irvine, CA.  USA 92619-7013 
        Phone:  949-926-6383 
        Email:  caitlinb@broadcom.com 

    John Carrier 
        Cray, Inc. 
        411 First Avenue S, Suite 600 
        Seattle, WA 98104-2860 USA 
        Phone: 206-701-2090 
        Email: carrier@cray.com 

    Ted Compton 
        EMC Corporation 
        Research Triangle Park, NC 27709, USA 
        Phone: 919-248-6075 
        Email: compton_ted@emc.com 

    Uri Elzur  
        Broadcom Corporation 
        16215 Alton Parkway 
        Irvine, California 92619-7013 USA 
        Phone: +1 (949) 585-6432 
        Email: Uri@Broadcom.com 

    Hari Ghadia 
        Adaptec, Inc. 
        691 S. Milpitas Blvd., 
        Milpitas, CA 95035  USA 
        Phone: +1 (408) 957-5608 
        Email: hari_ghadia@adaptec.com 

  
  
  
    Howard C. Herbert 
        Intel Corporation 
        MS CH7-404 
        5000 West Chandler Blvd. 
        Chandler, Arizona 85226 
        Phone: 480-554-3116 
        Email: howard.c.herbert@intel.com 

    Mike Ko  
        IBM  
        650 Harry Rd.  
        San Jose, CA 95120  
        Phone: (408) 927-2085  
        Email: mako@us.ibm.com  

    Mike Krause  
        Hewlett-Packard Company 
        43LN 
        19410 Homestead Road  
        Cupertino, CA  95014 USA 
        Phone: 408-447-3191 
        Email: krause@cup.hp.com  

    Dave Minturn 
        Intel Corporation 
        MS JF1-210 
        5200 North East Elam Young Parkway 
        Hillsboro, Oregon  97124 
        Phone: 503-712-4106 
        Email: dave.b.minturn@intel.com 

    Mike Penna 
        Broadcom Corporation 
        16215 Alton Parkway 
        Irvine, California 92619-7013 USA 
        Phone: +1 (949) 926-7149 
        Email: MPenna@Broadcom.com 

    Jim Pinkerton 
        Microsoft, Inc. 
        One Microsoft Way 
        Redmond, WA, USA 98052 
        Email:  jpink@microsoft.com 
  
  
  
    Hemal Shah 
        Broadcom Corporation 
        16215 Alton Parkway 
        Irvine, CA. USA 92619-7013 
        Phone: 949-926-6941 
        Email:  

    Allyn Romanow 
        Cisco Systems 
        170 W Tasman Drive 
        San Jose, CA 95134 USA 
        Phone: +1 408 525 8836 
        Email: allyn@cisco.com 

    Tom Talpey 
        Network Appliance 
        1601 Trapelo Road #16 
        Waltham, MA 02451 USA 
        Phone: +1 (781) 768-5329 
        Email: thomas.talpey@netapp.com 

    Patricia Thaler 
        Broadcom Corporation 
        16215 Alton Parkway  
        Irvine, CA. USA 92619-7013 
        Phone: +1-916-570-2707 
        Email: pthaler@broadcom.com 

    Jim Wendt 
        Hewlett-Packard Company 
        8000 Foothills Boulevard MS 5668 
        Roseville, CA 95747-5668 USA 
        Phone: +1 916 785 5198 
        Email: jim_wendt@hp.com 

    Madeline Vega 
        IBM 
        11400 Burnet Rd. Bld.45-2L-007 
        Austin, TX.  USA 78758 
        Phone:  512-838-7739 
        Email:  mvega1@us.ibm.com 

  
  
  
    Claudia Salzberg 
        IBM 
        11501 Burnet Rd. Bld.902-5B-014 
        Austin, TX.  USA 78758 
        Phone:  512-838-5156 
        Email:  salzberg@us.ibm.com 

     

     

  
  
  
   14 Intellectual Property Statement 

    The IETF takes no position regarding the validity or scope of any 
    Intellectual Property Rights or other rights that might be claimed 
    to pertain to the implementation or use of the technology described 
    in this document or the extent to which any license under such 
    rights might or might not be available; nor does it represent that 
    it has made any independent effort to identify any such rights. 
    Information on the procedures with respect to rights in RFC 
    documents can be found in BCP 78 and BCP 79.  

    Copies of IPR disclosures made to the IETF Secretariat and any 
    assurances of licenses to be made available, or the result of an 
    attempt made to obtain a general license or permission for the use 
    of such proprietary rights by implementers or users of this 
    specification can be obtained from the IETF on-line IPR repository 
    at http://www.ietf.org/ipr.  

    The IETF invites any interested party to bring to its attention any 
    copyrights, patents or patent applications, or other proprietary 
    rights that may cover technology that may be required to implement 
    this standard. Please address the information to the IETF at ietf-
    ipr@ietf.org. 

  
  
  
   15 Full Copyright Statement 

    Copyright (C) The Internet Society (2006).  

    This document is subject to the rights, licenses and restrictions 
    contained in BCP 78, and except as set forth therein, the authors 
    retain all their rights. 

    This document and the information contained herein are provided on 
    an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE 
    REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND 
    THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, 
    EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT 
    THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR 
    ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A 
    PARTICULAR PURPOSE. 

     

  
  