Routing Over Large Clouds Working Group                 James V. Luciani
INTERNET-DRAFT                                            (Bay Networks)
<draft-ietf-ion-scsp-00.txt>                          Grenville Armitage
                                                              (Bellcore)
                                                            Joel Halpern
                                                             (Newbridge)
                                                       Expires June 1997





              Server Cache Synchronization Protocol (SCSP)


Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ds.internic.net (US East Coast), nic.nordu.net
   (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific
   Rim).

Abstract

   This document describes the Server Cache Synchronization Protocol
   (SCSP) for Non Broadcast Multiple Access (NBMA) networks.  SCSP
   attempts to solve the generalized server synchronization/cache-
   replication problem wherein a set of server entities which are bound
   to a Server Group (SG) through some means (e.g., all servers
   belonging to the same Logical IP Subnet (LIS)[1]) wish to synchronize
   the contents (or a portion thereof) of their caches.  These caches
   contain information on the state of the clients within the scope of
   interest of the SG.  An example of types of information that must be
   synchronized can be seen in NHRP using IP where the information
   includes the REGISTERED clients' IP to NBMA mappings in the SG LIS.





Luciani, et al.                                                 [Page 1]


INTERNET-DRAFT                    SCSP                 Expires June 1997


1. Introduction

   It is perhaps an obvious goal for any protocol to not limit itself to
   a single point of failure such as having a single server in a
   client/server paradigm.  Even when there are redundant servers, there
   still remains the problem of cache synchronization; i.e.,  when one
   server becomes aware of a change in state of cache information then
   that server must propagate the knowledge of the change in state to
   all servers which are actively mirroring that state information.
   Further, this must be done in a timely fashion without putting undue
   resource strains on the servers. Assuming that the state information
   kept in the server cache is the state of clients of the server, then
   in order to minimize the burden placed upon the client it is also
   highly desirable that clients need not have complete knowledge of all
   servers which they may use.  However, any mechanism for
   synchronization should not preclude a client from having access to
   several (or all) servers.  Of course, any solution must be reasonably
   scalable, capable of using some autoconfiguration service, and lend
   itself to a wide range of authentication methodologies.

   This document describes the Server Cache Synchronization Protocol
   (SCSP). SCSP solves the generalized server synchronization/cache-
   replication problem while addressing the issues described above.
   SCSP synchronizes caches (or a portion of the caches) of a set of
   server entities which are bound to a Server Group (SG) through some
   means (e.g., all NHRP servers belonging to a Logical IP Subnet
   (LIS)[1]).  SGs are identified by an ID which, not surprisingly, is
   called a SGID.  Note therefore that a SGID identifies both the
   client/server protocol for which the servers of the SG are being
   synchronized as well as the instance of that protocol.  This implies
   that multiple instances of the same protocol may be in operation at
   the same time and have their servers synchronized independently of
   each other.  SGs may exist in any topology as long as the resultant
   graph spans the set of servers that need to be synchronized.  The
   caches which are to be synchronized contain information on the state
   of the clients within the scope of interest of the SG.  An example of
   types of information that must be synchronized can be seen in NHRP[2]
   using IP where the information includes the REGISTERED clients' IP to
   NBMA mappings in the SG LIS.

   Only the first few pages of this document constitute the SCSP
   description proper.  However, this document also includes a
   description of the use of SCSP by a number of protocols (e.g., NHRP,
   ATMARP, etc.) and some optional functionality which may be
   implemented as deemed appropriate.  It is hoped that these appendices
   will spark interest in applying SCSP to the server synchronization
   needs of other protocols by supplying examples of SCSP's use.




2. Overview

   SCSP places no topological requirements upon the SG.  Obviously,
   however, the resultant graph must span the set of servers to be
   synchronized.  SCSP borrows heavily from the link state protocols
   [3,4].  However, unlike those technologies, there is no Shortest Path
   First (SPF) calculation and there are little or no additional memory
   requirements imposed above and beyond that which is required to save
   the cached information which would exist regardless of the
   synchronization technology.

   In order to give a frame of reference for the following discussion,
   the terms Local Server (LS), Directly Connected Server (DCS), and
   Remote Server (RS) are introduced.  The LS is the server under
   scrutiny; i.e., all statements are made from the perspective of the
   LS when discussing the SCSP protocol. The DCS is a server which is
   directly connected to the LS;  e.g., there exists a VC between the LS
   and DCS.  Thus, every server is a DCS from the point of view of every
   other server which connects to it directly, and every server is an LS
   which has zero or more DCSs directly connected to it.  An RS is a
   server that is neither an LS nor a DCS; i.e., an RS is always two or
   more hops away from an LS (whereas a DCS is always one hop away from
   an LS).
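
   The three roles follow directly from hop count, which can be
   sketched in Python as below.  The adjacency map `links` and the
   server names are hypothetical; this only illustrates the
   definitions above and is not part of the protocol.

```python
from collections import deque

def classify_servers(links, ls):
    """Label every server reachable from `ls` as LS, DCS, or RS by hop count."""
    hops = {ls: 0}
    queue = deque([ls])
    while queue:
        node = queue.popleft()
        for peer in links.get(node, ()):
            if peer not in hops:
                hops[peer] = hops[node] + 1
                queue.append(peer)
    roles = {}
    for node, distance in hops.items():
        if distance == 0:
            roles[node] = "LS"    # the server under scrutiny
        elif distance == 1:
            roles[node] = "DCS"   # exactly one hop away
        else:
            roles[node] = "RS"    # two or more hops away
    return roles
```

   Running the same function with a different `ls` argument shows
   that the roles are relative: a server that is an RS from one
   point of view is a DCS from another.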

   SCSP contains three sub protocols: the "Hello" protocol, the "Cache
   Alignment" protocol, and the "Client State Update" protocol.  The
   "Hello" protocol is used to ascertain whether a DCS is operational
   and whether the connection between the LS and DCS is bidirectional,
   unidirectional, or non-functional.  The "Cache Alignment" (CA)
   protocol allows an LS to synchronize its entire cache with that of
   the cache of its DCSs. The "Client State Update" (CSU) protocol is
   used to update the state of cache entries in servers for a given SG.
   Sections 2.1, 2.2, and 2.3 contain a more in depth explanation of the
   Hello, CA, and CSU protocols and the messages they use.


                       +---------------+
                       |               |
              +-------@|     DOWN      |@-------+
              |        |               |        |
              |        +---------------+        |
              |            |       @            |
              |            |       |            |
              |            |       |            |
              |            |       |            |
              |            @       |            |
              |        +---------------+        |
              |        |               |        |
              |        |    WAITING    |        |
              |     +--|               |--+     |
              |     |  +---------------+  |     |
              |     |    @           @    |     |
              |     |    |           |    |     |
              |     @    |           |    @     |
            +---------------+     +---------------+
            |  BIDIRECTION  |----@|  UNIDIRECTION |
            |               |     |               |
            |  CONNECTION   |@----|  CONNECTION   |
            +---------------+     +---------------+


          Figure 1: Hello Finite State Machine (HFSM)


2.1  Hello Protocol


   "Hello" messages are used to ascertain whether a DCS is operational
   and whether the connections between the LS and DCS are bidirectional,
   unidirectional, or non-functional. In order to do this, every LS MUST
   periodically send a Hello message to its DCSs.

   An LS must be configured with a list of NBMA addresses. These NBMA
   addresses are the addresses of peer servers in a SG to which the LS
   wishes to have a direct connection for the purpose of running SCSP;
   that is, these addresses are the addresses of would-be DCSs.  The
   mechanism for the configuration of an LS with these NBMA addresses is
   beyond the scope of this document; although one possible mechanism
   would be an autoconfiguration server.

   An LS has a Hello Finite State Machine (HFSM) associated with each of
   its DCSs (see Figure 1) for a given SG, and the HFSM monitors the
   state of the connectivity between the servers.  Thus, for example, if
   there are two servers (one in SG A and the other in SG B) associated
   with NBMA address X, another two servers (also one in SG A and the
   other in SG B) associated with NBMA address Y, and a suitable
   point-to-point VC between the two NBMA addresses, then there are two
   HFSMs running on each side of the VC (one per SGID).
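
   A minimal Python sketch of this bookkeeping follows.  The names
   `HelloState` and `new_hfsm_table` are illustrative assumptions,
   not identifiers defined by SCSP; the point is only that state is
   keyed by the (DCS, SGID) pair.

```python
from enum import Enum

class HelloState(Enum):
    DOWN = "Down"
    WAITING = "Waiting"
    UNIDIRECTIONAL = "Unidirectional Connection"
    BIDIRECTIONAL = "Bidirectional Connection"

def new_hfsm_table(dcs_addresses, sgids):
    """One HFSM per (DCS NBMA address, SGID) pair, each starting in Down."""
    return {(addr, sgid): HelloState.DOWN
            for addr in dcs_addresses
            for sgid in sgids}
```

   For the example above, the server side at address X would hold
   two entries for the single VC to Y, one for SG A and one for SG B.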

   The HFSM starts in the "Down" State and transitions to the "Waiting"
   State after NBMA level connectivity has been established.  Once in
   the Waiting State, the LS starts sending Hello messages to the DCS.
   The Hello message includes: a Sender ID (SID) which is set to the
   LS's ID (LSID), zero or more Receiver IDs which identify the DCSs
   from which the LS has heard a Hello message, and a HelloInterval and
   DeadFactor which will be described below.   At this point, the DCS
   may or may not already be sending its own Hello messages to the LS.

   When the LS receives a Hello message from one of its DCSs and the LS
   is in any state other than the Down state, the LS checks to see if
   its LSID is in one of the Receiver ID fields of that message which it
   just received, and the LS saves the SID from that Hello message. If
   the LSID is in one of the Receiver ID fields then the LS transitions
   the HFSM to the Bidirectional Connection state; otherwise it
   transitions the HFSM into the Unidirectional Connection state.  The
   SID which was saved is the DCS's ID (DCSID).  The next time that the
   LS sends its own Hello message to the DCS, the LS will check the
   saved DCSID against a list of Receiver IDs which the LS uses when
   sending the LS's own Hello messages.  If the DCSID is not found in
   the list of Receiver IDs then it is added to that list before the LS
   sends its Hello message.
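
   The receive-side rule can be sketched as follows.  This is a
   simplified illustration under stated assumptions: states are plain
   strings, IDs are any comparable values, and the saved DCSID is
   merged into the Receiver ID list at receive time rather than when
   the next Hello is sent.  The text does not say what happens to a
   Hello received in the Down state, so the sketch leaves the state
   unchanged in that case.

```python
def on_hello(state, lsid, hello_sid, hello_receiver_ids, receiver_list):
    """Return (new_state, new_receiver_list) after receiving one Hello."""
    if state == "Down":
        # Assumption: nothing changes while NBMA connectivity is absent.
        return state, list(receiver_list)
    if lsid in hello_receiver_ids:
        new_state = "Bidirectional Connection"   # the DCS has heard the LS
    else:
        new_state = "Unidirectional Connection"  # LS hears DCS, not vice versa
    updated = list(receiver_list)
    if hello_sid not in updated:
        updated.append(hello_sid)                # hello_sid is the DCSID
    return new_state, updated
```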

   Hello messages also contain a HelloInterval and a DeadFactor.  The
   Hello interval advertises the time between sending of consecutive
   Hello messages by a server.  That is, if the time between reception
   of Hello messages from a DCS exceeds the HelloInterval advertised by
   that DCS then the next Hello message is to be considered late by the
   LS.  If the LS does not receive a Hello message within the interval
   HelloInterval*DeadFactor seconds then the LS MUST consider the DCS to
   be stalled at which point the LS should transition the HFSM for that
   DCS to the Waiting State and remove the DCSID from the Receiver ID
   list.  Note that the Hello Protocol is on a per SG basis.
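
   The timing rule can be illustrated with a short sketch.  The
   function names are hypothetical, and the treatment of the exact
   boundary (an elapsed time equal to HelloInterval*DeadFactor) is an
   assumption, since the text does not specify it.

```python
def hello_status(seconds_since_last_hello, hello_interval, dead_factor):
    """Classify a DCS by the time elapsed since its last Hello message."""
    if seconds_since_last_hello > hello_interval * dead_factor:
        return "stalled"
    if seconds_since_last_hello > hello_interval:
        return "late"
    return "on time"

def on_stalled(receiver_list, dcsid):
    """Move the HFSM to Waiting and drop the DCSID from the Receiver ID list."""
    return "Waiting", [rid for rid in receiver_list if rid != dcsid]
```

   With a HelloInterval of 10 seconds and a DeadFactor of 3, a DCS is
   late after 10 seconds of silence and stalled after 30.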

   Hello messages contain a list of Receiver IDs instead of a single
   Receiver ID in order to make use of point to multipoint connections.
   While there is an HFSM per DCS, an LS MUST send only a single Hello
   message to its DCSs attached as leaves of a point to multipoint
   connection.  The LS does this by including DCSIDs in the list of
   Receiver IDs when the LS sends its next Hello message.  Only the
   DCSIDs from non-stalled DCSs from which the LS has heard a Hello
   message are included.
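
   As a rough sketch, the single Hello sent on a point to multipoint
   connection could be assembled as follows.  The dictionary keys and
   the `dcs_status` structure are assumptions made for illustration
   and do not reflect the wire format.

```python
def build_hello(lsid, hello_interval, dead_factor, dcs_status):
    """Build one Hello listing every non-stalled DCS that has been heard.

    dcs_status maps DCSID -> {"heard": bool, "stalled": bool}.
    """
    receiver_ids = [dcsid for dcsid, status in dcs_status.items()
                    if status["heard"] and not status["stalled"]]
    return {
        "sender_id": lsid,
        "receiver_ids": receiver_ids,
        "hello_interval": hello_interval,
        "dead_factor": dead_factor,
    }
```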




   Any abnormal event, such as receiving a malformed Hello message,
   causes the HFSM to transition to the Waiting State; however, a loss
   of NBMA connectivity causes the HFSM to transition to the Down State.


                   +------------+
                   |            |
              +---@|    DOWN    |
              |    |            |
              |    +------------+
              |          |
              |          |
              |          @
              |    +------------+
              |    |Master/Slave|
              |----|            |@---+
              |    |Negotiation |    |
              |    +------------+    |
              |          |           |
              |          |           |
              |          @           |
              |    +------------+    |
              |    |   Cache    |    |
              |----|            |----|
              |    | Summarize  |    |
              |    +------------+    |
              |          |           |
              |          |           |
              |          @           |
              |    +------------+    |
              |    |   Update   |    |
              |----|            |----|
              |    |   Cache    |    |
              |    +------------+    |
              |          |           |
              |          |           |
              |          @           |
              |    +------------+    |
              |    |            |    |
              +----|  Aligned   |----+
                   |            |
                   +------------+

     Figure 2: Cache Alignment Finite State Machine


2.2 Cache Alignment Protocol

   "Cache Alignment" (CA) messages are used by an LS to synchronize its
   cache with that of the cache of each of its DCSs.  That is, CA
   messages allow a booting LS to synchronize with each of its DCSs.  A
   CA message contains a CA header followed by zero or more Client State
   Advertisement Summary records (CSAS records).

   An LS has a Cache Alignment Finite State Machine (CAFSM) associated
   (see Figure 2) with each of its DCSs on a per SG basis, and the CAFSM
   monitors the state of the cache alignment between the servers.  The
   CAFSM starts in the Down State.  The CAFSM is associated with an
   HFSM, and when that HFSM reaches the Bidirectional State, the CAFSM
   transitions to the Master/Slave Negotiation State.  The Master/Slave
   Negotiation State causes either the LS or DCS to take on the role of
   master over the cache alignment process.

   When the LS's CAFSM reaches the Master/Slave Negotiation State, the
   LS will send a CA message to the DCS associated with the CAFSM.  The
   format of CA messages is described in Section B.1.1.  The first CA
   message which the LS sends includes no CSAS records and a CA header
   which contains the LSID in the Sender ID field, the DCSID in the
   Receiver ID field, a CA sequence number, and three bits.  These three
   bits are the M (Master/Slave) bit, the I (Initialization of master)
   bit, and the O (More) bit. In the first CA message sent by the LS to
   a particular DCS, the M, O, and I bits are set to one.  If the LS
   does not receive a CA message from the DCS in CAReXmtInterval seconds
   then it resends the CA message it just sent.  The LS continues to do
   this until the CAFSM transitions to the Cache Summarize State or
   until the HFSM transitions out of the Bidirectional State. Any time
   the HFSM transitions out of the Bidirectional State, the CAFSM
   transitions to the Down State.

   When the LS receives a CA message from the DCS while in the
   Master/Slave Negotiation State, the role the LS plays in the exchange
   depends on packet processing as follows:

   1) If the CA from the DCS has the M, I, and O bits set to one and there are
      no CSAS records in the CA message and the Sender ID as specified in the
      DCS's CA message is larger than the LSID then
     a) The timer counting down the CAReXmtInterval is stopped.
     b) The CAFSM corresponding to that DCS transitions to the Cache Summarize
        State and the LS takes on the role of slave.
     c) The LS adopts the CA sequence number it received in the CA message as its
        own CA sequence number.
     d) The LS sends a CA message to the DCS which is formatted as follows:
        the M and I bits are set to zero, the Sender ID field is set to the
        LSID, the Receiver ID field is set to the DCSID, and the CA sequence
        number is set to the CA sequence number that appeared in the DCS's
        CA message.  If there are CSAS records to be sent (i.e., if the LS's
        cache is not empty) then the O bit is set to one and the initial set
        of CSAS records is included in the CA message.

   2) If the CA message from the DCS has the M and I bits off and the Sender ID
      as specified in the DCS's CA message is smaller than the LSID then
     a) The timer counting down the CAReXmtInterval is stopped.
     b) The CAFSM corresponding to that DCS transitions to the Cache Summarize
        State and the LS takes on the role of master.
     c) The LS must process any CSAS records in the received CA.
        An explanation of record processing is given below.
     d) The LS sends a CA message to the DCS which is formatted as follows:
        the M bit is set to one, I bit is set to zero, the Sender ID
        field is set to the LSID, the Receiver ID field is set to the DCSID,
        and the LS's current CA sequence number is incremented by one and placed
        in the CA message.   If there are any CSAS records to be sent from the
        LS to the DCS (i.e., if the LS's cache is not empty) then the O bit is
        set to one and the initial set of CSAS records are included in the
        CA message that the LS is sending to the DCS.

   3) Otherwise, the packet must be ignored.
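
   The role decision in rules 1 through 3 can be sketched as a single
   function.  This is an illustration only: IDs are assumed to be
   integers so that "larger" and "smaller" are well defined, and the
   parameter names are hypothetical.

```python
def negotiate_role(lsid, sender_id, m_bit, i_bit, o_bit, csas_count):
    """Return "slave", "master", or None (ignore) for a CA message
    received while in the Master/Slave Negotiation State."""
    # Rule 1: initial CA from a DCS with the larger ID; the LS yields.
    if m_bit and i_bit and o_bit and csas_count == 0 and sender_id > lsid:
        return "slave"
    # Rule 2: the DCS has already yielded and has the smaller ID.
    if not m_bit and not i_bit and sender_id < lsid:
        return "master"
    # Rule 3: anything else is ignored.
    return None
```

   In both accepted cases the server with the larger ID ends up as
   master, so the two ends reach the same conclusion independently.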

   At any given time, the master and the slave each have at most one outstanding
   CA message.  Once the LS's CAFSM has transitioned to the Cache
   Summarize State the sequence of exchanges of CA messages occurs as
   follows.

   1) If the LS receives a CA message with the M bit set incorrectly
      (e.g., the M bit is set in the CA of the DCS and the LS is master)
      or if the I bit is set then the CAFSM transitions back to the
      Master/Slave Negotiation State.

   2) If the LS is master and the LS receives a CA message with a CA sequence
      number which is one less than the LS's current CA sequence number then
      the message is a duplicate and the message MUST be discarded.

   3) If the LS is master and the LS receives a CA message with a CA sequence
      number which is equal to the LS's current CA sequence number then the
      CA message MUST be processed.  An explanation of "CA message processing"
      is given below.  As a result of having received the CA message from
      the DCS the following will occur:
     a) The timer counting down the CAReXmtInterval is stopped.
     b) The LS must process any CSAS records in the received CA message.
     c) Increment the LS's CA sequence number by one.
     d) The cache exchange continues as follows:
       1) If the LS has no more CSAS records to send and the received CA
          message has the O bit off then the CAFSM transitions to the Update
          Cache State.
       2) If the LS has no more CSAS records to send and the received CA
          message has the O bit on then the LS sends back a CA message
          (with new CA sequence number) which contains no CSAS records and
          with the O bit off.  Reset the timer counting down the
          CAReXmtInterval.
       3) If the LS has more CSAS records to send then the LS sends the next
          CA message with the LS's next set of CSAS records.  If LS is sending
          its last set of CSAS records then the O bit is set off otherwise the
          O bit is set on. Reset the timer counting down the CAReXmtInterval.

   4) If the LS is slave and the LS receives a CA message with a CA sequence
      number which is equal to the LS's current CA sequence number then the
      CA message is a duplicate and the LS MUST resend the CA message
      which it had just sent to the DCS.

   5) If the LS is slave and the LS receives a CA message with a CA sequence
      number which is one more than the LS's current CA sequence number then
      the message is valid and MUST be processed.  An explanation of "CA message
      processing" is given below.  As a result of having received the CA
      message from the DCS the following will occur:

     a) The LS must process any CSAS records in the received CA message.
     b) Set the LS's CA sequence number to the CA sequence number in the CA
        message.
     c) The cache exchange continues as follows:
       1) If the LS had just sent a CA message with the O bit off and the
          received CA message has the O bit off then the CAFSM transitions to
          the Update Cache State and the LS sends a CA message with no CSAS
          records and with the O bit off.
       2) If the LS still has CSAS records to send then the LS MUST send
          a CA message with CSAS records in it.  If the message being sent
          from the LS to the DCS contains the last CSAS records that the
          LS needs to send then the CA is sent with the O bit off.

   6) If the LS is slave and the LS receives a CA message with a CA sequence
      number that is neither equal to nor one more than the LS's current
      CA sequence number then an error has occurred and the CAFSM transitions
      to the Master/Slave Negotiation State.
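
   The sequence number checks in rules 1 through 6 can be summarized
   in a small dispatch function.  This is a sketch under two
   assumptions: an M bit of one in a received message is read as "the
   DCS claims to be master", and the return strings are labels
   invented here.  The text does not cover a master receiving any
   other sequence number, so that case is reported as "unspecified".

```python
def classify_ca(ls_is_master, local_seq, rx_seq, m_bit, i_bit):
    """Decide what to do with a CA message in the Cache Summarize State."""
    dcs_claims_master = bool(m_bit)
    # Rule 1: both ends claim the same role, or the I bit is set.
    if i_bit or dcs_claims_master == ls_is_master:
        return "renegotiate"
    if ls_is_master:
        if rx_seq == local_seq - 1:
            return "discard"      # rule 2: duplicate
        if rx_seq == local_seq:
            return "process"      # rule 3
        return "unspecified"      # not covered by the text
    if rx_seq == local_seq:
        return "resend"           # rule 4: duplicate, resend last CA
    if rx_seq == local_seq + 1:
        return "process"          # rule 5
    return "renegotiate"          # rule 6
```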

   "CA message processing" occurs as follows:

   The LS makes a list of those cache entries which are more "up to
   date" in the DCS than the LS's own cache.  The previously mentioned
   list is called the CSA Request List (CRL).  See Section 2.4 for a
   description of what it means for a CSA or CSAS to be more "up to
   date" than an LS's cache entry.
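
   Building the CRL can be sketched as follows.  The comparison rule
   itself is defined in Section 2.4, so it is passed in here as a
   caller-supplied function, `is_more_up_to_date`, which is a
   placeholder and not a definition of that rule.  Treating an entry
   that is absent from the LS's cache as one that must be requested
   is an assumption of this sketch.

```python
def build_crl(local_cache, csas_records, is_more_up_to_date):
    """Return the keys of entries the LS should request from the DCS.

    local_cache:  dict mapping entry key -> local entry
    csas_records: iterable of (key, summary) pairs received in CA messages
    """
    crl = []
    for key, summary in csas_records:
        if key not in local_cache:
            crl.append(key)       # assumption: unknown entries are requested
        elif is_more_up_to_date(summary, local_cache[key]):
            crl.append(key)
    return crl
```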




   If the CRL of the LS is empty upon transition into the Update Cache
   State then the CAFSM immediately transitions into the Aligned State.
   If the CRL is not empty then the LS solicits the DCS to send the CSA
   records corresponding to the summaries (i.e., CSAS records) which the
   LS holds in its CRL.  The solicited CSA records will contain the
   entirety of the client information held in the DCS's cache for the
   given client.  The LS solicits the relevant CSA records by forming
   CSU Solicit (CSUS) messages from the CRL. CSUS messages contain a
   CSUS header and CSAS records from the CRL.  The LS then sends the
   CSUS messages to the DCS. The DCS responds to the CSUS messages by
   sending CSU messages (see Section 2.3) containing the appropriate CSA
   records to the LS.  At most one CSUS message may be outstanding at
   any given time.

   Just before the first CSUS message is sent from an LS to the DCS
   associated with the CAFSM, a timer is set to CSUSReXmtInterval
   seconds.  If all the CSA records corresponding to the CSAS records in
   the CSUS message have not been received by the time that the timer
   expires then a new CSUS message will be created which contains all
   the CSAS records for which no appropriate CSA record has been
   received plus additional CSAS records not covered in the previous
   CSUS message.  The new CSUS message is then sent to the DCS.  If, at
   some point before the timer expires, all CSA record updates have been
   received for all the CSAS records included in the previously sent
   CSUS message then the timer is stopped and if there are additional
   CSAS records that were not covered in the previous CSUS message but
   were in the CRL then the timer is reset and a new CSUS message is
   created which contains only those CSAS records from the CRL which
   have not yet been sent to the DCS. This process continues until all
   the CSA records corresponding to the CSAS records that were in the CRL
   have been received by the LS.  When the LS has a completely updated cache
   then the LS transitions the CAFSM associated with the DCS to the Aligned
   State.
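
   The choice of records for the next CSUS message can be sketched as
   below.  The `max_records` limit is a hypothetical stand-in for
   whatever bounds the size of one CSUS message; the text does not
   name such a parameter.  The same function serves both cases: after
   a time-out the unanswered records come first, and after a complete
   answer only records not yet sent remain.

```python
def next_csus(crl, sent, received, max_records):
    """Pick the CRL keys to place in the next CSUS message.

    crl:      ordered list of keys the LS must fetch
    sent:     set of keys already included in some CSUS message
    received: set of keys for which a CSA record has arrived
    """
    unanswered = [k for k in crl if k in sent and k not in received]
    unsent = [k for k in crl if k not in sent]
    room = max(0, max_records - len(unanswered))
    return unanswered + unsent[:room]
```

   An empty result means every CSA record in the CRL has been
   received, which is the condition for moving to the Aligned State.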

   If an LS receives a CSUS message or a CA message with a Receiver ID
   which is not the LSID then the message must be discarded and ignored.
   This is necessary since an LS may be in a point to multipoint mesh
   with some of its DCSs.


2.3 Client State Update Protocol

   "Client State Update" (CSU) messages are used to update the state of
   cache entries in servers. CSU messages contain zero or more "Client
   State Advertisement" (CSA) records each of which contains a SGID in
   the record.  Thus CSU messages may service more than one SG as long
   as the sending and receiving servers in a given SG have a CAFSM in
   the Aligned state (or sometimes the Update Cache state).  This is a
   fundamental difference between the CSU protocol and either the Hello
   protocol or the Cache Alignment protocol.  An LS may send/receive a
   CSU to/from a DCS only when the corresponding CAFSM is in either the
   Aligned State or the Update Cache State.

   There are two types of CSU messages: CSU Requests and CSU Replies.  A
   CSU Request message is sent from an LS to each of its DCSs when the
   LS directly observes changes in the state of one or more clients in
   the SG. The change in state of a particular client is noted in a CSA
   record which is then embedded in the CSU Request message.  In this
   way, state changes are propagated throughout the SG.

   Examples of such changes in state are as follows:


       1) an LS receives a request from a client to add an entry to its cache
          (e.g., NHRP Registration Request or an administrative
          intervention),

       2) an LS receives a request from a client to remove an entry from its cache
          (e.g., NHRP Purge Request or administrative intervention),

       3) a cache entry has timed out in the LS's cache, has been refreshed
          in the LS's cache, or has been administratively modified
          (e.g., in NHRP, an Internetworking address to NBMA address binding
          has timed out or has been refreshed).

   When an LS receives a CSU Request from one of its DCSs, the LS
   acknowledges the CSU Request by sending a CSU Reply. The CSU Reply
   contains those CSA records which were contained in the CSU Request
   which the LS is capable of processing;  e.g., if a CSA record is
   dropped because there are insufficient resources to process it then
   that CSA record is not included in the CSU Reply.  When a CSA record
   received in a CSU Request is considered by the LS to represent client
   information which is more "up to date" (see Section 2.4) than the
   client information contained within the cache of the LS then two
   things happen:  1) the LS's cache is updated with the more up to date
   information, and 2) the LS sends a CSU Request containing the CSA
   Record to each of its DCSs except the one from which the CSA Record
   arrived.  In this way,  state changes are propagated throughout the
   SG.
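
   The handling of a received CSU Request can be sketched as follows.
   As before, `is_more_up_to_date` is a caller-supplied placeholder
   for the rule in Section 2.4, `can_process` is a hypothetical hook
   standing in for resource checks, and treating an unknown entry as
   newer is an assumption.

```python
def handle_csu_request(cache, csa_records, from_dcs, all_dcss,
                       is_more_up_to_date, can_process=lambda record: True):
    """Return (acknowledged, newer, forward_to) for one CSU Request.

    acknowledged: records to echo in the CSU Reply
    newer:        records that updated the cache and must be forwarded
    forward_to:   DCSs that receive a CSU Request carrying `newer`
    """
    acknowledged, newer = [], []
    for key, value in csa_records:
        if not can_process((key, value)):
            continue                  # dropped records are not acknowledged
        acknowledged.append((key, value))
        if key not in cache or is_more_up_to_date(value, cache[key]):
            cache[key] = value
            newer.append((key, value))
    # Forward to every DCS except the one the record arrived from.
    forward_to = [d for d in all_dcss if d != from_dcs] if newer else []
    return acknowledged, newer, forward_to
```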

   When an LS sends a new CSU Request to one of its DCSs, the LS keeps
   track of the outstanding CSA records (i.e., those which have not been
   acknowledged yet) in that CSU Request, the CSU Sequence number in
   that CSU Request, and to which DCS the LS sent the CSU Request.  A
   timer set to CSUReXmtInterval seconds is started just prior to
   sending a new CSU Request and that timer is associated with the CSU
   Sequence number in the CSU Request such that if that timer expires
   prior to the receipt of a CSU Reply containing the same CSU Sequence
   number then an exact copy of the CSU Request (including the same CSU
   Sequence number) is re-sent.

   When an LS receives a CSU Reply from one of its DCSs, the LS checks
   the CSU Sequence number in the CSU Reply against the CSU Sequence
   number of outstanding CSU Requests which have not yet been
   acknowledged.   Processing proceeds as follows:
     1) If a match is found then
       a) If the "A" bit is zero in the CSU Reply then
         1) The LS checks the CSA records in the CSU Reply against the
             CSA records which were sent in the CSU Request, and any
             discrepancy causes a "follow up" CSU Request containing
             a new CSU Sequence number to be sent.  The follow up CSU
             Request contains exactly those CSA records which were in the
             previous matching CSU Request but were not in the CSU Reply.
             Note that a follow up CSU Request follows the same time-out
             and retransmit rules as a "new" CSU Request.  This process
            continues until all CSA records are acknowledged by the
            receipt of a CSU Reply which contains them.  When a
            CSA Record is acknowledged in a CSU Reply then that
            CSA record is removed from the list of outstanding
            CSA records for that DCS.  CSA records in a "follow up"
            CSU Request are associated with the new CSU Sequence number
            rather than the previous CSU Sequence number with which they
            were formerly associated.
       b) If the "A" bit is set to one in the CSU Reply then
         1) All CSA records associated with the CSU Sequence number
            in the CSU Reply are acknowledged even though they do not
            actually appear in the CSU Reply.  This is by far the
            preferred method of operation since it is much less
            computationally intensive.
          2) It is strongly RECOMMENDED that new protocols which make use
             of SCSP mandate setting of the A bit whenever possible.
     2) If no match is found then
       a) The CSU Reply is silently dropped.
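
   The reply handling above can be sketched as follows.  The
   `outstanding` table and the way a new sequence number is supplied
   are assumptions made for illustration; timers are omitted.

```python
def handle_csu_reply(outstanding, reply_seq, a_bit, reply_keys, next_seq):
    """Process one CSU Reply.

    outstanding: dict mapping CSU Sequence number -> set of CSA keys
                 still awaiting acknowledgment (modified in place)
    Returns (next_seq, keys) when a follow up CSU Request is needed,
    otherwise None.
    """
    if reply_seq not in outstanding:
        return None                        # no match: silently dropped
    sent_keys = outstanding.pop(reply_seq)
    if a_bit:
        return None                        # A bit set: everything acknowledged
    missing = sent_keys - set(reply_keys)  # sent but not echoed in the reply
    if not missing:
        return None
    outstanding[next_seq] = missing        # re-associate with the new number
    return next_seq, missing
```

   The A bit case does no per-record comparison at all, which is why
   the text describes it as the less computationally intensive path.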

   An LS responds to CSUS messages from its DCSs by sending CSU Request
   messages containing the appropriate CSA records to the DCS.  If an LS
   receives a CSUS message containing a CSAS record for an entry which
   is no longer in its database (e.g., the entry timed out and was
   discarded after the Cache Alignment exchange completed but before the
   entry was requested through a CSUS message), then the LS will respond
   with a CSU message containing a CSA record which indicates a client
   state of "client entry does not exist".
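
   A minimal sketch of this behaviour is given below.  The cache is
   modelled as a plain dictionary and the function name answer_csus and
   the record layout are assumptions made for illustration; the actual
   encoding of the client state is protocol specific.

```python
NOT_FOUND = "client entry does not exist"


def answer_csus(cache, solicited_keys):
    """Return one CSA record (here a dict) per CSAS record solicited
    in a CSUS message.  `cache` maps a hypothetical key to client state."""
    reply = []
    for key in solicited_keys:
        state = cache.get(key)
        if state is None:
            # Entry vanished after Cache Alignment completed: report it
            # explicitly rather than leaving the solicitation unanswered.
            state = NOT_FOUND
        reply.append({"key": key, "state": state})
    return reply
```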




Luciani, et al.                                                [Page 12]


INTERNET-DRAFT                    SCSP                 Expires June 1997


   If an LS receives a CSU with a Receiver ID which is not equal to the
   LSID and is not set to all 0xFFs then the CSU must be discarded and
   ignored.  This is necessary since the LS may be a leaf of a point to
   multipoint connection in a point to multipoint mesh with a set of its
   DCSs.

   An LS MAY send a CSU Request to the all 0xFFs Receiver ID when the LS
   is in a point to multipoint mesh with a set of its DCSs.  If an LS
   receives a CSU Request with the all 0xFFs Receiver ID then it MUST
   use the Sender ID in the CSU Request as the Receiver ID of the CSU
   Reply (i.e., it MUST unicast its response to the sender of the
   request) when responding.  If the LS wishes to send a CSU Request to
   the all 0xFFs Receiver ID then it MUST create a time-out and
   retransmit timer for each of the DCSs which are in the point to
   multipoint mesh prior to sending the CSU Request.  If, in this case,
   the time-out and retransmit timer expires for a given DCS prior to
   acknowledgment of the CSU Request then the LS MUST retransmit using
   the specific DCSID as the Receiver ID rather than the all 0xFFs
   Receiver ID.
   Similarly, if it is necessary to send a follow up CSU Request then
   the LS MUST specify the specific DCSID as the Receiver ID rather than
   the all 0xFFs Receiver ID.
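
   The Receiver ID rules of the last two paragraphs can be sketched as
   shown below.  The function names are hypothetical and IDs are
   modelled as byte strings; timer handling itself is left out.

```python
def is_all_ff(receiver_id: bytes) -> bool:
    """True when every octet of the Receiver ID is 0xFF."""
    return len(receiver_id) > 0 and all(b == 0xFF for b in receiver_id)


def accept_csu(receiver_id: bytes, lsid: bytes) -> bool:
    """A CSU is processed only if addressed to this LS or to all 0xFFs."""
    return receiver_id == lsid or is_all_ff(receiver_id)


def reply_receiver_id(request_sender_id: bytes) -> bytes:
    """A CSU Reply is always unicast to the sender of the CSU Request."""
    return request_sender_id


def retransmit_receiver_ids(dcs_ids, acked):
    """After a time-out, retransmit to each DCS that has not yet
    acknowledged, using its specific DCSID as the Receiver ID."""
    return [d for d in dcs_ids if d not in acked]
```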

2.4 The meaning of "More Up To Date"

   During the cache alignment process and during normal CSU processing,
   a CSA record or CSAS record is compared against the contents of an
   LS's cache entry to decide whether the information contained in the
   record is more "up to date" than the corresponding cache entry of the
   LS.  Unfortunately, this is not a simple decision and is highly
   dependent on the type of server which is being synchronized with its
   peer servers.  A given server type has a particular set of search
   keys when looking up information in its database, and the choice of
   search keys is highly dependent on the given client/server protocol
   in which the server participates.  This set of keys will be referred
   to as the "search string" for the rest of this section.  The search
   string is extracted from the given record (CSA or CSAS).  The search
   string is insufficient, however, to divine the "up to date"-ness of a
   given record over that of a server's cache entry. This is because a
   server may employ the concept of "refreshing client information", and
   thus it would be difficult to tell the difference between a record
   which is being used to perform a refresh and a record which has
   merely circulated around the servers in a SG and has subsequently
   returned to a server which has already processed it.  Thus, a CSA
   record has a CSA Sequence number associated with it which indicates
   whether it is "newer" than any CSA record previously received.  This
   newness is a result of the fact that, with the exception of the use
   of fragments which are described in the CSA record packet format
   sections below, CSA Sequence numbers are assigned in a monotonically
   increasing fashion by an LS to each new CSA record (be it for new
   client information or for refresh) that an LS originates.  An LS
   originates a CSA record if it is the server which inserted the client
   information into the SG.  The LS is not an originator if the LS
   received the CSA record from one of its DCSs and merely passed on the
   CSA record to its DCSs.  Thus if an LS is not an originator for the
   CSA record then the LS does not fiddle with the CSA Sequence number
   before passing the CSA record on to its DCSs.  Note that a CSAS
   record contains a CSA Sequence number, which is partly how newness
   is measured during the cache alignment process.

   Given the previous paragraph, there would appear to be three pieces
   of information which are used in the divination of whether a record
   contains information which is more "up to date" than the information
   contained in the cache entry of an LS which is processing the record:
   1) the search string, 2) the CSA Sequence number, and 3) the
   Originator which is described by an Originator ID (OID).  Note that
   in some cases the OID may actually be part of the search string
   depending on the client/server protocol.  Given these three pieces of
   information, a record is considered to be more "up to date" than the
   information contained in the cache of an LS if one of the following
   is true:
     1) The search string does not match a cache entry in the LS
     2) The search string does match a cache entry in the LS and
        the OID in the record is the same as the OID in the LS's
        cache entry but the CSA Sequence number in the record is
        larger than the CSA Sequence number in the LS's cache
        entry
     3) The search string does match a cache entry in the LS and
        the OID in the record is different from the OID in the
        LS's cache entry
       a) If a given client/server protocol permits client
          information to be simultaneously originated by two or more
          servers then this represents a new cache entry for the LS
          and not an update of the existing one
         1) This occurs, for example, in NHRP when an NHC is registered
            with two or more NHSs
       b) If a given client/server protocol does not permit client
          information to be simultaneously originated by two or more
          servers then some other method of determination must be used
          which is specific to the client/server protocol
         1) For example, always choose either the record or the cache
            entry with the larger OID
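
   The three rules above can be sketched as a single comparison
   function.  This is an illustration under stated assumptions rather
   than a normative algorithm: the cache is modelled as a dictionary
   from search string to a dictionary of OID to CSA Sequence number,
   the function and parameter names are hypothetical, the "larger OID"
   tie-break of rule 3b is only the example given in the text, and
   sequence number wrap-around is not considered.

```python
def more_up_to_date(search, oid, seq, cache, multi_origin=False):
    """Return True if the record (search, oid, seq) is more "up to date"
    than what the LS holds in `cache`.

    cache:        {search_string: {oid: csa_sequence_number}}
    multi_origin: True if the client/server protocol permits two or more
                  servers to originate information for the same client.
    """
    entries = cache.get(search)
    if not entries:
        return True                  # rule 1: no matching cache entry
    if oid in entries:
        return seq > entries[oid]    # rule 2: same OID, larger sequence
    if multi_origin:
        return True                  # rule 3a: new entry, not an update
    # rule 3b: protocol specific; here the larger OID is chosen.
    return oid > max(entries)
```

   Note that under rule 3 the CSA Sequence numbers are not compared at
   all, since sequence numbers assigned by different originators are
   not related to one another.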

   Note that ideally, the CSAS record (including both the generic and
   client/server protocol specific parts) contains only the three pieces
   of information mentioned above and nothing else.

Discussion and conclusions

   While the above text is couched in terms of synchronizing the
   knowledge of the state of a client within the cache of servers
   contained in a SG, this solution generalizes easily to any number of
   database synchronization problems (e.g., LECS synchronization).

   If it were desirable to advertise adjacency information between
   servers then this could be done trivially by appropriating a bit in
   the CSU message which merely stated that the information contained in
   the one and only CSA record contained therein holds adjacency
   information.  Such a CSA record's protocol specific part would
   contain the OID (the LSID) and the DCSID (and potentially a link cost
   if one wanted to get fancy).  This technique might allow for an
   entity to semi-trivially obtain a topo-map of the servers in a SG.

   The appendices below show examples of how SCSP is to be implemented
   for the specified protocols.

Appendix A:  Terminology

  This appendix introduces the terminology associated with SCSP.

A.1 Abbreviations

   CA - Cache Alignment Message

   CAFSM - Cache Alignment Finite State Machine

   CID - Client ID

   CRL - CSA Request List

   CSA - Client State Advertisement

   CSAS - Client State Advertisement Summary

   CSU - Client State Update

   CSUS - Client State Update Solicit

   DCS - Directly Connected Server

   HFSM - Hello Finite State Machine

   I - Initialize bit

   LS - Local Server

   LSID - Local Server ID

   M - Master/Slave bit

   O - More bit

   RS - Remote Server

   SG - Server Group

   SID - Server ID

A.2 Definitions

   Cache Alignment message (CA message)
     These messages allow an LS to synchronize its entire cache
     with that of the cache of one of its DCSs.

   Cache Alignment Finite State Machine (CAFSM)
     The CAFSM monitors the state of the cache alignment between an LS
     and a particular DCS.  There exists one CAFSM per DCS as seen from
     an LS.

   Client ID (CID)
     The CID is a unique token which identifies a client whose state
     is being kept in a server's cache.  This value might be taken from
     the protocol address of the client.

   CSA Request List (CRL)
     When CA messages are exchanged between an LS and one of its DCSs,
     the LS makes a list of those cache entries which are more recent
     in the DCS (based on the CSA Sequence number in the CSAS record)
     than the LS's own entry and adds to that list any entry in the DCS
     which is not already in its cache.  This list is the CRL.

   Client State Advertisement record (CSA record)
     A CSA is a record within a CSU message which identifies an update
     to the status of a "particular" client.

   Client State Advertisement Summary record (CSAS record)
     A CSAS contains a summary of the information in a CSA.  A server will
     send CSAS records describing its cache entries to another server
     during the cache alignment process.  CSAS records are also included
     in a CSUS message when an LS wants to request the entire CSA from
     the DCS.  The LS is requesting the CSA from the DCS because the LS
     believes that the DCS has a more recent view of the state of the
     cache entry in question.

   Client State Update message (CSU message)
     This is a message sent from an LS to its DCSs when the LS
     becomes aware of a change in state of a client.

   Client State Update Solicit message (CSUS message)
     This message is sent by an LS to its DCS after the LS and DCS
     have exchanged CA messages.   The CSUS message contains one or more
     CSAS records which represent solicitations for entire CSA records
     (as opposed to just the summary information held in the CSAS).

   Directly Connected Server (DCS)
     The DCS is a server which is directly connected to the LS;
     e.g., there exists a VC between the LS and DCS.
     This term, along with the terms LS and RS, is used to give a frame
     of reference when talking about servers and their synchronization.
     Unless explicitly stated to the contrary, there is no implied
     difference in functionality between a DCS, LS, and RS.

   Hello Finite State Machine (HFSM)
     An LS has a HFSM associated with each of its DCSs.  The HFSM monitors
     the state of the connectivity between the LS and a particular DCS.

   Initialize bit (I bit)
     This bit is included in a CA message.  When set, this bit indicates
     that the sender of the CA wishes to negotiate for Master/Slave server
     status in the cache alignment process.

   Local Server (LS)
     The LS is the server under scrutiny; i.e., all statements are made
     from the perspective of the LS.
     This term, along with the terms DCS and RS, is used to give a frame
     of reference when talking about servers and their synchronization.
     Unless explicitly stated to the contrary, there is no implied
     difference in functionality between a DCS, LS, and RS.

   Local Server ID (LSID)
     The LSID is a unique token that identifies an LS.  This value
     might be taken from the protocol address of the LS.

   Master/Slave bit (M bit)
     This bit is included in a CA message.  When set, this bit indicates
     that the sender of the CA wishes to be Master of the cache alignment
     process.

   More bit (O bit)
     This bit is included in a CA message.  When set, this bit indicates
     that the sender of the CA has more CA messages to send above and
     beyond the message it is currently sending.

   Remote Server (RS)
     An RS is a server that is neither an LS nor a DCS.  Unless otherwise
     stated, an RS refers to a server in the SG.
     This term, along with the terms LS and DCS, is used to give a frame
     of reference when talking about servers and their synchronization.
     Unless explicitly stated to the contrary, there is no implied
     difference in functionality between a DCS, LS, and RS.

   Server Group (SG)
     The SCSP synchronizes caches (or a portion of the caches) of a set
     of server entities which are bound to a SG through some means
     (e.g., all servers belonging to a Logical IP Subnet (LIS)[1]).  Thus
     an SG is just a grouping of servers around some commonality.

   Server Group ID (SGID)
     This ID is a 32 bit identification field that uniquely identifies
     both the client/server protocol for which the servers of the SG are
     being synchronized as well as the instance of that protocol.
     This implies that multiple instances of the same protocol may be in
     operation at the same time and have their servers synchronized
     independently of each other.

   Server ID (SID)
     The SID is a unique token that identifies a given server.  This value
     might be taken from the protocol address of the server.

Appendix B:  Packet Formats

B.1 SCSP Message Formats

   This section of the appendix includes the message formats for SCSP.
   SCSP packets are LLC/SNAP encapsulated with an LLC=0xAA-AA-03 and
   OUI=0x00-00-5e and PID=0x00-05.
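
   As an illustration, the eight octets of encapsulation can be
   prepended and stripped as sketched below.  The function names are
   hypothetical; only the LLC, OUI and PID values are taken from the
   text above.

```python
LLC = bytes.fromhex("AAAA03")   # LLC  = 0xAA-AA-03
OUI = bytes.fromhex("00005E")   # OUI  = 0x00-00-5e
PID = bytes.fromhex("0005")     # PID  = 0x00-05
SNAP_HEADER = LLC + OUI + PID


def encapsulate(scsp_packet: bytes) -> bytes:
    """Prepend the LLC/SNAP header to an SCSP packet."""
    return SNAP_HEADER + scsp_packet


def decapsulate(frame: bytes) -> bytes:
    """Strip the LLC/SNAP header, rejecting frames that are not SCSP."""
    if not frame.startswith(SNAP_HEADER):
        raise ValueError("not an SCSP LLC/SNAP frame")
    return frame[len(SNAP_HEADER):]
```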

   SCSP has 3 parts to every packet: the fixed part, the mandatory part,
   and the TLV part.  The fixed part of the message exists in every
   packet and is shown below.  The mandatory part is specific to the
   particular message type (i.e., CA, CSU Request/Reply, Hello, CSUS).
   The TLV part has not yet been defined for SCSP but it will contain
   the set of TLVs for a particular SCSP message.

   Fixed Header:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Version    |  Type Code    |        Packet Size            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Checksum             |          Start Of TLVs        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Version
     This is the version of the SCSP protocol being used.  The current
     version is 1.

   Type Code
     This is the code for the message type (e.g., CA (1), CSU Request
     (2), CSU Reply (3), CSUS (4), Hello (5)).

   Packet Size
     The total length of the SCSP packet, in octets (excluding link
     layer and/or other protocol encapsulation).

   Checksum
     The standard IP checksum over the entire SCSP packet (starting with
     the fixed header).  If only the hop count field is changed, the
     checksum is adjusted without full recomputation.  The checksum is
     completely recomputed when other header fields are changed.

   Start Of TLVs
     There are no TLVs currently specified for SCSP. This field will be
     coded as zero until such time that TLVs are defined at which point
     this field will be coded with the offset from the top of the fixed
     header to the beginning of the first TLV.
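
   The fixed header and its checksum can be sketched as follows.  This
   is a minimal illustration: the names FIXED, ip_checksum,
   build_packet and verify are hypothetical, fields are assumed to be
   in network byte order, and the Checksum field is assumed to be zero
   while the checksum is being computed, as is usual for the standard
   IP checksum.

```python
import struct

# Version (8), Type Code (8), Packet Size (16), Checksum (16),
# Start Of TLVs (16): eight octets in total.
FIXED = struct.Struct("!BBHHH")


def ip_checksum(data: bytes) -> int:
    """16 bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16 bit word
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_packet(type_code: int, mandatory: bytes, version: int = 1) -> bytes:
    """Prepend a fixed header; Start Of TLVs is zero (no TLVs defined)."""
    size = FIXED.size + len(mandatory)
    zeroed = FIXED.pack(version, type_code, size, 0, 0)
    csum = ip_checksum(zeroed + mandatory)
    return FIXED.pack(version, type_code, size, csum, 0) + mandatory


def verify(packet: bytes) -> bool:
    """A packet with a correct checksum sums to zero."""
    return ip_checksum(packet) == 0
```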


B.1.1 Cache Alignment (CA)

   The Cache Alignment (CA) message allows an LS to synchronize its
   entire cache with that of the cache of its DCSs within a server
   group. The CA message type code is 1. The CA message format is as
   follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Sender ID Len | Recvr ID Len  |M|I|O|u|    Number of CSASs    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   CA  Sequence Number                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Server Group ID                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Sender ID (variable length)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Receiver ID (variable length)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        CSAS Record                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                              .......
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        CSAS Record                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Sender ID Len
     This field holds the length in octets of the Sender ID.

   Recvr ID Len
     This field holds the length in octets of the Receiver ID.

   M
     This bit is part of the negotiation process for the cache
     alignment.  When this bit is set then the sender of the CA message
     is indicating that it wishes to lead the alignment process.  This
     bit is the "Master/Slave bit".

   I
     When set, this bit indicates that the sender of the CA message
     believes that it is in a state where it is negotiating for the
     status of master or slave.  This bit is the "Initialization bit".

   O
     This bit indicates that the sender of the CA message has more CSAS
     records to send.  This implies that the cache alignment process
     must continue.  This bit is the "More bit" despite its dubious
     name.

   u
     unused

   Number of CSASs
     This field contains the number of Client State Advertisement
     Summaries (CSASs) contained in the CA message.

   CA Sequence Number
     A value which provides a unique identifier to aid in the sequencing
     of the cache alignment process.  The slave server always copies the
     sequence number from the master server's previous CA message into
     its current CA message thus acknowledging the master's CA message.
     When the slave receives a "higher" sequence number than the number
     that the slave previously sent then the slave's previous CA message
     is acknowledged.  A "larger" sequence number means a more recent CA
     message.  Note that there is a separate CA Sequence Number space
     associated with each CAFSM.

   Server Group ID
     This ID is a 32 bit identification field that uniquely identifies
     both the client/server protocol for which the servers of the SG are
     being synchronized as well as the instance of that protocol.  This
     implies that multiple instances of the same protocol may be in
     operation at the same time and have their servers synchronized
     independently of each other.

   Sender ID
     This is the protocol address of the server which is sending the CA
     message.

   Receiver ID
     This is the protocol address of the server which is to receive the
     CA message.

   CSAS record
     See Section B.1.1.1.
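
   The mandatory part of the CA message can be packed and parsed as
   sketched below.  The bit positions follow the diagram above (M, I, O
   and one unused bit, followed by a 12 bit Number of CSASs).  The
   names are hypothetical, and the sketch assumes that the variable
   length IDs follow one another without padding, which the text above
   does not state.

```python
import struct

# Sender ID Len, Recvr ID Len, flags + Number of CSASs,
# CA Sequence Number, Server Group ID.
CA_HEAD = struct.Struct("!BBHII")


def pack_ca(sender, receiver, seq, sgid, m=False, i=False, o=False, csas=()):
    """Build the mandatory part of a CA message from encoded CSAS records."""
    if len(csas) > 0x0FFF:
        raise ValueError("Number of CSASs is a 12 bit field")
    flags = (m << 15) | (i << 14) | (o << 13) | len(csas)
    head = CA_HEAD.pack(len(sender), len(receiver), flags, seq, sgid)
    return head + sender + receiver + b"".join(csas)


def unpack_ca(data):
    """Parse the mandatory part of a CA message into a dictionary."""
    slen, rlen, flags, seq, sgid = CA_HEAD.unpack_from(data)
    off = CA_HEAD.size
    sender = data[off:off + slen]
    receiver = data[off + slen:off + slen + rlen]
    return {
        "m": bool(flags & 0x8000),
        "i": bool(flags & 0x4000),
        "o": bool(flags & 0x2000),
        "count": flags & 0x0FFF,
        "seq": seq,
        "sgid": sgid,
        "sender": sender,
        "receiver": receiver,
        "csas": data[off + slen + rlen:],   # still protocol specific
    }
```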


B.1.1.1 Client State Advertisement Summary Record (CSAS record)

   CSAS records contain a generic header and a client/server protocol
   specific part.  The generic header is shown below and the
   client/server protocol specific part MUST be documented separately
   for each such protocol.  Examples of the protocol specific parts for
   NHRP and ATMARP are shown in appendices below.  See the specific
   protocol appendix or the appropriate other document for the protocol
   specific part of the CSAS.

   Note that CSAS records do not contain a Server Group ID (SGID) since
   cache alignments are performed on a per SG basis.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    CSA Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Client/Server Protocol Specific Part ...            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   CSA Sequence Number
     This field contains a sequence number that identifies the CSA
     record instance for the given client.  A "larger" sequence number
     means a more recent advertisement.


B.1.2 Client State Update Request (CSU Request)

   The Client State Update Request (CSU Request) message is used to
   update the state of cache entries in servers which are attached to
   the server sending the message.   A CSU Request message is sent from
   one server (the LS) to another directly connected server (the DCS)
   when the LS observes changes in the state of one or more clients.
   This observation may be a result of receiving a CSU from another DCS
   or as a result of some event occurring for a client that has
   registered with it.  The change in state of a "particular" client is
   noted in a CSU message via a "Client State Advertisement" (CSA)
   record within the CSU.  The CSU Request message type code is 2.  The
   CSU Request message format is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Sender ID Len | Recvr ID Len  |A|P|uuu|    Number of CSAs     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   CSU Sequence Number                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Sender ID (variable length)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Receiver ID (variable length)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         CSA Record                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                              .......
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         CSA Record                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Sender ID Len
     This field holds the length in octets of the Sender ID.

   Recvr ID Len
     This field holds the length in octets of the Receiver ID.

   A
     The A bit is not meaningful in the CSU Request and MUST be set to
     zero.  In the CSU Reply, when the A bit is set, it signifies that
     all CSA records from the CSU Request are being acknowledged, and
     the CSA records are not repeated in the CSU Reply message.  In this
     case, the Number of CSAs field is set to 0 and the CSU Sequence
     Number in the CSU Reply is used to verify the acknowledgement of
     the CSU Request which contained the same CSU Sequence Number.

   P
     When the P bit is set in a CSU Request it means that this CSU
     contains CSA records which are of highest priority and that
     processing of this CSU takes precedence over all other CSU
     processing for the SG.  The rules for setting of the P bit are on a
     per SG basis (e.g., NHRP may have a different set of rules from
     ATMARP); i.e., an implementation which sets this bit
     indiscriminately is not compliant with SCSP.  One side effect of
     setting this bit is that the CSU MUST carry exactly one SG's CSA
     records.

   Number of CSAs
     This field contains the number of Client State Advertisements
     (CSAs) contained in the CSU message.

   CSU Sequence Number
     A value which, when coupled with the address of the source,
     provides a unique identifier for the CSU Request.  This value is
     equivalent to the CSU Sequence Number in SCSP. A "larger" sequence
     number means a more recent advertisement.

   Sender ID
     This is the protocol address of the server which is sending the CSU
     message.

   Receiver ID
     This is the protocol address of the server which is to receive the
     CSU message. The use of the all 0xFFs Receiver ID is described in
     Section 2.3 for CSU messages.

   CSA Record
     See Section B.1.2.1.
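
   A sketch of the CSU header and of a CSU Reply which acknowledges
   every CSA record by means of the A bit is given below.  The names
   are hypothetical, the IDs are assumed to be carried without padding,
   and the bit positions of A and P are read from the diagram above.
   The reply uses the Sender ID of the request as its Receiver ID, as
   required in Section 2.3.

```python
import struct

# Sender ID Len, Recvr ID Len, flags + Number of CSAs, CSU Sequence Number.
CSU_HEAD = struct.Struct("!BBHI")
A_BIT = 0x8000
P_BIT = 0x4000
COUNT_MASK = 0x0FFF


def pack_csu_head(sender, receiver, seq, count, a=False, p=False):
    """Build the CSU mandatory part up to, but excluding, the CSA records."""
    if count > COUNT_MASK:
        raise ValueError("Number of CSAs is a 12 bit field")
    flags = (A_BIT if a else 0) | (P_BIT if p else 0) | count
    return CSU_HEAD.pack(len(sender), len(receiver), flags, seq) + sender + receiver


def ack_all(request: bytes, my_id: bytes) -> bytes:
    """Mandatory part of a CSU Reply with A=1: Number of CSAs is zero and
    the CSU Sequence Number is copied from the CSU Request."""
    slen, _rlen, _flags, seq = CSU_HEAD.unpack_from(request)
    requester = request[CSU_HEAD.size:CSU_HEAD.size + slen]
    return pack_csu_head(my_id, requester, seq, 0, a=True)
```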


B.1.2.1 Client State Advertisement Record (CSA record)

   CSA records contain the information necessary to relate the current
   state of a client in an SG to the servers being synchronized.  CSA
   records contain a generic header and a client/server protocol
   specific part.  The generic header is shown below and the
   client/server protocol specific part MUST be documented separately
   for each such protocol.  Examples of the protocol specific parts for
   NHRP and ATMARP are shown in appendices below.  See the specific
   protocol appendix or the appropriate other document for the protocol
   specific part of the CSA.

   Note that CSA records do contain a Server Group ID (SGID) since CSU
   messages may carry CSA records from multiple SGs.

   The amount of information carried by a specific CSA record may exceed
   the size of a link layer PDU.  Hence, such CSA records MUST be
   fragmented across a number of CSU Request messages. CSA Record
   fragments carry a 15 bit Fragment Number and an F bit which denotes
   that this fragment is the last fragment for the CSA.  Fragments MUST
   be transmitted in order of their Fragment Numbers.  All
   fragments of a CSA record MUST carry the same CSA Sequence number.
   All but the final fragment MUST have the F bit set to zero. The final
   fragment SHALL have the F bit set to one.  Complete re-assembly of a
   CSA record requires collecting all CSA record fragments referring to
   the same client information.  A CSA record MUST NOT be processed
   until it is completely re-assembled.  If the CSA Sequence Number
   changes during the re-assembly of a fragmented CSA record then the
   fragments collected thus far are discarded.  In terms of time-out and
   retransmit of CSA record fragments in CSU messages, the same rules
   apply as described in Section 2.3.

   Thus, for example, if it takes three CSU packets to contain the
   entire CSA record then the first packet has 0x0001 in the combination
   of "F" and Fragment Number fields, the next CSU packet has 0x0002 in
   those fields, and the final packet has 0x8003 in those fields.
   However, in general, fragmentation will not be necessary and the F
   bit concatenated with the Fragment Number will be coded as 0x8001
   since the Fragment Number starts at one for each CSA Record.

   It is strongly RECOMMENDED that if a CSA record cannot fit into a
   single link layer PDU then each CSA record fragment be assigned to a
   CSU Request message which carries only that CSA record fragment and
   that the CSU Reply message set the "A" flag to one (see Section
   B.1.2).
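
   The fragment numbering and re-assembly rules can be sketched as
   shown below.  This is an illustration under assumptions: the F bit
   is taken to be the most significant bit of the 16 bit field which
   also holds the 15 bit Fragment Number, one Reassembler instance is
   kept per client, and a gap in the Fragment Numbers is treated like a
   change of CSA Sequence Number (the text does not say how a gap is to
   be handled).  All names are hypothetical.

```python
F_BIT = 0x8000
NUMBER_MASK = 0x7FFF


def frag_fields(total):
    """16 bit "F + Fragment Number" values for a record of `total` fragments."""
    return [(F_BIT if n == total else 0) | n for n in range(1, total + 1)]


class Reassembler:
    """Collects the fragments of one CSA record."""

    def __init__(self):
        self.seq = None
        self.parts = []

    def add(self, frag_field, csa_seq, payload):
        """Return the whole record once the final fragment has arrived,
        otherwise None.  Nothing is processed before re-assembly ends."""
        final = bool(frag_field & F_BIT)
        number = frag_field & NUMBER_MASK
        if csa_seq != self.seq or number != len(self.parts) + 1:
            # Sequence number changed (or a gap): discard what was collected.
            self.seq, self.parts = csa_seq, []
            if number != 1:
                return None
        self.parts.append(payload)
        if not final:
            return None
        whole = b"".join(self.parts)
        self.seq, self.parts = None, []
        return whole
```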

   The content of a CSA record is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|    Fragment Number          |            TTL                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    CSA Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Server Group ID                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Client/Server Protocol Specific Part ...            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   F
     The F bit is used to denote that the current fragment is the final
     fragment of the CSA record.  When a single CSA record cannot fit
     within an SCSP PDU, it is necessary to break the CSA record up into
     fragments each of which contains the same CSA Sequence number but a
     different Fragment Number (described below).

   Fragment Number
     When a single CSA record cannot fit within an SCSP PDU, it is
     necessary to break the CSA record up into fragments each of which
     contains the same CSA Sequence number but a different Fragment
     Number.  This number starts at one and increments by one for each
     CSU packet that the CSA record is spread across.  Thus if it takes
     three CSU packets to contain the entire CSA record then the first
     packet has 0x0001 in the combination of "F" and Fragment Number
     fields, the next CSU packet has 0x0002 in those fields, and the
     final packet has 0x8003 in those fields.  In general,
     fragmentation will not be necessary and the F bit concatenated
     with the Fragment Number will be merely coded as 0x8001 for each
     CSA record.

   TTL
     Time to live for the packet.  This represents the number of hops
     that the CSA takes before it is dropped and thus at each server
     that the CSA record traverses, the TTL is decremented.

   CSA Sequence Number
     This field contains a sequence number that identifies the CSA
     record instance for the given client.  A "larger" sequence number
     means a more recent advertisement.

   Server Group ID
     This ID is a 32 bit identification field that uniquely identifies
     both the client/server protocol for which the servers of the SG are
     being synchronized as well as the instance of that protocol.  This
     implies that multiple instances of the same protocol may be in
     operation at the same time and have their servers synchronized
     independently of each other.
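
   The generic CSA record header can be packed as sketched below,
   together with the TTL handling performed when a server passes a CSA
   record on.  The names are hypothetical.  Dropping the record when
   the decremented TTL would reach zero is an assumption; the text
   above states only that the TTL is decremented at each server.

```python
import struct

# F + Fragment Number (16), TTL (16), CSA Sequence Number (32),
# Server Group ID (32): twelve octets in total.
CSA_HEAD = struct.Struct("!HHII")


def pack_csa(fragment, final, ttl, csa_seq, sgid, specific=b""):
    """Generic header followed by the protocol specific part."""
    first = (0x8000 if final else 0) | (fragment & 0x7FFF)
    return CSA_HEAD.pack(first, ttl, csa_seq, sgid) + specific


def forward(csa: bytes):
    """Decrement the TTL before passing a CSA record on.  Returns None
    when the record is to be dropped.  The CSA Sequence Number is left
    untouched since the forwarding server is not the originator."""
    first, ttl, csa_seq, sgid = CSA_HEAD.unpack_from(csa)
    if ttl <= 1:
        return None
    return CSA_HEAD.pack(first, ttl - 1, csa_seq, sgid) + csa[CSA_HEAD.size:]
```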


B.1.3 Client State Update Reply (CSU Reply)

   The Client State Update Reply (CSU Reply) message is used to
   acknowledge the reception of a Client State Update Request.  A CSU
   Reply message is sent from one server (the DCS) to the server (the
   LS) which sent the original CSU Request.  The CSU Reply message type
   code is 3.  The CSU Reply message format is the same as that of the
   CSU Request so that when a server receives a CSU Request all that
   needs to be done to reply to it is to change the type code to 3 and
   send the message back.

B.1.4 Client State Update Solicit Message (CSUS message)

   This message allows one server (LS) to solicit the entirety of CSA
   data stored in the cache of a directly connected server (DCS).  The
   DCS responds with CSU messages containing the appropriate CSAs.  The
   CSUS message type code is 4.  The CSUS message format is the same as
   that of the CA message; however, the M, I, and O bits are not
   meaningful in this context and are set to zero.  Also, the CSUS
   Sequence Number is from a different numbering space than the CA
   Sequence Number.  CSUS messages solicit CSUs from only one server of
   an SG at a time.  That SG is the one identified by the SGID in the
   CSUS header and the server is identified by the Receiver ID field in
   the CSUS header.


B.1.5 Hello:

   The Hello message is used to check connectivity between the sending
   server (the LS) and one of its directly connected neighbor servers
   (the DCSs).  The Hello message type code is 5.  The Hello message
   format is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Sender ID Len | Recvr ID Len  |  Number of Receiver IDs Heard |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         HelloInterval         |          DeadFactor           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Server Group ID                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Sender ID (variable length)                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Receiver ID (variable length)                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                           .........
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Receiver ID (variable length)                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Sender ID Len
     This field holds the length in octets of the Sender ID.

   Recvr ID Len
     This field holds the length in octets of the Receiver ID.

   Number of Receiver IDs Heard
     This field holds the number of Receiver IDs listed in this packet.

   HelloInterval
     The hello interval advertises the time between sending of
     consecutive Hello Messages by an LS.  If the time between Hello
     messages exceeds the HelloInterval then the Hello is to be
     considered late by the DCS.  On the other hand, if the LS does not
     receive a Hello Reply within its HelloInterval then the LS resends
     the same Hello message it sent previously.

   DeadFactor
     This is a multiplier to the HelloInterval. If a DCS does not
     receive a Hello message within the interval
     HelloInterval*DeadFactor from an LS that advertised the
     HelloInterval then the DCS MUST consider the LS to be stalled at
     which point the DCS should transition to the Waiting State.   On
     the other hand, if the LS does not receive a Hello Reply within
     DeadFactor*HelloInterval then one of two things happens: 1) if the
     LS has received Hello messages from the DCS during this time then
     the LS transitions to the Unidirectional State; otherwise, 2) the
     LS transitions to the Waiting State.

   Server Group ID
     This ID is a 32 bit identification field that uniquely identifies
     both the client/server protocol for which the servers of the SG are
     being synchronized as well as the instance of that protocol.  This
     implies that multiple instances of the same protocol may be in
     operation at the same time and have their servers synchronized
     independently of each other.

   Sender ID
     This is the protocol address of the server which is sending the
     Hello.

   Receiver ID
     This is the ID of a DCS from which the LS has heard a recent Hello.
     If the LS has not heard from any such DCS then the LS sets the
     "Number of Receiver IDs Heard" field to zero and allocates no
     storage for the Receiver ID in the Hello message.
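
   As an illustration of the Hello layout above, the following sketch
   packs and parses the Hello specific part of a message.  Several
   points are assumptions and not statements of this document: the
   variable length fields are written without padding, every Receiver
   ID in a message has the length given in Recvr ID Len, Recvr ID Len
   is coded as zero when no Receiver ID is listed, and HelloInterval is
   treated as a plain integer whose unit is left open.  All function
   names are hypothetical.

```python
from __future__ import annotations

import struct

# Sender ID Len (8), Recvr ID Len (8), Number of Receiver IDs Heard (16),
# HelloInterval (16), DeadFactor (16), Server Group ID (32).
HELLO_FIXED = struct.Struct("!BBHHHI")


def pack_hello(sender_id: bytes, receiver_ids: list[bytes],
               hello_interval: int, dead_factor: int, sgid: int) -> bytes:
    """Build the Hello specific part (no LLC/SNAP or SCSP Fixed part)."""
    # Assumption: Recvr ID Len is zero when no Receiver ID is listed.
    recvr_len = len(receiver_ids[0]) if receiver_ids else 0
    if any(len(rid) != recvr_len for rid in receiver_ids):
        raise ValueError("all Receiver IDs must share one length")
    fixed = HELLO_FIXED.pack(len(sender_id), recvr_len, len(receiver_ids),
                             hello_interval, dead_factor, sgid)
    return fixed + sender_id + b"".join(receiver_ids)


def unpack_hello(data: bytes) -> dict:
    """Parse bytes produced by pack_hello back into named fields."""
    s_len, r_len, heard, interval, factor, sgid = HELLO_FIXED.unpack_from(data)
    offset = HELLO_FIXED.size
    sender = data[offset:offset + s_len]
    offset += s_len
    receivers = [data[offset + i * r_len:offset + (i + 1) * r_len]
                 for i in range(heard)]
    return {"sender_id": sender, "receiver_ids": receivers,
            "hello_interval": interval, "dead_factor": factor, "sgid": sgid}


def dead_interval(hello_interval: int, dead_factor: int) -> int:
    """HelloInterval*DeadFactor, in the same unit as HelloInterval."""
    return hello_interval * dead_factor
```

   A DCS would compare the time since the last Hello from an LS against
   dead_interval() to decide whether that LS has stalled.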


B.2:  Packet Formats For NHRP

For NHRP, SCSP functionality may be obtained by using the SCSP packet
formats as described in Section B.1 (minus the LLC/SNAP and SCSP Fixed
part) by including them as the "mandatory part" of an NHRP message with
the appropriate NHRP packet type code (described in Sections B.2.1
through B.2.5).  This usage of SCSP is not the preferred method.
However, for consistency with previous revisions of SCSP, Sections B.2.1
through B.2.5 have been included below.

Section B.2.6 shows the correct format for the NHRP specific portion of
the CSA and CSAS records.

B.2.1 CA message as an NHRP mandatory part
   The NHRP CA packet has an SCSP CA message as its mandatory part and
   this NHRP packet has a type code of 11.

B.2.2 CSU Request message as an NHRP mandatory part
   The NHRP CSU Request packet has an SCSP CSU Request message as its
   mandatory part and this NHRP packet has a type code of 12.

B.2.3 CSU Reply message as an NHRP mandatory part
   The NHRP CSU Reply packet has an SCSP CSU Reply message as its
   mandatory part and this NHRP packet has a type code of 13.

B.2.4 CSU Solicit message as an NHRP mandatory part
   The NHRP CSU Solicit packet has an SCSP CSU Solicit message as its
   mandatory part and this NHRP packet has a type code of 14.

B.2.5 Hello message as an NHRP mandatory part
   The NHRP Hello packet has an SCSP Hello message as its mandatory part
   and this NHRP packet has a type code of 15.
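
   The type code assignments of Sections B.2.1 through B.2.5 amount to a
   small lookup table.  The sketch below records them in both
   directions; the dictionary and function names are hypothetical and
   the message names are simply labels chosen for the example.

```python
from __future__ import annotations

from typing import Optional

# NHRP packet type codes from Sections B.2.1 through B.2.5.
NHRP_TYPE_FOR_SCSP = {
    "CA": 11,
    "CSU Request": 12,
    "CSU Reply": 13,
    "CSU Solicit": 14,
    "Hello": 15,
}

# Reverse table, used when demultiplexing a received NHRP packet.
SCSP_FOR_NHRP_TYPE = {code: name for name, code in NHRP_TYPE_FOR_SCSP.items()}


def scsp_message_for(nhrp_type: int) -> Optional[str]:
    """Return the SCSP message carried by an NHRP type code, if any."""
    return SCSP_FOR_NHRP_TYPE.get(nhrp_type)
```

   A receiver would treat a type code for which scsp_message_for()
   returns None as not carrying an SCSP message.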

B.2.6 CSA record and CSAS record for NHRP

   The CSA record and CSAS record are protocol specific (e.g., NHRP,
   IPMC, ATMARP, etc.) because they carry protocol specific data.  This
   section describes the information carried in CSA records and CSAS
   records for NHRP.

B.2.6.1 CSA Record

   The Client State Advertisement (CSA) record contains the information
   necessary to relate the current state of a client to the servers
   being synchronized.  There are zero or more CSA records in a CSU
   Request message.  This section contains the NHRP specific portion of
   the CSA.  The NHRP specific portion of the CSA is made up of a Client
   Information Entry (CIE) as defined in [2] where the CIE Code field
   gives the "State" of the client and the previously unused field has
   been used as a "Flags" field which contains cache entry specific
   information which was registered with the server (see below for an
   example).  Appended to the CIE is an "Other State" field which
   contains other information about the cache entry. The format of the
   NHRP specific part of the CSA record is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    State      | Prefix Length |         Flags                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Maximum Transmission Unit    |        Holding Time           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Cli Addr T/L | Cli SAddr T/L | Cli Proto Len |  Preference   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Client NBMA Address (variable length)              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Client NBMA Subaddress (variable length)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Client Protocol Address (variable length)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Other State  (variable length)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   State
     This field contains a value which represents the change in state of
     the client.  For example:

       0 - Client is registered and available.

       1 - Holding timer expired for client.

       2 - Client reregistered.

       3 - Client has been purged.

       4 - No such client data in server cache

   Prefix Length
     This field is message specific.  See the relevant message sections
     below.  In general, however, this field is used to indicate that

   Flags
     Defined flags are as follows:

      0                   1
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |U|         unused              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       U
         This is the Uniqueness bit.


   Maximum Transmission Unit
     This field gives the maximum transmission unit for the relevant
     client station.  If this value is 0 then either the default MTU is
     used or the MTU negotiated via signaling is used if such
     negotiation is possible for the given NBMA.

   Holding Time
     The Holding Time field specifies the number of seconds for which
     the Next Hop NBMA information specified in the CIE is considered to
     be valid.  Cached information SHALL be discarded when the holding
     time expires.  This field must be set to 0 on a NAK.

   Cli Addr T/L
     Type & length of next hop NBMA address specified in the CIE.  This
     field is interpreted in the context of the 'address family number'
     indicated by ar$afn (e.g., ar$afn=0x0003 for ATM).

   Cli SAddr T/L
     Type & length of next hop NBMA subaddress specified in the CIE.
     This field is interpreted in the context of the 'address family
     number' indicated by ar$afn (e.g., ar$afn=0x0015 for ATM makes the
     address an E.164 and the subaddress an ATM Forum NSAP address).
     When an NBMA technology has no concept of a subaddress, the
     subaddress is always null with a length of 0.  When the address
     length is specified as 0 no storage is allocated for the address.

   Cli Proto Len
     This field holds the length in octets of the Client Protocol
     Address specified in the CIE.

   Preference
     This field specifies the preference for use of the specific CIE
     relative to other CIEs.  Higher values indicate higher preference.
     Action taken when multiple CIEs have equal or highest preference
     value is a local matter.

   Client NBMA Address
     This is the client's NBMA address.

   Client NBMA SubAddress
     This is the client's NBMA subaddress.

   Client Protocol Address
     This is the client's internetworking layer address.  This
     field is generically referred to in this document as the Client ID
     (CID).

   Other State
     At present, the other state record contains only the CSA Originator
     ID information and a place holder for Vendor private information:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |CSA Orig ID Len|      CSA Originator ID (variable length)      |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |          Vendor Private Information (variable length)         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

     CSA Orig ID Len
       This field holds the length in octets of the CSA Originator ID.

     CSA Originator ID
       This field contains the protocol address of the server which
       originated the CSA record.

     Vendor Private Information
       This is a variable length octet string which is potentially
       vendor specific.  This may be encoded in a way similar to the
       Vendor Private extension of [2].
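
   The layout of the NHRP specific part of a CSA record can be
   illustrated with the serializer below.  It is a sketch under stated
   assumptions: the variable length fields are concatenated without
   padding, the U bit is taken to be the most significant bit of the
   16 bit Flags field (bit 0 in the diagram), and the two T/L octets are
   passed through as opaque values because their internal coding is
   defined in [2] and is not reproduced here.  The class and function
   names are hypothetical.

```python
from __future__ import annotations

import struct
from enum import IntEnum


class ClientState(IntEnum):
    """Example State values listed in the text."""
    REGISTERED = 0
    HOLDING_TIMER_EXPIRED = 1
    REREGISTERED = 2
    PURGED = 3
    NO_SUCH_CLIENT = 4


U_BIT = 0x8000  # assumed: Uniqueness bit is bit 0 (MSB) of Flags

# State, Prefix Length, Flags, MTU, Holding Time, Cli Addr T/L,
# Cli SAddr T/L, Cli Proto Len, Preference.
CSA_FIXED = struct.Struct("!BBHHHBBBB")


def pack_nhrp_csa(state: ClientState, prefix_len: int, unique: bool,
                  mtu: int, holding_time: int, addr_tl: int, saddr_tl: int,
                  preference: int, nbma_addr: bytes, nbma_subaddr: bytes,
                  proto_addr: bytes, originator_id: bytes,
                  vendor_private: bytes = b"") -> bytes:
    """Serialize the NHRP specific part of one CSA record."""
    fixed = CSA_FIXED.pack(state, prefix_len, U_BIT if unique else 0, mtu,
                           holding_time, addr_tl, saddr_tl, len(proto_addr),
                           preference)
    # Other State: CSA Orig ID Len, CSA Originator ID, vendor private data.
    other_state = bytes([len(originator_id)]) + originator_id + vendor_private
    return fixed + nbma_addr + nbma_subaddr + proto_addr + other_state
```

   The caller supplies T/L octets that agree with the lengths of the
   NBMA address and subaddress it passes in; the sketch does not check
   this.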


B.2.6.2 Client State Advertisement Summary Record (CSAS record):

   The client state advertisement summary is a summarization of the CSA.
   A CSAS contains the following:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Cli Proto Len |CSA Orig ID Len|            unused             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Client Protocol Address (variable length)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     CSA Originator Protocol Address (variable length)         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Cli Proto Len
     This field holds the length in octets of the Client Protocol
     Address.

   CSA Orig ID Len
     This field holds the length in octets of the CSA Originator ID.

   Client Protocol Address
     This is the client's internetworking layer address.  This
     field is generically referred to in this document as the Client ID
     (CID).

   CSA Originator ID
     This field contains the protocol address of the server which
     originated the CSA record.
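
   Because both lengths in the CSAS record are given in octets, the
   record can be packed and parsed directly.  The sketch below assumes
   that the two addresses follow the fixed fields without padding and
   that the unused field is sent as zero; the function names are
   hypothetical.

```python
from __future__ import annotations

import struct

# Cli Proto Len (8), CSA Orig ID Len (8), unused (16).
CSAS_FIXED = struct.Struct("!BBH")


def pack_csas(client_addr: bytes, originator_id: bytes) -> bytes:
    """Serialize one CSAS record."""
    fixed = CSAS_FIXED.pack(len(client_addr), len(originator_id), 0)
    return fixed + client_addr + originator_id


def unpack_csas(data: bytes) -> tuple[bytes, bytes, int]:
    """Return (client_addr, originator_id, octets_consumed)."""
    cli_len, orig_len, _unused = CSAS_FIXED.unpack_from(data)
    start = CSAS_FIXED.size
    end = start + cli_len + orig_len
    if len(data) < end:
        raise ValueError("truncated CSAS record")
    return data[start:start + cli_len], data[start + cli_len:end], end
```

   Returning the number of octets consumed lets a caller step through
   several CSAS records that are stored back to back.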


B.3  Packet Formats For ATMARP

   For ATMARP, SCSP functionality may be obtained by using the SCSP
   packet formats as described in Section B.1 (minus the LLC/SNAP and
   SCSP Fixed part) by including them as part of an ATMARP message with
   the appropriate ATMARP packet type code (described in Sections B.3.1
   through B.3.5).  This usage of SCSP is not the preferred method.
   However, for consistency with previous revisions of SCSP, Sections
   B.3.1 through B.3.5 have been included below.  When using this method
   to obtain SCSP functionality an ATMARP header/fixed-part needs to be
   prepended to the SCSP packets, which makes them look like every other
   ATMARP packet.  The format of that header is given below.  Consult
   Section 6.6 and 6.7 of [1] for more details.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           ar$hrd              |           ar$pro              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            unused             |           ar$op               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    ATMARP "mandatory parts"                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   ar$hrd
     The "Hardware type" is assigned to ATM Forum address family and is
     19 decimal (0x0013).

   ar$pro
     The "Protocol type" is the protocol type number (see Assigned
     Numbers [7]) of the protocol using ATMARP (IP is 0x0800).

   ar$op
     The operation type value is 3 for SCSP.

   ATMARP "mandatory parts"
     This part depends on the value of ar$op.  This part/field is
     analogous to the use of the mandatory part in NHRP while the
     preceding fields are directly analogous to the "Fixed" part in
     NHRP.  See Sections B.3.1 through B.3.5 for details of packet
     content based on the value in ar$op.

Section B.3.6 shows the correct format for the ATMARP specific portion
of the CSA and CSAS records.


B.3.1 CA message as an ATMARP mandatory part
   The ATMARP CA packet has an SCSP CA message as its mandatory part and
   this ATMARP packet has an ar$op value of 3.

B.3.2 CSU Request message as an ATMARP mandatory part
   The ATMARP CSU Request packet has an SCSP CSU Request message as its
   mandatory part and this ATMARP packet has an ar$op value of 4.
   Since ATMARP clients have no concept of a sequence number, SCSP
   cannot acquire one from the client and must itself generate a
   sequence number for each client request which causes a database
   update to occur.

B.3.3 CSU Reply message as an ATMARP mandatory part
   The ATMARP CSU Reply packet has an SCSP CSU Reply message as its
   mandatory part and this ATMARP packet has an ar$op value of 5.

B.3.4 CSU Solicit message as an ATMARP mandatory part
   The ATMARP CSU Solicit packet has an SCSP CSU Solicit message as its
   mandatory part and this ATMARP packet has an ar$op value of 6.

B.3.5 Hello message as an ATMARP mandatory part
   The ATMARP Hello packet has an SCSP Hello message as its mandatory
   part and this ATMARP packet has an ar$op value of 7.
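
   Taken together, the ATMARP header diagram and the ar$op values of
   Sections B.3.1 through B.3.5 can be sketched as follows.  The sketch
   places the fixed part in front of the SCSP message, following the
   diagram; it assumes the unused field is sent as zero and defaults
   ar$pro to the IP value.  The names used are hypothetical.

```python
import struct

AR_HRD_ATM_FORUM = 0x0013   # ATM Forum address family (19 decimal)
AR_PRO_IP = 0x0800          # protocol type for IP

# ar$op values from Sections B.3.1 through B.3.5.
AR_OP_FOR_SCSP = {
    "CA": 3,
    "CSU Request": 4,
    "CSU Reply": 5,
    "CSU Solicit": 6,
    "Hello": 7,
}

# ar$hrd (16), ar$pro (16), unused (16), ar$op (16).
ATMARP_FIXED = struct.Struct("!HHHH")


def wrap_in_atmarp(message: str, scsp_body: bytes,
                   ar_pro: int = AR_PRO_IP) -> bytes:
    """Place the ATMARP fixed part in front of an SCSP message body."""
    header = ATMARP_FIXED.pack(AR_HRD_ATM_FORUM, ar_pro, 0,
                               AR_OP_FOR_SCSP[message])
    return header + scsp_body
```

   An unknown message name raises KeyError, which keeps a caller from
   emitting an ar$op value that is not listed in this appendix.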

B.3.6 CSA record and CSAS record for ATMARP

   These records are the same as those found in Sections B.2.6.1 and
   B.2.6.2 of this document with several exceptions:

     1) The Holding Time is always set to 1200 seconds.

     2) The "Cli Addr T/L" and "Cli SAddr T/L" fields are coded in a
     manner similar to ar$shtl and ar$sstl respectively as seen in
     Section 6.6 of [1].

     3) Prefix length is always set to 0xff.

     4) Preference is always set to zero.
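
   The fixed values in exceptions 1, 3 and 4 lend themselves to a
   simple consistency check on a CSA record built for ATMARP.  The
   sketch below is illustrative only; the constant and function names
   are hypothetical, and exception 2 (the T/L coding) is not covered.

```python
from __future__ import annotations

ATMARP_HOLDING_TIME = 1200    # seconds (exception 1)
ATMARP_PREFIX_LENGTH = 0xFF   # exception 3
ATMARP_PREFERENCE = 0         # exception 4


def atmarp_csa_violations(holding_time: int, prefix_length: int,
                          preference: int) -> list[str]:
    """Name each field that departs from the fixed ATMARP values."""
    expected = {
        "Holding Time": (holding_time, ATMARP_HOLDING_TIME),
        "Prefix Length": (prefix_length, ATMARP_PREFIX_LENGTH),
        "Preference": (preference, ATMARP_PREFERENCE),
    }
    return [name for name, (got, want) in expected.items() if got != want]
```

   An empty list means the three fields carry the values this section
   requires.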


Appendix C:  A Canonical Point Of Query

The following sections of this appendix describe optional Designated
Server (DS) functionality which is not completely within the realm of
server synchronization but is closely related. One use of this
Designated Server functionality might be to have a dynamically elected
server be responsible for assigning CMIs [5] to clients in an IPMC
implementation.

One way to obtain a Designated Server is described below; another is
simply to run a spanning-tree-like protocol over all servers in an SG
and choose the root server as the Designated Server.

CSU messages are used to elect the "Designated" Server (DS) from the set
of "Eligible" Servers (ESs).  A server must also be configured with its
Designated Server Priority (DSP) which relates its priority in the
election of a DS.  An ES is a server that is eligible to become the DS
by virtue of the fact that it has a DSP which is greater than zero.

C.1 Additional Abbreviations

   DS - Designated Server

   DSID - Designated Server ID

   DSP - Designated Server Priority

   ES - Eligible Server

C.2 Additional Definitions

   Designated Server (DS)
     The DS is the contact point within the SG for off-SG stations
     wishing to query the state of the SG.

   Designated Server ID (DSID)
     The DSID is a unique token that identifies the DS in an SG.  This value
     might be taken from the protocol address of the DS.

   Designated Server Priority (DSP)
     The DSP identifies the priority of a given server to become
     the DS.  If the DSP is 0 then the server is ineligible to become
     the DS.

   Eligible Server (ES)
     An ES is a server that is eligible to become the DS as a result
     of having a DSP greater than zero.

C.3 The Designated Server Functionality

   The remainder of this section assumes that the canonical point of
   query functionality is to be implemented.

C.3.1 Overview

   When an LS has one or more CAFSMs in the Aligned State, the LS
   participates in the Designated Server (DS) election process for the
   given SG.  Once a CAFSM has reached the Aligned State, the LS starts
   the DSTimer which is set to DSInitTime.  Before this DSTimer expires,
   the LS MUST NOT include a Preferred DSID or Preferred DSP in the CSU
   messages it originates.  While the DSTimer is running, the LS keeps
   track of its preferred DS from knowledge contained in its cache and
   from knowledge of its own DS Priority (DSP) and LSID.  The preferred
   DS is the server with the highest DSP and in the case of a tie, the
   largest Server ID (SID) wins.  CSU messages contain CSA records. Each
   CSA contains the following additional fields: a DS bit (which
   proclaims that the originator believes that it is DS) and a C/S bit
   (which proclaims that the cache entry refers to a Client (bit is
   zero) or a Server (bit is set to one)).  Further, if the C/S bit is
   set then the CSA also contains a Preferred DSID field and a Preferred
   DSP field. Note that clients are assumed to have a DSP of zero.
   Servers are clients of themselves in the sense of keeping their own
   state in their own cache; thus a server always advertises itself.
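
   The rule for choosing the preferred DS (highest DSP, with the
   largest Server ID breaking a tie, and a DSP of zero meaning
   ineligible) can be sketched as follows.  The sketch assumes that all
   Server IDs have the same length, so that comparing them as byte
   strings matches comparing them as unsigned numbers; the type and
   function names are hypothetical.

```python
from __future__ import annotations

from typing import NamedTuple, Optional


class Server(NamedTuple):
    sid: bytes   # Server ID; assumed to be of equal length for all servers
    dsp: int     # Designated Server Priority; 0 means ineligible


def preferred_ds(servers: list[Server]) -> Optional[Server]:
    """Pick the eligible server with the highest DSP; larger SID wins ties."""
    eligible = [s for s in servers if s.dsp > 0]
    if not eligible:
        return None
    return max(eligible, key=lambda s: (s.dsp, s.sid))
```

   The list passed in would hold the LS itself together with the
   servers known from its cache.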

C.3.2 The Election Algorithm

   When the DSTimer expires the LS chooses its preferred DS and starts
   advertising it as well as the preferred DSP.  The LS then does the
   following:
   1) If the LS thinks that it is the preferred DS then
      a) If all known servers have chosen this LS as leader then
         the LS becomes the DS (see below)
      b) If one or more servers are advertising a different DS from the LS then
         1) Start the DSOverrideTimer with DSOverrideInterval in it
         2) When the DSOverrideTimer expires
            a) If 2/3 of the servers believe the LS to be leader then
               the LS becomes the DS (see below)

   2) If the LS becomes DS it does the following:
      a) It increases its DSP by DSPIncrement or to DSPMax, whichever is less
      b) It sends out a CSU message with its new DSP in Preferred DSP field,
         its LSID in the preferred DSID field, the DS bit set, the
         Originator ID field set to its LSID, and the Originator DSP field set
         to its new DSP.

   3) At all times an LS is listening for a new DS with higher DSP than
      the current preferred DSP (and preferred DSID).

   If at any time the LS sees a DSP higher than the preferred DSP or a
   DSP which is equal to the current preferred DSP but with an
   associated DSID which is larger than the preferred DSID then the LS
   acts as follows:
   1) If the LS was the DS then
      a) The LS announces that the other server is the DS by sending out a CSU
         message with the new DS's DSID in the preferred DSID, with the new
         DS's DSP in the preferred DSP field, the DS bit set off,
         and the Originator DSP field set to its original DSP (not its
         incremented DSP).
      b) The LS sets its DSP to its original value.

   2) If the LS was not the DS then
      a) If the new preferred DS is not the LS then
         the LS simply advertises the new information pertaining to the new DS
      b) If the new preferred DS is the LS then
         restart the election process as if the DSTimer had just expired.

   If the LS loses "connectivity" with the DS (e.g., the cache entry in
   the LS for the DS is removed) then the LS acts as follows:
   1) The LS starts a Re-electionTimer
      a) If connectivity is reestablished before the timer expires then
         stop the timer and continue as normal
      b) else restart the election process as if the DSTimer had just expired

   If at any time the last CAFSM of the LS for the given SG leaves the
   Aligned State then all memory of the DS for that SG is erased from
   the LS and re-election will not take place until at least one CAFSM
   of the LS for the given SG reaches the Aligned State at which point
   the election process will start from the beginning.
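
   The decisions an LS makes in steps 1 and 2 of the election can be
   sketched as below.  Two readings are assumptions of the sketch: "2/3
   of the servers" is taken to mean at least two thirds, and the servers
   counted are the known peers whose advertised preferred DSID is passed
   in.  Timers, message transmission and the re-election cases are not
   modelled, and the function names and return labels are hypothetical.

```python
from __future__ import annotations


def on_ds_timer_expiry(my_id: bytes, my_choice: bytes,
                       peer_choices: list[bytes]) -> str:
    """Step 1, evaluated when the DSTimer expires."""
    if my_choice != my_id:
        return "advertise"             # another server is preferred
    if all(choice == my_id for choice in peer_choices):
        return "become-ds"             # step 1a
    return "start-override-timer"      # step 1b(1)


def on_override_expiry(my_id: bytes, peer_choices: list[bytes]) -> bool:
    """Step 1b(2): True if at least 2/3 of the peers chose this LS."""
    votes = sum(1 for choice in peer_choices if choice == my_id)
    return 3 * votes >= 2 * len(peer_choices)


def promoted_dsp(dsp: int, dsp_increment: int, dsp_max: int) -> int:
    """Step 2a: raise the DSP by DSPIncrement, capped at DSPMax."""
    return min(dsp + dsp_increment, dsp_max)
```

   Integer arithmetic is used for the two thirds test so that the
   result does not depend on floating point rounding.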

C.3.3  Message Additions

   A CSU message carries 0 or more CSA records.  When designated server
   functionality is used, CSA records have the following fields appended
   to them:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | DS Proto Len  |   Pref DSP    |            unused             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Preferred Designated NHS Protocol Address          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DS Proto Len
     This field holds the length in octets of the Preferred Designated
     NHS's Protocol Address.

   Pref DSP
     This field contains the priority of the preferred Designated NHS as
     seen from the perspective of the server creating the CSA record.
     This field does not exist in a record when the C/S bit is zero.

   Preferred DSID
     This field contains the ID of the preferred Designated Server as
     seen from the perspective of the server creating the CSA record.
     This field does not exist in a record when the C/S bit is zero.
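
   The appended fields can be serialized as in the sketch below.  It
   assumes that the whole block, including DS Proto Len and the unused
   field, is omitted when the C/S bit is zero, and that the unused field
   is sent as zero; the text above states only that Pref DSP and
   Preferred DSID are absent in that case.  The position of the DS and
   C/S bits is not modelled, and the function name is hypothetical.

```python
import struct

# DS Proto Len (8), Pref DSP (8), unused (16).
DS_FIXED = struct.Struct("!BBH")


def pack_ds_fields(is_server: bool, pref_dsp: int = 0,
                   pref_dsid: bytes = b"") -> bytes:
    """Return the appended DS fields, or nothing when the C/S bit is zero."""
    if not is_server:
        return b""
    return DS_FIXED.pack(len(pref_dsid), pref_dsp, 0) + pref_dsid
```

   The result would be appended to a CSA record that describes a
   server.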



References

[1] "Classical IP and ARP over ATM", Laubach, RFC 1577.

[2] "NBMA Next Hop Resolution Protocol (NHRP)", Luciani, Katz, Piscitello,
    Cole, draft-ietf-rolc-nhrp-08.txt.

[3] "OSPF Version 2", Moy, RFC 1583.

[4] "PNNI Draft Specification", Dykeman, Goguen, ATM Forum 94-0471R16
    (Straw Vote), 1996.

[5] "Support for Multicast over UNI 3.0/3.1 based ATM Networks",
    Armitage, draft-ietf-ipatm-ipmc-12.txt.

[6] "LAN Emulation over ATM Version 2 - LNNI specification - Draft 3",
    ATM Forum 95-1082R3, April 1996.

[7] "Assigned Numbers", Reynolds, Postel, RFC 1700.


Acknowledgments

   This I-D is a distillation of issues raised during private
   discussions, on the IP-ATM mailing list, and during the Dallas IETF
   (12/95). Thanks to all who have contributed but particular thanks to
   Andy Malis, Raj Nair, and Matthew Doar of Ascom Nexion.  I would also
   like to thank James Watt of Newbridge for comments that led to a
   tighter document.

Authors' Addresses

   James V. Luciani
   Bay Networks, Inc.
   3 Federal Street, BL3-04
   Billerica, MA  01821
   phone: +1-508-439-4734
   email: luciani@baynetworks.com

   Grenville Armitage
   Bellcore, 445 South Street
   Morristown, NJ, 07960
   Email: gja@thumper.bellcore.com
   Ph. +1 201 829 2635

   Joel M. Halpern
   Newbridge Networks Corp.
   593 Herndon Parkway
   Herndon, VA 22070-5241
   Phone: +1-703-708-5954
   Email: jhalpern@Newbridge.COM