Multi-party Chat Using the Message Session Relay Protocol (MSRP)
draft-ietf-simple-chat-13
The information below is for an old version of the document.
Document | Type | This is an older version of an Internet-Draft that was ultimately published as RFC 7701. | |
---|---|---|---|
Authors | Miguel Angel García , Geir Sandbakken , Aki Niemi | ||
Last updated | 2012-02-13 (Latest revision 2012-01-23) | ||
Replaces | draft-niemi-simple-chat | ||
RFC stream | Internet Engineering Task Force (IETF) | ||
Reviews | SECDIR Telechat review by Vincent Roca: Ready w/nits | ||
Additional resources | Mailing list discussion | ||
Stream | WG state | WG Document | |
Document shepherd | (None) | ||
IESG | IESG state | Became RFC 7701 (Proposed Standard) | |
Consensus boilerplate | Unknown | ||
Telechat date | (None) | ||
Responsible AD | Gonzalo Camarillo | ||
Send notices to | simple-chairs@tools.ietf.org, draft-ietf-simple-chat@tools.ietf.org |
draft-ietf-simple-chat-13
Network Working Group
Internet-Draft
Intended status: Standards Track
Expires: July 26, 2012

A. Niemi, Nokia
M. Garcia-Martin, Ericsson
G. Sandbakken, Ed., Cisco Systems

January 23, 2012

Multi-party Chat Using the Message Session Relay Protocol (MSRP)
draft-ietf-simple-chat-13

Abstract

The Message Session Relay Protocol (MSRP) defines a mechanism for sending instant messages within a peer-to-peer session, negotiated using the Session Initiation Protocol (SIP) and the Session Description Protocol (SDP). This document defines the necessary tools for establishing multi-party chat sessions, or chat rooms, using MSRP.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on July 26, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust

RFC 3530 NFS version 4 Protocol April 2003

- The timestamp of when the NFS version 4 software was first installed on the client (though this is subject to the previously mentioned caution about using information that is stored in a file, because the file might only be accessible over NFS version 4).

- A true random number. However, since this number ought to be the same between client incarnations, this shares the same problem as that of using the timestamp of the software installation.
As a security measure, the server MUST NOT cancel a client's leased state if the principal that established the state for a given id string is not the same as the principal issuing the SETCLIENTID.

Note that SETCLIENTID and SETCLIENTID_CONFIRM have a secondary purpose of establishing the information the server needs to make callbacks to the client for the purpose of supporting delegations. It is permitted to change this information via SETCLIENTID and SETCLIENTID_CONFIRM within the same incarnation of the client without removing the client's leased state.

Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully completed, the client uses the shorthand client identifier, of type clientid4, instead of the longer and less compact nfs_client_id4 structure. This shorthand client identifier (a clientid) is assigned by the server and should be chosen so that it will not conflict with a clientid previously assigned by the server. This applies across server restarts or reboots. When a clientid is presented to a server and that clientid is not recognized, as would happen after a server reboot, the server will reject the request with the error NFS4ERR_STALE_CLIENTID. When this happens, the client must obtain a new clientid by use of the SETCLIENTID operation and then proceed to any other necessary recovery for the server reboot case (see the section "Server Failure and Recovery"). The client must also employ the SETCLIENTID operation when it receives an NFS4ERR_STALE_STATEID error using a stateid derived from its current clientid, since this also indicates a server reboot which has invalidated the existing clientid (see the next section "lock_owner and stateid Definition" for details).

See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM for a complete specification of the operations.

Shepler, et al. Standards Track [Page 68]

8.1.2. Server Release of Clientid

If the server determines that the client holds no associated state for its clientid, the server may choose to release the clientid. The server may make this choice for an inactive client so that resources are not consumed by those intermittently active clients. If the client contacts the server after this release, the server must ensure the client receives the appropriate error so that it will use the SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new identity. It should be clear that the server must be very hesitant to release a clientid since the resulting work on the client to recover from such an event will be the same burden as if the server had failed and restarted. Typically a server would not release a clientid unless there had been no activity from that client for many minutes.

Note that if the id string in a SETCLIENTID request is properly constructed, and if the client takes care to use the same principal for each successive use of SETCLIENTID, then, barring an active denial of service attack, NFS4ERR_CLID_INUSE should never be returned.

However, client bugs, server bugs, or perhaps a deliberate change of the principal owner of the id string (such as the case of a client that changes security flavors, and under the new flavor, there is no mapping to the previous owner) will in rare cases result in NFS4ERR_CLID_INUSE.

In that event, when the server gets a SETCLIENTID for a client id that currently has no state, or that has state but whose lease has expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST allow the SETCLIENTID, and confirm the new clientid if followed by the appropriate SETCLIENTID_CONFIRM.

8.1.3. lock_owner and stateid Definition

When requesting a lock, the client must present to the server the clientid and an identifier for the owner of the requested lock.
These two fields are referred to as the lock_owner, and the definitions of those fields are:

o A clientid returned by the server as part of the client's use of the SETCLIENTID operation.

o A variable length opaque array used to uniquely define the owner of a lock managed by the client. This may be a thread id, process id, or other unique value.

When the server grants the lock, it responds with a unique stateid. The stateid is used as a shorthand reference to the lock_owner, since the server will be maintaining the correspondence between them.

The server is free to form the stateid in any manner that it chooses as long as it is able to recognize invalid and out-of-date stateids. This requirement includes those stateids generated by earlier instances of the server. From this, the client can be properly notified of a server restart. This notification will occur when the client presents a stateid to the server from a previous instantiation.

The server must be able to distinguish the following situations and return the error as specified:

o The stateid was generated by an earlier server instance (i.e., before a server reboot). The error NFS4ERR_STALE_STATEID should be returned.

o The stateid was generated by the current server instance but the stateid no longer designates the current locking state for the lockowner-file pair in question (i.e., one or more locking operations have occurred). The error NFS4ERR_OLD_STATEID should be returned. This error condition will only occur when the client issues a locking request which changes a stateid while an I/O request that uses that stateid is outstanding.

o The stateid was generated by the current server instance but the stateid does not designate a locking state for any active lockowner-file pair. The error NFS4ERR_BAD_STATEID should be returned.
This error condition will occur when there has been a logic error on the part of the client or server. This should not happen.

One mechanism that may be used to satisfy these requirements is for the server to,

o divide the "other" field of each stateid into two fields:

  - A server verifier which uniquely designates a particular server instantiation.

  - An index into a table of locking-state structures.

o utilize the "seqid" field of each stateid, such that seqid is monotonically incremented for each stateid that is associated with the same index into the locking-state table.

By matching the incoming stateid and its field values with the state held at the server, the server is able to easily determine if a stateid is valid for its current instantiation and state. If the stateid is not valid, the appropriate error can be supplied to the client.

8.1.4. Use of the stateid and Locking

All READ, WRITE and SETATTR operations contain a stateid. For the purposes of this section, SETATTR operations which change the size attribute of a file are treated as if they are writing the area between the old and new size (i.e., the range truncated or added to the file by means of the SETATTR), even where SETATTR is not explicitly mentioned in the text.

If the lock_owner performs a READ or WRITE in a situation in which it has established a lock or share reservation on the server (any OPEN constitutes a share reservation) the stateid (previously returned by the server) must be used to indicate what locks, including both record locks and share reservations, are held by the lockowner. If no state is established by the client, either record lock or share reservation, a stateid of all bits 0 is used.
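As a rough illustration of the verifier/index/seqid mechanism described in Section 8.1.3, the following Python sketch classifies an incoming stateid. The `Stateid` layout, the `LockTable` class, and its method names are assumptions made for this example and are not defined by the protocol; only the three error names come from the text. Treating a seqid newer than the server's current value as a bad stateid is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict

NFS4_OK = "NFS4_OK"
NFS4ERR_STALE_STATEID = "NFS4ERR_STALE_STATEID"
NFS4ERR_OLD_STATEID = "NFS4ERR_OLD_STATEID"
NFS4ERR_BAD_STATEID = "NFS4ERR_BAD_STATEID"


@dataclass(frozen=True)
class Stateid:
    seqid: int     # incremented for each stateid tied to the same table index
    verifier: int  # hypothetical: designates one server instantiation
    index: int     # hypothetical: slot in the locking-state table


class LockTable:
    """Toy locking-state table for a single server instance."""

    def __init__(self, verifier: int) -> None:
        self.verifier = verifier
        self.current_seqid: Dict[int, int] = {}  # index -> latest seqid

    def grant(self, index: int) -> Stateid:
        """Create or update locking state and return the resulting stateid."""
        seqid = self.current_seqid.get(index, 0) + 1
        self.current_seqid[index] = seqid
        return Stateid(seqid, self.verifier, index)

    def check(self, sid: Stateid) -> str:
        """Map an incoming stateid to one of the outcomes listed above."""
        if sid.verifier != self.verifier:
            return NFS4ERR_STALE_STATEID  # issued by an earlier server instance
        latest = self.current_seqid.get(sid.index)
        if latest is None or sid.seqid > latest:
            return NFS4ERR_BAD_STATEID    # no such locking state on this instance
        if sid.seqid < latest:
            return NFS4ERR_OLD_STATEID    # superseded by a later locking operation
        return NFS4_OK
```

In this sketch a server restart is modeled by constructing a new `LockTable` with a different verifier, so every stateid issued earlier is reported as stale.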
Regardless of whether a stateid of all bits 0 or a stateid returned by the server is used, if there is a conflicting share reservation or mandatory record lock held on the file, the server MUST refuse to service the READ or WRITE operation.

Share reservations are established by OPEN operations and by their nature are mandatory in that when the OPEN denies READ or WRITE operations, that denial results in such operations being rejected with error NFS4ERR_LOCKED. Record locks may be implemented by the server as either mandatory or advisory, or the choice of mandatory or advisory behavior may be determined by the server on the basis of the file being accessed (for example, some UNIX-based servers support a "mandatory lock bit" on the mode attribute such that if set, record locks are required on the file before I/O is possible). When record locks are advisory, they only prevent the granting of conflicting lock requests and have no effect on READs or WRITEs. Mandatory record locks, however, prevent conflicting I/O operations. When such I/O operations are attempted, they are rejected with NFS4ERR_LOCKED. When the client gets NFS4ERR_LOCKED on a file it knows it has the proper share reservation for, it will need to issue a LOCK request on the region of the file that includes the region the I/O was to be performed on, with an appropriate locktype (i.e., READ*_LT for a READ operation, WRITE*_LT for a WRITE operation).

With NFS version 3, there was no notion of a stateid so there was no way to tell if the application process of the client sending the READ or WRITE operation had also acquired the appropriate record lock on the file. Thus there was no way to implement mandatory locking. With the stateid construct, this barrier has been removed.

Note that for UNIX environments that support mandatory file locking, the distinction between advisory and mandatory locking is subtle.
In fact, advisory and mandatory record locks are exactly the same insofar as the APIs and requirements on implementation are concerned. If the mandatory lock attribute is set on the file, the server checks to see if the lockowner has an appropriate shared (read) or exclusive (write) record lock on the region it wishes to read or write to. If there is no appropriate lock, the server checks if there is a conflicting lock (which can be done by attempting to acquire the conflicting lock on the behalf of the lockowner, and if successful, release the lock after the READ or WRITE is done), and if there is, the server returns NFS4ERR_LOCKED.

For Windows environments, there are no advisory record locks, so the server always checks for record locks during I/O requests.

Thus, the NFS version 4 LOCK operation does not need to distinguish between advisory and mandatory record locks. It is the NFS version 4 server's processing of the READ and WRITE operations that introduces the distinction.

Every stateid other than the special stateid values noted in this section, whether returned by an OPEN-type operation (i.e., OPEN, OPEN_DOWNGRADE), or by a LOCK-type operation (i.e., LOCK or LOCKU), defines an access mode for the file (i.e., READ, WRITE, or READ-WRITE) as established by the original OPEN which began the stateid sequence, and as modified by subsequent OPENs and OPEN_DOWNGRADEs within that stateid sequence. When a READ, WRITE, or SETATTR which specifies the size attribute is done, the operation is subject to checking against the access mode to verify that the operation is appropriate given the OPEN with which the operation is associated.

In the case of WRITE-type operations (i.e., WRITEs and SETATTRs which set size), the server must verify that the access mode allows writing and return an NFS4ERR_OPENMODE error if it does not.
In the case of READ, the server may perform the corresponding check on the access mode, or it may choose to allow READ on opens for WRITE only, to accommodate clients whose write implementation may unavoidably do reads (e.g., due to buffer cache constraints). However, even if READs are allowed in these circumstances, the server MUST still check for locks that conflict with the READ (e.g., another open specifying denial of READs). Note that a server which does enforce the access mode check on READs need not explicitly check for conflicting share reservations since the existence of OPEN for read access guarantees that no conflicting share reservation can exist.

A stateid of all bits 1 (one) MAY allow READ operations to bypass locking checks at the server. However, WRITE operations with a stateid with bits all 1 (one) MUST NOT bypass locking checks and are treated exactly the same as if a stateid of all bits 0 were used.

A lock may not be granted while a READ or WRITE operation using one of the special stateids is being performed and the range of the lock request conflicts with the range of the READ or WRITE operation. For the purposes of this paragraph, a conflict occurs when a shared lock is requested and a WRITE operation is being performed, or an exclusive lock is requested and either a READ or a WRITE operation is being performed. A SETATTR that sets size is treated similarly to a WRITE as discussed above.

8.1.5. Sequencing of Lock Requests

Locking is different than most NFS operations as it requires "at-most-one" semantics that are not provided by ONCRPC. ONCRPC over a reliable transport is not sufficient because a sequence of locking requests may span multiple TCP connections. In the face of retransmission or reordering, lock or unlock requests must have a well defined and consistent behavior.
To accomplish this, each lock request contains a sequence number that is a consecutively increasing integer. Different lock_owners have different sequences. The server maintains the last sequence number (L) received and the response that was returned. The first request issued for any given lock_owner is issued with a sequence number of zero.

Note that for requests that contain a sequence number, for each lock_owner, there should be no more than one outstanding request.

If a request (r) with a previous sequence number (r < L) is received, it is rejected with the return of error NFS4ERR_BAD_SEQID. Given a properly-functioning client, the response to (r) must have been received before the last request (L) was sent. If a duplicate of last request (r == L) is received, the stored response is returned. If a request beyond the next sequence (r == L + 2) is received, it is rejected with the return of error NFS4ERR_BAD_SEQID. Sequence history is reinitialized whenever the SETCLIENTID/SETCLIENTID_CONFIRM sequence changes the client verifier.

Since the sequence number is represented with an unsigned 32-bit integer, the arithmetic involved with the sequence number is mod 2^32. For an example of modulo arithmetic involving sequence numbers see [RFC793].

It is critical the server maintain the last response sent to the client to provide a more reliable cache of duplicate non-idempotent requests than that of the traditional cache described in [Juszczak]. The traditional duplicate request cache uses a least recently used algorithm for removing unneeded requests. However, the last lock request and response on a given lock_owner must be cached as long as the lock state exists on the server.

The client MUST monotonically increment the sequence number for the CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE operations.
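The server-side sequence check described in this section can be sketched in Python as follows. This is a minimal illustration, not a server implementation: `SeqidTracker`, `OwnerState`, and the `execute` callback are hypothetical names, open confirmation (Section 8.1.8) is ignored, and rejecting every value other than L and L + 1 generalizes the r < L and r == L + 2 cases in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

MOD = 2 ** 32  # sequence numbers are unsigned 32-bit, arithmetic is mod 2^32
NFS4ERR_BAD_SEQID = "NFS4ERR_BAD_SEQID"


@dataclass
class OwnerState:
    last_seqid: Optional[int] = None     # L: last sequence number received
    last_response: Optional[str] = None  # response that was returned for L


class SeqidTracker:
    """Per-lock_owner sequence checking with a one-entry replay cache."""

    def __init__(self) -> None:
        self.owners: Dict[bytes, OwnerState] = {}

    def handle(self, owner: bytes, r: int, execute: Callable[[], str]) -> str:
        state = self.owners.setdefault(owner, OwnerState())
        if state.last_seqid is None:
            expected = 0  # first request for a lock_owner uses sequence zero
        else:
            if r == state.last_seqid:
                # Duplicate of the last request: return the stored response
                # without executing the operation again.
                return state.last_response
            expected = (state.last_seqid + 1) % MOD
        if r != expected:
            return NFS4ERR_BAD_SEQID  # older than L or beyond the next sequence
        response = execute()
        state.last_seqid = r
        state.last_response = response
        return response
```

Keeping the last response per lock_owner, rather than in a least-recently-used cache, is what lets a retransmitted non-idempotent request be answered safely for as long as the lock state exists.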
This requirement holds even when the previous operation that used the sequence number received an error. The only exception to this rule is if the previous operation received one of the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID, NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR, NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE.

8.1.6. Recovery from Replayed Requests

As described above, the sequence number is per lock_owner. As long as the server maintains the last sequence number received and follows the methods described above, there are no risks of a Byzantine router re-sending old requests. The server need only maintain the (lock_owner, sequence number) state as long as there are open files or closed files with locks outstanding.

LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence number and therefore the risk of the replay of these operations resulting in undesired effects is non-existent while the server maintains the lock_owner state.

8.1.7. Releasing lock_owner State

When a particular lock_owner no longer holds open or file locking state at the server, the server may choose to release the sequence number state associated with the lock_owner. The server may make this choice based on lease expiration, for the reclamation of server memory, or other implementation specific details. In any event, the server is able to do this safely only when the lock_owner no longer is being utilized by the client. The server may choose to hold the lock_owner state in the event that retransmitted requests are received. However, the period to hold this state is implementation specific.

In the case that a LOCK, LOCKU, OPEN_DOWNGRADE, or CLOSE is retransmitted after the server has previously released the lock_owner state, the server will find that the lock_owner has no files open and an error will be returned to the client.
If the lock_owner does have a file open, the stateid will not match and again an error is returned to the client.

8.1.8. Use of Open Confirmation

In the case that an OPEN is retransmitted and the lock_owner is being used for the first time or the lock_owner state has been previously released by the server, the use of the OPEN_CONFIRM operation will prevent incorrect behavior. When the server observes the use of the lock_owner for the first time, it will direct the client to perform the OPEN_CONFIRM for the corresponding OPEN. This sequence establishes the use of a lock_owner and associated sequence number. Since the OPEN_CONFIRM sequence connects a new open_owner on the server with an existing open_owner on a client, the sequence number may have any value. The OPEN_CONFIRM step assures the server that the value received is the correct one. See the section "OPEN_CONFIRM - Confirm Open" for further details.

There are a number of situations in which the requirement to confirm an OPEN would pose difficulties for the client and server, in that they would be prevented from acting in a timely fashion on information received, because that information would be provisional, subject to deletion upon non-confirmation. Fortunately, these are situations in which the server can avoid the need for confirmation when responding to open requests. The two constraints are:

o The server must not bestow a delegation for any open which would require confirmation.

o The server MUST NOT require confirmation on a reclaim-type open (i.e., one specifying claim type CLAIM_PREVIOUS or CLAIM_DELEGATE_PREV).

These constraints are related in that reclaim-type opens are the only ones in which the server may be required to send a delegation. For CLAIM_NULL, sending the delegation is optional while for CLAIM_DELEGATE_CUR, no delegation is sent.
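The two constraints above can be expressed as a pair of small predicates. The following Python sketch is illustrative only: the function names and the `owner_known` flag (meaning the server already holds sequence state for the lock_owner) are assumptions introduced for this example, and a real server would apply additional policy before granting a delegation.

```python
CLAIM_NULL = "CLAIM_NULL"
CLAIM_PREVIOUS = "CLAIM_PREVIOUS"
CLAIM_DELEGATE_CUR = "CLAIM_DELEGATE_CUR"
CLAIM_DELEGATE_PREV = "CLAIM_DELEGATE_PREV"

# Reclaim-type opens, as listed in the second constraint.
RECLAIM_CLAIMS = {CLAIM_PREVIOUS, CLAIM_DELEGATE_PREV}


def needs_confirmation(claim: str, owner_known: bool) -> bool:
    """Return True if the server would direct the client to OPEN_CONFIRM."""
    if claim in RECLAIM_CLAIMS:
        return False        # MUST NOT require confirmation on reclaim-type opens
    return not owner_known  # first use of the lock_owner, or its state was released


def may_delegate(claim: str, owner_known: bool) -> bool:
    """Return True if a delegation is permitted (not required) for this open."""
    if needs_confirmation(claim, owner_known):
        return False        # no delegation on an open that requires confirmation
    if claim == CLAIM_DELEGATE_CUR:
        return False        # no delegation is sent for this claim type
    return True
```

Because reclaim-type opens never require confirmation, the first predicate can never block a delegation that the server is required to send during reclaim.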
Delegations being sent with an open requiring confirmation are troublesome because recovering from non-confirmation adds undue complexity to the protocol while requiring confirmation on reclaim-type opens poses difficulties in that the inability to resolve the status of the reclaim until lease expiration may make it difficult to have timely determination of the set of locks being reclaimed (since the grace period may expire).

Requiring open confirmation on reclaim-type opens is avoidable because of the nature of the environments in which such opens are done. For CLAIM_PREVIOUS opens, this is immediately after server reboot, so there should be no time for lockowners to be created, found to be unused, and recycled. For CLAIM_DELEGATE_PREV opens, we are dealing with a client reboot situation. A server which supports delegation can be sure that no lockowners for that client have been recycled since client initialization and thus can ensure that confirmation will not be required.

8.2. Lock Ranges

The protocol allows a lock owner to request a lock with a byte range and then either upgrade or unlock a sub-range of the initial lock. It is expected that this will be an uncommon type of request. In any case, servers or server filesystems may not be able to support sub-range lock semantics. In the event that a server receives a locking request that represents a sub-range of current locking state for the lock owner, the server is allowed to return the error NFS4ERR_LOCK_RANGE to signify that it does not support sub-range lock operations. Therefore, the client should be prepared to receive this error and, if appropriate, report the error to the requesting application.
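As a sketch of the sub-range case described above, the following Python fragment shows how a server that lacks sub-range support might detect such a request. The `(offset, length)` tuple representation, the helper names, and the `supports_sub_range` flag are assumptions made for this example; a request identical to a held range is not treated as a sub-range here.

```python
from typing import Iterable, Tuple

Range = Tuple[int, int]  # hypothetical representation: (offset, length), length > 0


def is_sub_range(request: Range, held: Range) -> bool:
    """True if request lies strictly inside held (identical ranges excluded)."""
    r_start, r_end = request[0], request[0] + request[1]
    h_start, h_end = held[0], held[0] + held[1]
    inside = h_start <= r_start and r_end <= h_end
    return inside and (r_start, r_end) != (h_start, h_end)


def check_lock_request(request: Range,
                       held_by_owner: Iterable[Range],
                       supports_sub_range: bool) -> str:
    """Return NFS4ERR_LOCK_RANGE for sub-range requests the server cannot handle."""
    if not supports_sub_range:
        if any(is_sub_range(request, held) for held in held_by_owner):
            return "NFS4ERR_LOCK_RANGE"
    return "NFS4_OK"
```

A client receiving `NFS4ERR_LOCK_RANGE` from such a server would, as the text says, report the error to the requesting application where appropriate.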
The client is discouraged from combining multiple independent locking ranges that happen to be adjacent into a single request since the server may not support sub-range requests and for reasons related to the recovery of file locking state in the event of server failure. As discussed in the section "Server Failure and Recovery" below, the server may employ certain optimizations during recovery that work effectively only when the client's behavior during lock recovery is similar to the client's locking behavior prior to server failure.

8.3. Upgrading and Downgrading Locks

If a client has a write lock on a record, it can request an atomic downgrade of the lock to a read lock via the LOCK request, by setting the type to READ_LT. If the server supports atomic downgrade, the request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP. The client should be prepared to receive this error, and if appropriate, report the error to the requesting application.

If a client has a read lock on a record, it can request an atomic upgrade of the lock to a write lock via the LOCK request by setting the type to WRITE_LT or WRITEW_LT. If the server does not support atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade can be achieved without an existing conflict, the request will succeed. Otherwise, the server will return either NFS4ERR_DENIED or NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the client issued the LOCK request with the type set to WRITEW_LT and the server has detected a deadlock. The client should be prepared to receive such errors and if appropriate, report the error to the requesting application.

8.4. Blocking Locks

Some clients require the support of blocking locks. The NFS version 4 protocol must not rely on a callback mechanism and therefore is unable to notify a client when a previously denied lock has been granted.
Clients have no choice but to continually poll for the lock. This presents a fairness problem. Two new lock types are added, READW and WRITEW, and are used to indicate to the server that the client is requesting a blocking lock. The server should maintain an ordered list of pending blocking locks. When the conflicting lock is released, the server may wait the lease period for the first waiting client to re-request the lock. After the lease period expires the next waiting client request is allowed the lock. Clients are required to poll at an interval sufficiently small that it is likely to acquire the lock in a timely manner. The server is not required to maintain a list of pending blocked locks as it is used to increase fairness and not correct operation. Because of the unordered nature of crash recovery, storing of lock state to stable storage would be required to guarantee ordered granting of blocking locks.

Servers may also note the lock types and delay returning denial of the request to allow extra time for a conflicting lock to be released, allowing a successful return. In this way, clients can avoid the burden of needlessly frequent polling for blocking locks. The server should take care in the length of delay in the event the client retransmits the request.

8.5. Lease Renewal

The purpose of a lease is to allow a server to remove stale locks that are held by a client that has crashed or is otherwise unreachable. It is not a mechanism for cache consistency and lease renewals may not be denied if the lease interval has not expired.

's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Niemi, et al. Expires July 26, 2012 [Page 1] Internet-Draft Multi-party Chat MSRP January 2012

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

1. Introduction
2. Terminology
3. Motivations and Requirements
4. Overview of Operation
5. Creating, Joining, and Deleting a Chat Room
   5.1. Creating a Chat Room
   5.2. Joining a Chat Room
   5.3. Deleting a Chat Room
6. Sending and Receiving Instant Messages
   6.1. Regular Messages
   6.2. Private Messages
   6.3. MSRP reports and responses
7. Nicknames
   7.1. Using Nicknames within a Conference
   7.2. Modifying a Nickname
   7.3. Removing a Nickname
   7.4. Nicknames in Conference Event Packages
8. The SDP 'chatroom' attribute
9. Examples
   9.1. Joining a chat room
   9.2. Setting up a nickname
   9.3. Sending a regular message to the chat room
   9.4. Sending a private message to a participant
   9.5. Chunked private message
   9.6. Nickname in a conference information document
10. IANA Considerations
   10.1. New MSRP Method
   10.2. New MSRP Header
   10.3. New MSRP Status Codes
   10.4. New SDP Attribute
11. Security Considerations
12. Contributors
13. Acknowledgments
14. References
   14.1. Normative References
   14.2. Informative References
Authors' Addresses

1. Introduction

The Message Session Relay Protocol (MSRP) [RFC4975] defines a mechanism for sending a series of instant messages within a session.
The Session Initiation Protocol (SIP) [RFC3261] in combination with the Session Description Protocol (SDP) [RFC4566] allows for two peers to establish and manage such sessions.

In another application of SIP, a user agent can join in a multi-party conversation called a conference that is hosted by a specialized user agent called a focus [RFC4353]. Such a conference can naturally involve MSRP sessions. It is the responsibility of an entity handling the media to relay instant messages received from one participant to the rest of the participants in the conference.

Several such systems already exist in the Internet. Participants in a chat room can be identified with a pseudonym or nickname, and decide whether their real identifier is disclosed to other participants. Participants can also use a rich set of features such as the ability to send private instant messages to other participants.

Similar conferences supporting chat rooms are already available today. For example, Internet Relay Chat (IRC) [RFC2810], Extensible Messaging and Presence Protocol (XMPP): Core [RFC6120] based chat rooms, and many other proprietary systems provide chat room functionality. Specifying equivalent functionality for MSRP-based systems provides competitive features and enables interworking between the systems.

This document defines requirements, conventions, and extensions for providing private messages and nickname management in centralized conferences with MSRP. Participants in a chat room can be identified by a pseudonym, and decide if their real identifier is disclosed to other participants.

This memo uses the SIP Conferencing Framework [RFC4353] as a design basis. It also aims to be compatible with A Framework for Centralized Conferencing [RFC5239]. Should requirements arise, future mechanisms for providing similar functionality in generic conferences might be developed, for example, where the media is not restricted to MSRP.
The mechanisms described in this document provide a future-compatible short-term solution for MSRP centralized conferences.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119, BCP 14 [RFC2119], and indicate requirement levels for compliant implementations.

This memo deals with tightly coupled SIP conferences defined in SIP Conferencing Framework [RFC4353] and adopts the terminology from that document. In addition to that terminology, we introduce some new terms:

Nickname: a pseudonym or descriptive name associated with a participant. See Section 7 for details.

Multi-party chat: an instance of a tightly coupled conference, in which the media exchanged between the participants consist of MSRP based instant messages. Also known as a chat room.

Chat Room: a synonym for a multi-party chat.

Chat Room URI: a URI that identifies a particular chat room, and is a synonym of a Conference URI defined in RFC 4353 [RFC4353].

Sender: the conference participant that originally created an instant message and sent it to the chat room for delivery.

Recipient: the destination conference participant(s). This defaults to the full conference participant list, minus the IM Sender.

MSRP switch: a media level entity that is an MSRP endpoint. It is a special MSRP endpoint that receives MSRP messages, and delivers them to the other conference participants. The MSRP switch has a similar role to a conference mixer with the exception that the MSRP switch does not actually "mix" together different input media streams; it merely relays the messages between participants.

Private Instant Message: an instant message sent in a chat room intended for a single participant.
A private IM is usually rendered distinctly from the rest of the IMs, indicating that the message was a private communication. Anonymous URI: a URI concealing the participant's SIP AOR from the other participants in the conference. The allocation of such a URI is out of scope of this specification. An anonymous URI must be valid for the length of the conference, and will be utilized by the MSRP switch to forward messages to and from anonymous participants. Niemi, et al. Expires July 26, 2012 [Page 5] Internet-Draft Multi-party Chat MSRP January 2012 Conference Event Package: a notification mechanism that allows conference participants to learn conference information including roster and state changes in a conference. This would typically be A Session Initiation Protocol (SIP) Event Package for Conference State [RFC4575] or Conference Event Package Data Format Extension for Centralized Conferencing [I-D.ietf-xcon-event-package]. 3. Motivations and Requirements Although conference frameworks describing many types of conferencing applications already exist, such as the Framework for Centralized Conferencing [RFC5239] and the SIP Conferencing Framework [RFC4353], the exact details of session-based instant messaging conferences are not well-defined at the moment. To allow interoperable chat implementations, for both conference- aware, and conference-unaware user agents, certain conventions for MSRP conferences need to be defined. It also seems beneficial to provide a set of features that enhance the baseline multi-party MSRP in order to be able to create systems that have functionality on par with existing chat systems, as well as enable building interworking gateways to these existing chat systems. We define the following requirements: REQ-1: A basic requirement is the existence of a multi-party conference, where participants can join and leave the conference and get instant messages exchanged to the rest of the participants. 
REQ-2: A conference participant must be able to determine the identifiers of the sender and recipient of the received IMs. Note that the actual identifiers depend on those which were selected by the sender or recipient when he or she joined the conference.

REQ-3: A conference participant must be able to determine the recipient of the received message. For instance, the recipient of the message might be the entire conference or a single participant of the conference (i.e., a private message).

REQ-4: It must be possible to send a message to a single participant within the conference (i.e., a private instant message).

REQ-5: A conference participant may have a nickname or pseudonym associated with their real identifier.

REQ-6: It must be possible for a participant to change their nickname during the progress of the conference.

REQ-7: It must be possible that a participant is only known by an anonymous identifier and not their real identifier to the rest of the conference.

REQ-8: It must be possible for the conference participants to learn the chat room capabilities described in this document.

4. Overview of Operation

In order to set up a conference, one must first be created. Users wishing to host a conference themselves can of course do just that; their User Agent (UA) simply morphs from an ordinary UA into a special purpose one called a Focus UA. Another commonly used setup is one where a dedicated node in the network functions as a Focus UA.

Each chat room has an identifier of its own: a SIP URI that participants use to join the conference, e.g., by sending an INVITE request. The conference focus processes the invitations, and as such, maintains SIP dialogs with each participant.

In a multi-party chat, or chat room, MSRP is one of the established media streams.
Each conference participant establishes an MSRP session with the MSRP switch, which is a special purpose MSRP application. The MSRP sessions can be relayed by one or more MSRP relays, which are specified in RFC 4976 [RFC4976]. This is illustrated in Figure 1.

Figure 1: Multi-party chat overview shown with MSRP Relays and a conference Focus UA (elements shown: SIP/MSRP clients, MSRP Relays, Conference Focus UA, MSRP Switch, SIP Dialogs, MSRP Sessions)

The MSRP switch is similar to a conference mixer in that it handles media sessions with each of the participants and bridges these streams together. However, unlike a conference mixer, the MSRP switch merely forwards messages between participants but doesn't actually mix the streams in any way. The system is illustrated in Figure 2.

The following events cause implicit renewal of all of the leases for a given client (i.e., all those sharing a given clientid). Each of these is a positive indication that the client is still active and that the associated state held at the server, for the client, is still valid.

o An OPEN with a valid clientid.

o Any operation made with a valid stateid (CLOSE, DELEGPURGE, DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, READ, RENEW, SETATTR, WRITE). This does not include the special stateids of all bits 0 or all bits 1.
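As a rough illustration of the implicit renewal rule above, the following Python sketch keeps a single expiration time per clientid and pushes it forward whenever a qualifying operation arrives. The LeaseTable class, its method names, and the injectable clock are assumptions made for this example; they are not defined by the protocol.

```python
import time


class LeaseTable:
    """Hypothetical server-side helper: one common lease expiration per clientid."""

    def __init__(self, lease_period, clock=time.monotonic):
        self.lease_period = lease_period  # seconds
        self.clock = clock                # injectable so the sketch can be tested
        self.expiry = {}                  # clientid -> expiration time

    def establish(self, clientid):
        """Start a lease for a clientid (assumed to be already confirmed)."""
        self.expiry[clientid] = self.clock() + self.lease_period

    def is_valid(self, clientid):
        """True while the clientid is known and its lease has not run out."""
        return clientid in self.expiry and self.clock() < self.expiry[clientid]

    def on_operation(self, clientid):
        """Called for an OPEN with a valid clientid or any operation that
        carries a valid stateid. Renews every lease of the client at once,
        because all of them share one expiration time.

        This sketch simply refuses to renew an unknown or expired lease.
        """
        if not self.is_valid(clientid):
            return False
        self.expiry[clientid] = self.clock() + self.lease_period
        return True
```

Because the table stores one value per client, the cost of a renewal does not depend on how many locks the client holds.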
Note that if the client had restarted or rebooted, the client would not be making these requests without issuing the SETCLIENTID/SETCLIENTID_CONFIRM sequence. The use of the SETCLIENTID/SETCLIENTID_CONFIRM sequence (one that changes the client verifier) notifies the server to drop the locking state associated with the client. SETCLIENTID/SETCLIENTID_CONFIRM never renews a lease. If the server has rebooted, the stateids (NFS4ERR_STALE_STATEID error) or the clientid (NFS4ERR_STALE_CLIENTID error) will not be valid hence preventing spurious renewals. This approach allows for low overhead lease renewal which scales well. In the typical case no extra RPC calls are required for lease renewal and in the worst case one RPC is required every lease period (i.e., a RENEW operation). The number of locks held by the client is not a factor since all state for the client is involved with the lease renewal action. Since all operations that create a new lease also renew existing leases, the server must maintain a common lease expiration time for all valid leases for a given client. This lease time can then be easily updated upon implicit lease renewal actions. 8.6. Crash Recovery The important requirement in crash recovery is that both the client and the server know when the other has failed. Additionally, it is required that a client sees a consistent view of data across server restarts or reboots. All READ and WRITE operations that may have been queued within the client or network buffers must wait until the client has successfully recovered the locks protecting the READ and WRITE operations. Shepler, et al. Standards Track [Page 78] RFC 3530 NFS version 4 Protocol April 2003 8.6.1. Client Failure and Recovery In the event that a client fails, the server may recover the client's locks when the associated leases have expired. Conflicting locks from another client may only be granted after this lease expiration. 
If the client is able to restart or reinitialize within the lease period the client may be forced to wait the remainder of the lease period before obtaining new locks. To minimize client delay upon restart, lock requests are associated with an instance of the client by a client supplied verifier. This verifier is part of the initial SETCLIENTID call made by the client. The server returns a clientid as a result of the SETCLIENTID operation. The client then confirms the use of the clientid with SETCLIENTID_CONFIRM. The clientid in combination with an opaque owner field is then used by the client to identify the lock owner for OPEN. This chain of associations is then used to identify all locks for a particular client. Since the verifier will be changed by the client upon each initialization, the server can compare a new verifier to the verifier associated with currently held locks and determine that they do not match. This signifies the client's new instantiation and subsequent loss of locking state. As a result, the server is free to release all locks held which are associated with the old clientid which was derived from the old verifier. Note that the verifier must have the same uniqueness properties of the verifier for the COMMIT operation. 8.6.2. Server Failure and Recovery If the server loses locking state (usually as a result of a restart or reboot), it must allow clients time to discover this fact and re- establish the lost locking state. The client must be able to re- establish the locking state without having the server deny valid requests because the server has granted conflicting access to another client. Likewise, if there is the possibility that clients have not yet re-established their locking state for a file, the server must disallow READ and WRITE operations for that file. The duration of this recovery period is equal to the duration of the lease period. 
A client can determine that server failure (and thus loss of locking state) has occurred, when it receives one of two errors. The NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a reboot or restart. The NFS4ERR_STALE_CLIENTID error indicates a Shepler, et al. Standards Track [Page 79] RFC 3530 NFS version 4 Protocol April 2003 clientid invalidated by reboot or restart. When either of these are received, the client must establish a new clientid (See the section "Client ID") and re-establish the locking state as discussed below. The period of special handling of locking and READs and WRITEs, equal in duration to the lease period, is referred to as the "grace period". During the grace period, clients recover locks and the associated state by reclaim-type locking requests (i.e., LOCK requests with reclaim set to true and OPEN operations with a claim type of CLAIM_PREVIOUS). During the grace period, the server must reject READ and WRITE operations and non-reclaim locking requests (i.e., other LOCK and OPEN operations) with an error of NFS4ERR_GRACE. If the server can reliably determine that granting a non-reclaim request will not conflict with reclamation of locks by other clients, the NFS4ERR_GRACE error does not have to be returned and the non- reclaim client request can be serviced. For the server to be able to service READ and WRITE operations during the grace period, it must again be able to guarantee that no possible conflict could arise between an impending reclaim locking request and the READ or WRITE operation. If the server is unable to offer that guarantee, the NFS4ERR_GRACE error must be returned to the client. For a server to provide simple, valid handling during the grace period, the easiest method is to simply reject all non-reclaim locking requests and READ and WRITE operations by returning the NFS4ERR_GRACE error. However, a server may keep information about granted locks in stable storage. 
With this information, the server could determine if a regular lock or READ or WRITE operation can be safely processed. For example, if a count of locks on a given file is available in stable storage, the server can track reclaimed locks for the file and when all reclaims have been processed, non-reclaim locking requests may be processed. This way the server can ensure that non-reclaim locking requests will not conflict with potential reclaim requests. With respect to I/O requests, if the server is able to determine that there are no outstanding reclaim requests for a file by information from stable storage or another similar mechanism, the processing of I/O requests could proceed normally for the file. To reiterate, for a server that allows non-reclaim lock and I/O requests to be processed during the grace period, it MUST determine that no lock subsequently reclaimed will be rejected and that no lock subsequently reclaimed would have prevented any I/O operation processed during the grace period. Shepler, et al. Standards Track [Page 80] RFC 3530 NFS version 4 Protocol April 2003 Clients should be prepared for the return of NFS4ERR_GRACE errors for non-reclaim lock and I/O requests. In this case the client should employ a retry mechanism for the request. A delay (on the order of several seconds) between retries should be used to avoid overwhelming the server. Further discussion of the general issue is included in [Floyd]. The client must account for the server that is able to perform I/O and non-reclaim locking requests within the grace period as well as those that can not do so. A reclaim-type locking request outside the server's grace period can only succeed if the server can guarantee that no conflicting lock or I/O request has been granted since reboot or restart. A server may, upon restart, establish a new value for the lease period. 
Therefore, clients should, once a new clientid is established, refetch the lease_time attribute and use it as the basis for lease renewal for the lease associated with that server. However, the server must establish, for this restart event, a grace period at least as long as the lease period for the previous server instantiation. This allows the client state obtained during the previous server instance to be reliably re-established.

8.6.3. Network Partitions and Recovery

If the duration of a network partition is greater than the lease period provided by the server, the server will not have received a lease renewal from the client. If this occurs, the server may free all locks held for the client. As a result, all stateids held by the client will become invalid or stale. Once the client is able to reach the server after such a network partition, all I/O submitted by the client with the now invalid stateids will fail with the server returning the error NFS4ERR_EXPIRED. Once this error is received, the client will suitably notify the application that held the lock.

As a courtesy to the client or as an optimization, the server may continue to hold locks on behalf of a client for which recent communication has extended beyond the lease period. If the server receives a lock or I/O request that conflicts with one of these courtesy locks, the server must free the courtesy lock and grant the new request.

When a network partition is combined with a server reboot, there are edge conditions that place requirements on the server in order to avoid silent data corruption following the server reboot. Two of these edge conditions are known, and are discussed below.

The first edge condition has the following scenario:

1. Client A acquires a lock.

2. Client A and server experience mutual network partition, such that client A is unable to renew its lease.

3. Client A's lease expires, so server releases lock.

4. Client B acquires a lock that would have conflicted with that of Client A.

5. Client B releases the lock.

6. Server reboots.

7. Network partition between client A and server heals.

8. Client A issues a RENEW operation, and gets back a NFS4ERR_STALE_CLIENTID.

9. Client A reclaims its lock within the server's grace period.

Thus, at the final step, the server has erroneously granted client A's lock reclaim. If client B modified the object the lock was protecting, client A will experience object corruption.

The second known edge condition follows:

1. Client A acquires a lock.

2. Server reboots.

3. Client A and server experience mutual network partition, such that client A is unable to reclaim its lock within the grace period.

4. Server's reclaim grace period ends. Client A has no locks recorded on server.

5. Client B acquires a lock that would have conflicted with that of Client A.

6. Client B releases the lock.

7. Server reboots a second time.

8. Network partition between client A and server heals.

9. Client A issues a RENEW operation, and gets back a NFS4ERR_STALE_CLIENTID.

10. Client A reclaims its lock within the server's grace period.

As with the first edge condition, the final step of the scenario of the second edge condition has the server erroneously granting client A's lock reclaim.

Solving the first and second edge conditions requires that the server either assume after it reboots that an edge condition has occurred, and thus return NFS4ERR_NO_GRACE for all reclaim attempts, or that the server record some information in stable storage. The amount of information the server records in stable storage is in inverse proportion to how harsh the server wants to be whenever the edge conditions occur.
The server that is completely tolerant of all edge conditions will record in stable storage every lock that is acquired, removing the lock record from stable storage only when the lock is unlocked by the client and the lock's lockowner advances the sequence number such that the lock release is not the last stateful event for the lockowner's sequence. For the two aforementioned edge conditions, the harshest a server can be, and still support a grace period for reclaims, requires that the server record some minimal information in stable storage. For example, a server implementation could, for each client, save in stable storage a record containing:

o the client's id string

o a boolean that indicates if the client's lease expired or if there was administrative intervention (see the section, Server Revocation of Locks) to revoke a record lock, share reservation, or delegation

o a timestamp that is updated the first time after a server boot or reboot the client acquires record locking, share reservation, or delegation state on the server. The timestamp need not be updated on subsequent lock requests until the server reboots.

The server implementation would also record in the stable storage the timestamps from the two most recent server reboots.

Assuming the above record keeping, for the first edge condition, after the server reboots, the record that client A's lease expired means that another client could have acquired a conflicting record lock, share reservation, or delegation. Hence the server must reject a reclaim from client A with the error NFS4ERR_NO_GRACE.

For the second edge condition, after the server reboots for a second time, the record that the client had an unexpired record lock, share reservation, or delegation established before the server's previous incarnation means that the server must reject a reclaim from client A with the error NFS4ERR_NO_GRACE.
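The record keeping above can be turned into a reclaim decision along the following lines. This is a minimal Python sketch of one possible reading of the text, not a definitive implementation: the ClientRecord type, the check_reclaim function, the use of strings for status names, and the choice to reject a client that has no record at all are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Status names are kept as strings; numeric codes are deliberately omitted.
NFS4_OK = "NFS4_OK"
NFS4ERR_NO_GRACE = "NFS4ERR_NO_GRACE"


@dataclass
class ClientRecord:
    """Per-client record kept in stable storage (hypothetical layout)."""
    client_id: str                  # the client's id string
    lease_expired_or_revoked: bool  # lease expired or administrative revocation
    first_state_time: float         # first state acquisition after a (re)boot


def check_reclaim(record: Optional[ClientRecord], previous_boot: float) -> str:
    """Decide a reclaim request during the grace period.

    previous_boot is the older of the two most recent reboot timestamps,
    i.e. the start of the server's previous incarnation.
    """
    if record is None:
        # Assumption: nothing on record, so be conservative
        # (false positives are acceptable).
        return NFS4ERR_NO_GRACE
    if record.lease_expired_or_revoked:
        # First edge condition: another client may have taken a
        # conflicting lock after the lease expired.
        return NFS4ERR_NO_GRACE
    if record.first_state_time < previous_boot:
        # Second edge condition: the state predates the previous
        # incarnation and was never re-established during it.
        return NFS4ERR_NO_GRACE
    return NFS4_OK
```

For example, with a previous boot at time 100, a client whose record carries a timestamp of 105 and no expiry flag would be allowed to reclaim, while the same record with the flag set, or with a previous boot at time 200, would be rejected.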
Regardless of the level and approach to record keeping, the server MUST implement one of the following strategies (which apply to reclaims of share reservations, record locks, and delegations):

1. Reject all reclaims with NFS4ERR_NO_GRACE. This is superharsh, but necessary if the server does not want to record lock state in stable storage.

2. Record sufficient state in stable storage such that all known edge conditions involving server reboot, including the two noted in this section, are detected. False positives are acceptable. Note that at this time, it is not known if there are other edge conditions.

In the event that, after a server reboot, the server determines that there is unrecoverable damage or corruption to the stable storage, then for all clients and/or locks affected, the server MUST return NFS4ERR_NO_GRACE.

A mandate for the client's handling of the NFS4ERR_NO_GRACE error is outside the scope of this specification, since the strategies for such handling are very dependent on the client's operating environment. However, one potential approach is described below.

When the client receives NFS4ERR_NO_GRACE, it could examine the change attribute of the objects the client is trying to reclaim state for, and use that to determine whether to re-establish the state via normal OPEN or LOCK requests. This is acceptable provided the client's operating environment allows it. In other words, the client implementor is advised to document the behavior for users. The client could also inform the application that its record lock or share reservations (whether they were delegated or not) have been lost, such as via a UNIX signal, a GUI pop-up window, etc. See the section, "Data Caching and Revocation" for a discussion of what the client should do for dealing with unreclaimed delegations on client state.

For further discussion of revocation of locks see the section "Server Revocation of Locks".

8.7. Recovery from a Lock Request Timeout or Abort

In the event a lock request times out, a client may decide to not retry the request. The client may also abort the request when the process for which it was issued is terminated (e.g., in UNIX due to a signal). It is possible though that the server received the request and acted upon it. This would change the state on the server without the client being aware of the change. It is paramount that the client re-synchronize state with the server before it attempts any other operation that takes a seqid and/or a stateid with the same lock_owner. This is straightforward to do without a special re-synchronize operation.

Since the server maintains the last lock request and response received on the lock_owner, for each lock_owner, the client should cache the last lock request it sent such that the lock request did not receive a response. From this, the next time the client does a lock operation for the lock_owner, it can send the cached request, if there is one, and if the request was one that established state (e.g., a LOCK or OPEN operation), the server will return the cached result or, if it never saw the request, perform it. The client can follow up with a request to remove the state (e.g., a LOCKU or CLOSE operation). With this approach, the sequencing and stateid information on the client and server for the given lock_owner will re-synchronize and in turn the lock state will re-synchronize.

8.8. Server Revocation of Locks

At any point, the server can revoke locks held by a client and the client must be prepared for this event. When the client detects that its locks have been or may have been revoked, the client is responsible for validating the state information between itself and the server. Validating locking state for the client means that it must verify or reclaim state for each lock currently held.
The first instance of lock revocation is upon server reboot or re- initialization. In this instance the client will receive an error (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will proceed with normal crash recovery as described in the previous section. The second lock revocation event is the inability to renew the lease before expiration. While this is considered a rare or unusual event, the client must be prepared to recover. Both the server and client will be able to detect the failure to renew the lease and are capable of recovering without data corruption. For the server, it tracks the last renewal event serviced for the client and knows when the lease will expire. Similarly, the client must track operations which will Shepler, et al. Standards Track [Page 85] RFC 3530 NFS version 4 Protocol April 2003 renew the lease period. Using the time that each such request was sent and the time that the corresponding reply was received, the client should bound the time that the corresponding renewal could have occurred on the server and thus determine if it is possible that a lease period expiration could have occurred. The third lock revocation event can occur as a result of administrative intervention within the lease period. While this is considered a rare event, it is possible that the server's administrator has decided to release or revoke a particular lock held by the client. As a result of revocation, the client will receive an error of NFS4ERR_ADMIN_REVOKED. In this instance the client may assume that only the lock_owner's locks have been lost. The client notifies the lock holder appropriately. The client may not assume the lease period has been renewed as a result of failed operation. When the client determines the lease period may have expired, the client must mark all locks held for the associated lease as "unvalidated". This means the client has been unable to re-establish or confirm the appropriate lock state with the server. 
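The client-side bookkeeping described above can be sketched as follows. This Python example is an illustration under stated assumptions: the ClientLease class and its method names are hypothetical, times are plain numbers on the client's own clock, and the sketch takes the send time of the last successful request as the conservative (earliest) moment at which the server could have renewed the lease.

```python
class ClientLease:
    """Hypothetical client-side view of one lease and the locks under it."""

    def __init__(self, lease_period):
        self.lease_period = lease_period
        self.last_renewal_sent = None  # send time of last successful renewing request
        self.locks = {}                # lock name -> "valid" or "unvalidated"

    def add_lock(self, name):
        self.locks[name] = "valid"

    def record_reply(self, sent_at, succeeded):
        """Note a reply to a lease-renewing request.

        The renewal happened on the server somewhere between send and reply.
        The earlier bound (the send time) is kept so that the client never
        overestimates the lease. A failed operation renews nothing.
        """
        if succeeded:
            if self.last_renewal_sent is None or sent_at > self.last_renewal_sent:
                self.last_renewal_sent = sent_at

    def may_have_expired(self, now):
        """True if the lease could already have run out on the server."""
        if self.last_renewal_sent is None:
            return True
        return now >= self.last_renewal_sent + self.lease_period

    def check(self, now):
        """Mark every lock 'unvalidated' once expiry is possible."""
        if self.may_have_expired(now):
            for name in self.locks:
                self.locks[name] = "unvalidated"
        return self.locks
```

A lock marked "unvalidated" here stays that way until the client has confirmed it with the server; the confirmation step itself is outside this sketch.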
As described in the previous section on crash recovery, there are scenarios in which the server may grant conflicting locks after the lease period has expired for a client. When it is possible that the lease period has expired, the client must validate each lock currently held to ensure that a conflicting lock has not been granted. The client may accomplish this task by issuing an I/O request, either a pending I/O or a zero-length read, specifying the stateid associated with the lock in question. If the response to the request is success, the client has validated all of the locks governed by that stateid and re-established the appropriate state between itself and the server.

If the I/O request is not successful, then one or more of the locks associated with the stateid was revoked by the server and the client must notify the owner.

8.9. Share Reservations

A share reservation is a mechanism to control access to a file. It is a separate and independent mechanism from record locking. When a client opens a file, it issues an OPEN operation to the server specifying the type of access required (READ, WRITE, or BOTH) and the type of access to deny others (deny NONE, READ, WRITE, or BOTH). If the OPEN fails the client will fail the application's open request.

Pseudo-code definition of the semantics:

    if (request.access == 0)
        return (NFS4ERR_INVAL)
    else if ((request.access & file_state.deny) ||
             (request.deny & file_state.access))
        return (NFS4ERR_DENIED)

This checking of share reservations on OPEN is done with no exception for an existing OPEN for the same open_owner.
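The pseudo-code above can be made runnable as in the following Python sketch. The function name check_share, the string status values, and the NFS4_OK result for the non-conflicting case are assumptions made for this example; the bit values follow those the protocol defines for the access and deny fields of OPEN.

```python
# Access and deny bits as defined for the OPEN operation.
OPEN4_SHARE_ACCESS_READ = 0x00000001
OPEN4_SHARE_ACCESS_WRITE = 0x00000002
OPEN4_SHARE_ACCESS_BOTH = 0x00000003

OPEN4_SHARE_DENY_NONE = 0x00000000
OPEN4_SHARE_DENY_READ = 0x00000001
OPEN4_SHARE_DENY_WRITE = 0x00000002
OPEN4_SHARE_DENY_BOTH = 0x00000003


def check_share(request_access, request_deny, file_access, file_deny):
    """Apply the share reservation check for a new OPEN.

    file_access and file_deny are the bits currently in effect on the file.
    Status values are returned as strings (an assumption of this sketch).
    """
    if request_access == 0:
        return "NFS4ERR_INVAL"
    # Conflict if the request wants access someone denies, or denies
    # access that someone already holds.
    if (request_access & file_deny) or (request_deny & file_access):
        return "NFS4ERR_DENIED"
    return "NFS4_OK"
```

For instance, if a file is already open for READ with deny WRITE, a second OPEN asking for WRITE access conflicts with the existing deny bits, while a second OPEN asking for READ access with deny NONE does not.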
The constants used for the OPEN and OPEN_DOWNGRADE operations for the access and deny fields are as follows:

    const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
    const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
    const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;

    const OPEN4_SHARE_DENY_NONE     = 0x00000000;
    const OPEN4_SHARE_DENY_READ     = 0x00000001;
    const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
    const OPEN4_SHARE_DENY_BOTH     = 0x00000003;

8.10. OPEN/CLOSE Operations

To provide correct share semantics, a client MUST use the OPEN operation to obtain the initial filehandle and indicate the desired access and what if any access to deny. Even if the client intends to use a stateid of all 0's or all 1's, it must still obtain the filehandle for the regular file with the OPEN operation so the appropriate share semantics can be applied. For clients that do not have a deny mode built into their open programming interfaces, deny equal to NONE should be used.

The OPEN operation with the CREATE flag also subsumes the CREATE operation for regular files as used in previous versions of the NFS protocol. This allows a create with a share to be done atomically.

The CLOSE operation removes all share reservations held by the lock_owner on that file. If record locks are held, the client SHOULD release all locks before issuing a CLOSE. The server MAY free all outstanding locks on CLOSE but some servers may not support the CLOSE of a file that still has record locks held. The server MUST return failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the CLOSE.

The LOOKUP operation will return a filehandle without establishing any lock state on the server. Without a valid stateid, the server will assume the client has the least access. For example, a file opened with deny READ/WRITE cannot be accessed using a filehandle obtained through LOOKUP because it would not have a valid stateid (i.e., using a stateid of all bits 0 or all bits 1).

8.10.1. Close and Retention of State Information

Since a CLOSE operation requests deallocation of a stateid, dealing with retransmission of the CLOSE may pose special difficulties, since the state information, which normally would be used to determine the state of the open file being designated, might be deallocated, resulting in an NFS4ERR_BAD_STATEID error.

Servers may deal with this problem in a number of ways. To provide the greatest degree of assurance that the protocol is being used properly, a server should, rather than deallocate the stateid, mark it as close-pending, and retain the stateid with this status, until later deallocation. In this way, a retransmitted CLOSE can be recognized since the stateid points to state information with this distinctive status, so that it can be handled without error.

When adopting this strategy, a server should retain the state information until the earliest of:

o Another validly sequenced request for the same lockowner, that is not a retransmission.

o The time that a lockowner is freed by the server due to a period with no activity.

o All locks for the client are freed as a result of a SETCLIENTID.

Servers may avoid this complexity, at the cost of less complete protocol error checking, by simply responding NFS4_OK in the event of a CLOSE for a deallocated stateid, on the assumption that this case must be caused by a retransmitted close. When adopting this approach, it is desirable to at least log an error when returning a no-error indication in this situation. If the server maintains a reply-cache mechanism, it can verify the CLOSE is indeed a retransmission and avoid error logging in most cases.

8.11. Open Upgrade and Downgrade

When an OPEN is done for a file and the lockowner for which the open is being done already has the file open, the result is to upgrade the open file status maintained on the server to include the access and deny bits specified by the new OPEN as well as those for the existing OPEN. The result is that there is one open file, as far as the protocol is concerned, and it includes the union of the access and deny bits for all of the OPEN requests completed. Only a single CLOSE will be done to reset the effects of both OPENs. Note that the client, when issuing the OPEN, may not know that the same file is in fact being opened. The above only applies if both OPENs result in the OPENed object being designated by the same filehandle.

When the server chooses to export multiple filehandles corresponding to the same file object and returns different filehandles on two different OPENs of the same file object, the server MUST NOT "OR" together the access and deny bits and coalesce the two open files. Instead the server must maintain separate OPENs with separate stateids and will require separate CLOSEs to free them.

When multiple open files on the client are merged into a single open file object on the server, the close of one of the open files (on the client) may necessitate change of the access and deny status of the open file on the server. This is because the union of the access and deny bits for the remaining opens may be smaller (i.e., a proper subset) than previously. The OPEN_DOWNGRADE operation is used to make the necessary change and the client should use it to update the server so that share reservation requests by other clients are handled properly.

8.12. Short and Long Leases

When determining the time period for the server lease, the usual lease tradeoffs apply.
Short leases are good for fast server recovery at a cost of increased RENEW or READ (with zero length) requests. Longer leases are certainly kinder and gentler to servers trying to handle very large numbers of clients. The number of RENEW requests drops in proportion to the lease time. The disadvantages of long leases are slower recovery after server failure (the server must wait for the leases to expire and the grace period to elapse before granting new lock requests) and increased file contention (if a client fails to transmit an unlock request then the server must wait for lease expiration before granting new locks).

Long leases are usable if the server is able to store lease state in non-volatile memory. Upon recovery, the server can reconstruct the lease state from its non-volatile memory and continue operation with its clients and therefore long leases would not be an issue.

8.13. Clocks, Propagation Delay, and Calculating Lease Expiration

To avoid the need for synchronized clocks, lease times are granted by the server as a time delta. However, there is a requirement that the client and server clocks do not drift excessively over the duration of the lock. There is also the issue of propagation delay across the network which could easily be several hundred milliseconds as well as the possibility that requests will be lost and need to be retransmitted.

To take propagation delay into account, the client should subtract it from lease times (e.g., if the client estimates the one-way propagation delay as 200 msec, then it can assume that the lease is already 200 msec old when it gets it). In addition, it will take another 200 msec to get a response back to the server. So the client must send a lock renewal or write data back to the server 400 msec before the lease would expire.
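
The arithmetic in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration, not part of the protocol: the function name renewal_deadline_ms, the use of integer milliseconds, and the 90 second lease in the usage line are all assumptions made for the example. It subtracts one propagation delay for the age of the lease on arrival and a second one for the time the renewal needs to reach the server.

```python
def renewal_deadline_ms(received_at_ms: int, lease_ms: int, one_way_delay_ms: int) -> int:
    """Return the latest local time (in ms) at which a renewal should be sent.

    Hypothetical helper: the lease is treated as already one propagation
    delay old when the client receives it, and the renewal needs one more
    propagation delay to reach the server.
    """
    if lease_ms <= 0:
        raise ValueError("lease_ms must be positive")
    if one_way_delay_ms < 0:
        raise ValueError("one_way_delay_ms must not be negative")

    # Two one-way delays: one already spent, one still to come.
    safety_margin_ms = 2 * one_way_delay_ms
    if safety_margin_ms >= lease_ms:
        raise ValueError("lease is too short for the estimated propagation delay")

    return received_at_ms + lease_ms - safety_margin_ms


if __name__ == "__main__":
    # Illustrative values: a 90 s lease received at local time 0 with a
    # 200 msec one-way delay must be renewed 400 msec before it expires.
    print(renewal_deadline_ms(0, 90_000, 200))
```

A real client would normally renew well before this deadline, since the sketch makes no allowance for clock drift or for lost and retransmitted requests.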
The server's lease period configuration should take into account the network distance of the clients that will be accessing the server's resources. It is expected that the lease period will take into account the network propagation delays and other network delay factors for the client population. Since the protocol does not allow for an automatic method to determine an appropriate lease period, the server's administrator may have to tune the lease period.

8.14. Migration, Replication and State

When responsibility for handling a given file system is transferred to a new server (migration) or the client chooses to use an alternate server (e.g., in response to server unresponsiveness) in the context of file system replication, the appropriate handling of state shared between the client and server (i.e., locks, leases, stateids, and clientids) is as described below. The handling differs between migration and replication. For related discussion of file server state and recovery of such see the sections under "File Locking and Share Reservations".

If a server replica or a server immigrating a filesystem agrees to, or is expected to, accept opaque values from the client that originated from another server, then it is a wise implementation practice for the servers to encode the "opaque" values in network byte order. This way, servers acting as replicas or immigrating filesystems will be able to parse values like stateids, directory cookies, filehandles, etc. even if their native byte order is different from other servers cooperating in the replication and migration of the filesystem.

8.14.1. Migration and State

In the case of migration, the servers involved in the migration of a filesystem SHOULD transfer all server state from the original to the new server. This must be done in a way that is transparent to the client. This state transfer will ease the client's transition when a filesystem migration occurs. If the servers are successful in transferring all state, the client will continue to use stateids assigned by the original server. Therefore the new server must recognize these stateids as valid. This holds true for the clientid as well. Since responsibility for an entire filesystem is transferred with a migration event, there is no possibility that conflicts will arise on the new server as a result of the transfer of locks.

As part of the transfer of information between servers, leases would be transferred as well. The leases being transferred to the new server will typically have a different expiration time from those for the same client, previously on the old server. To maintain the property that all leases on a given server for a given client expire at the same time, the server should advance the expiration time to the later of the leases being transferred or the leases already present. This allows the client to maintain lease renewal of both classes without special effort.

The servers may choose not to transfer the state information upon migration. However, this choice is discouraged. In this case, when the client presents state information from the original server, the client must be prepared to receive either NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server. The client should then recover its state information as it normally would in response to a server failure. The new server must take care to allow for the recovery of state information as it would in the event of server restart.

8.14.2. Replication and State

Since client switch-over in the case of replication is not under server control, the handling of state is different. In this case, leases, stateids and clientids do not have validity across a transition from one server to another. The client must re-establish its locks on the new server.
This can be compared to the re-establishment of locks by means of reclaim-type requests after a server reboot. The difference is that the server has no provision to distinguish requests reclaiming locks from those obtaining new locks or to defer the latter. Thus, a client re-establishing a lock on the new server (by means of a LOCK or OPEN request), may have the requests denied due to a conflicting lock. Since replication is intended for read-only use of filesystems, such denial of locks should not pose large difficulties in practice. When an attempt to re-establish a lock on a new server is denied, the client should treat the situation as if its original lock had been revoked.

8.14.3. Notification of Migrated Lease

In the case of lease renewal, the client may not be submitting requests for a filesystem that has been migrated to another server. This can occur because of the implicit lease renewal mechanism. The client renews leases for all filesystems when submitting a request to any one filesystem at the server.

In order for the client to schedule renewal of leases that may have been relocated to the new server, the client must find out about lease relocation before those leases expire. To accomplish this, all operations which implicitly renew leases for a client (i.e., OPEN, CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU), will return the error NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be renewed has been transferred to a new server. This condition will continue until the client receives an NFS4ERR_MOVED error and the server receives the subsequent GETATTR(fs_locations) for an access to each filesystem for which a lease has been moved to a new server.

When a client receives an NFS4ERR_LEASE_MOVED error, it should perform an operation on each filesystem associated with the server in question.
When the client receives an NFS4ERR_MOVED error, the client can follow the normal process to obtain the new server information (through the fs_locations attribute) and perform renewal of those leases on the new server. If the server has not had state transferred to it transparently, the client will receive either NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server, as described above, and the client can then recover state information as it does in the event of server failure.

8.14.4. Migration and the Lease_time Attribute

In order that the client may appropriately manage its leases in the case of migration, the destination server must establish proper values for the lease_time attribute.

When state is transferred transparently, that state should include the correct value of the lease_time attribute. The lease_time attribute on the destination server must never be less than that on the source since this would result in premature expiration of leases granted by the source server. Upon migration in which state is transferred transparently, the client is under no obligation to re-fetch the lease_time attribute and may continue to use the value previously fetched (on the source server).

If state has not been transferred transparently (i.e., the client sees a real or simulated server reboot), the client should fetch the value of lease_time on the new (i.e., destination) server, and use it for subsequent locking requests. However, the server must respect a grace period at least as long as the lease_time on the source server, in order to ensure that clients have ample time to reclaim their locks before potentially conflicting non-reclaimed locks are granted. The means by which the new server obtains the value of lease_time on the old server is left to the server implementations. It is not specified by the NFS version 4 protocol.

9. Client-Side Caching

Client-side caching of data, of file attributes, and of file names is essential to providing good performance with the NFS protocol. Providing distributed cache coherence is a difficult problem and previous versions of the NFS protocol have not attempted it. Instead, several NFS client implementation techniques have been used to reduce the problems that a lack of coherence poses for users. These techniques have not been clearly defined by earlier protocol specifications and it is often unclear what is valid or invalid client behavior.

The NFS version 4 protocol uses many techniques similar to those that have been used in previous protocol versions. The NFS version 4 protocol does not provide distributed cache coherence. However, it defines a more limited set of caching guarantees to allow locks and share reservations to be used without destructive interference from client side caching.

In addition, the NFS version 4 protocol introduces a delegation mechanism which allows many decisions normally made by the server to be made locally by clients. This mechanism provides efficient support of the common cases where sharing is infrequent or where sharing is read-only.

9.1. Performance Challenges for Client-Side Caching

Caching techniques used in previous versions of the NFS protocol have been successful in providing good performance. However, several scalability challenges can arise when those techniques are used with very large numbers of clients. This is particularly true when clients are geographically distributed which classically increases the latency for cache revalidation requests.

The previous versions of the NFS protocol repeat their file data cache validation requests at the time the file is opened. This behavior can have serious performance drawbacks. A common case is one in which a file is only accessed by a single client. Therefore, sharing is infrequent.

In this case, repeated reference to the server to find that no conflicts exist is expensive. A better option with regard to performance is to allow a client that repeatedly opens a file to do so without reference to the server. This is done until potentially conflicting operations from another client actually occur.

A similar situation arises in connection with file locking. Sending file lock and unlock requests to the server as well as the read and write requests necessary to make data caching consistent with the locking semantics (see the section "Data Caching and File Locking") can severely limit performance. When locking is used to provide protection against infrequent conflicts, a large penalty is incurred. This penalty may discourage the use of file locking by applications.

The NFS version 4 protocol provides more aggressive caching strategies with the following design goals:

   o  Compatibility with a large range of server semantics.

   o  Provide the same caching benefits as previous versions of the NFS protocol when unable to provide the more aggressive model.

   o  Requirements for aggressive caching are organized so that a large portion of the benefit can be obtained even when not all of the requirements can be met.

The appropriate requirements for the server are discussed in later sections in which specific forms of caching are covered (see the section "Open Delegation").

9.2. Delegation and Callbacks

Recallable delegation of server responsibilities for a file to a client improves performance by avoiding repeated requests to the server in the absence of inter-client conflict. With the use of a "callback" RPC from server to client, a server recalls delegated responsibilities when another client engages in sharing of a delegated file.

A delegation is passed from the server to the client, specifying the object of the delegation and the type of delegation.
There are different types of delegations but each type contains a stateid to be used to represent the delegation when performing operations that depend on the delegation. This stateid is similar to those associated with locks and share reservations but differs in that the stateid for a delegation is associated with a clientid and may be used on behalf of all the open_owners for the given client. A delegation is made to the client as a whole and not to any specific process or thread of control within it.

Because callback RPCs may not work in all environments (due to firewalls, for example), correct protocol operation does not depend on them. Preliminary testing of callback functionality by means of a CB_NULL procedure determines whether callbacks can be supported. The CB_NULL procedure checks the continuity of the callback path. A server makes a preliminary assessment of callback availability to a given client and avoids delegating responsibilities until it has determined that callbacks are supported. Because the granting of a delegation is always conditional upon the absence of conflicting access, clients must not assume that a delegation will be granted and they must always be prepared for OPENs to be processed without any delegations being granted.

Once granted, a delegation behaves in most ways like a lock. There is an associated lease that is subject to renewal together with all of the other leases held by that client.

Unlike locks, an operation by a second client to a delegated file will cause the server to recall a delegation through a callback.

On recall, the client holding the delegation must flush modified state (such as modified data) to the server and return the delegation. The conflicting request will not receive a response until the recall is complete.
The recall is considered complete when the client returns the delegation or the server times out on the recall and revokes the delegation as a result of the timeout. Following the resolution of the recall, the server has the information necessary to grant or deny the second client's request.

At the time the client receives a delegation recall, it may have substantial state that needs to be flushed to the server. Therefore, the server should allow sufficient time for the delegation to be returned since it may involve numerous RPCs to the server. If the server is able to determine that the client is diligently flushing state to the server as a result of the recall, the server may extend the usual time allowed for a recall. However, the time allowed for recall completion should not be unbounded.

An example of this is when responsibility to mediate opens on a given file is delegated to a client (see the section "Open Delegation"). The server will not know what opens are in effect on the client. Without this knowledge the server will be unable to determine if the access and deny state for the file allows any particular open until the delegation for the file has been returned.

A client failure or a network partition can result in failure to respond to a recall callback. In this case, the server will revoke the delegation which in turn will render useless any modified state still on the client.

9.2.1. Delegation Recovery

There are three situations that delegation recovery must deal with:

   o  Client reboot or restart

   o  Server reboot or restart

   o  Network partition (full or callback-only)

In the event the client reboots or restarts, the failure to renew leases will result in the revocation of record locks and share reservations. Delegations, however, may be treated a bit differently.

There will be situations in which delegations will need to be reestablished after a client reboots or restarts.
The reason for this is the client may have file data stored locally and this data was associated with the previously held delegations. The client will need to reestablish the appropriate file state on the server.

To allow for this type of client recovery, the server MAY extend the period for delegation recovery beyond the typical lease expiration period. This implies that requests from other clients that conflict with these delegations will need to wait. Because the normal recall process may require significant time for the client to flush changed state to the server, other clients need to be prepared for delays that occur because of a conflicting delegation. This longer interval would increase the window for clients to reboot and consult stable storage so that the delegations can be reclaimed. For open delegations, such delegations are reclaimed using OPEN with a claim type of CLAIM_DELEGATE_PREV. (See the sections on "Data Caching and Revocation" and "Operation 18: OPEN" for discussion of open delegation and the details of OPEN respectively).

A server MAY support a claim type of CLAIM_DELEGATE_PREV, but if it does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM, and instead MUST, for a period of time no less than that of the value of the lease_time attribute, maintain the client's delegations to allow time for the client to issue CLAIM_DELEGATE_PREV requests. The server that supports CLAIM_DELEGATE_PREV MUST support the DELEGPURGE operation.

When the server reboots or restarts, delegations are reclaimed (using the OPEN operation with CLAIM_PREVIOUS) in a similar fashion to record locks and share reservations. However, there is a slight semantic difference. In the normal case if the server decides that a delegation should not be granted, it performs the requested action (e.g., OPEN) without granting any delegation.
For reclaim, the server grants the delegation but a special designation is applied so that the client treats the delegation as having been granted but recalled by the server. Because of this, the client has the duty to write all modified state to the server and then return the delegation. This process of handling delegation reclaim reconciles three principles of the NFS version 4 protocol:

   o  Upon reclaim, a client reporting resources assigned to it by an earlier server instance must be granted those resources.

   o  The server has unquestionable authority to determine whether delegations are to be granted and, once granted, whether they are to be continued.

   o  The use of callbacks is not to be depended upon until the client has proven its ability to receive them.

When a network partition occurs, delegations are subject to freeing by the server when the lease renewal period expires. This is similar to the behavior for locks and share reservations. For delegations, however, the server may extend the period in which conflicting requests are held off. Eventually the occurrence of a conflicting request from another client will cause revocation of the delegation. A loss of the callback path (e.g., by later network configuration change) will have the same effect. A recall request will fail and revocation of the delegation will result.

A client normally finds out about revocation of a delegation when it uses a stateid associated with a delegation and receives the error NFS4ERR_EXPIRED. It also may find out about delegation revocation after a client reboot when it attempts to reclaim a delegation and receives that same error. Note that in the case of a revoked write open delegation, there are issues because data may have been modified by the client whose delegation is revoked and separately by other clients. See the section "Revocation Recovery for Write Open Delegation" for a discussion of such issues.
Note also that when delegations are revoked, information about the revoked delegation will be written by the server to stable storage (as described in the section "Crash Recovery"). This is done to deal with the case in which a server reboots after revoking a delegation but before the client holding the revoked delegation is notified about the revocation.

9.3. Data Caching

When applications share access to a set of files, they need to be implemented so as to take account of the possibility of conflicting access by another application. This is true whether the applications in question execute on different clients or reside on the same client.

Share reservations and record locks are the facilities the NFS version 4 protocol provides to allow applications to coordinate access by providing mutual exclusion facilities. The NFS version 4 protocol's data caching must be implemented such that it does not invalidate the assumptions that those using these facilities depend upon.

9.3.1. Data Caching and OPENs

In order to avoid invalidating the sharing assumptions that applications rely on, NFS version 4 clients should not provide cached data to applications or modify it on behalf of an application when it would not be valid to obtain or modify that same data via a READ or WRITE operation.

Furthermore, in the absence of open delegation (see the section "Open Delegation") two additional rules apply. Note that these rules are obeyed in practice by many NFS version 2 and version 3 clients.

   o  First, cached data present on a client must be revalidated after doing an OPEN. Revalidating means that the client fetches the change attribute from the server, compares it with the cached change attribute, and if different, declares the cached data (as well as the cached attributes) as invalid. This is to ensure that the data for the OPENed file is still correctly reflected in the client's cache.
      This validation must be done at least when the client's OPEN operation includes DENY=WRITE or BOTH thus terminating a period in which other clients may have had the opportunity to open the file with WRITE access. Clients may choose to do the revalidation more often (i.e., at OPENs specifying DENY=NONE) to parallel the NFS version 3 protocol's practice for the benefit of users assuming this degree of cache revalidation.

      Since the change attribute is updated for data and metadata modifications, some client implementors may be tempted to use the time_modify attribute and not the change attribute to validate cached data, so that metadata changes do not spuriously invalidate clean data. Implementors are cautioned against this approach. The change attribute is guaranteed to change for each update to the file, whereas time_modify is guaranteed to change only at the granularity of the time_delta attribute. Use by the client's data cache validation logic of time_modify and not the change attribute runs the risk of the client incorrectly marking stale data as valid.

   o  Second, modified data must be flushed to the server before closing a file OPENed for write. This is complementary to the first rule. If the data is not flushed at CLOSE, the revalidation done after the client OPENs a file is unable to achieve its purpose. The other aspect to flushing the data before close is that the data must be committed to stable storage, at the server, before the CLOSE operation is requested by the client. In the case of a server reboot or restart and a CLOSEd file, it may not be possible to retransmit the data to be written to the file. Hence, this requirement.

9.3.2. Data Caching and File Locking

For those applications that choose to use file locking instead of share reservations to exclude inconsistent file access, there is an analogous set of constraints that apply to client side data caching.
These rules are effective only if the file locking is used in a way that matches in an equivalent way the actual READ and WRITE operations executed. This is as opposed to file locking that is based on pure convention. For example, it is possible to manipulate a two-megabyte file by dividing the file into two one-megabyte regions and protecting access to the two regions by file locks on bytes zero and one. A lock for write on byte zero of the file would represent the right to do READ and WRITE operations on the first region. A lock for write on byte one of the file would represent the right to do READ and WRITE operations on the second region. As long as all applications manipulating the file obey this convention, they will work on a local filesystem. However, they may not work with the NFS version 4 protocol unless clients refrain from data caching.

The rules for data caching in the file locking environment are:

   o  First, when a client obtains a file lock for a particular region, the data cache corresponding to that region (if any cached data exists) must be revalidated. If the change attribute indicates that the file may have been updated since the cached data was obtained, the client must flush or invalidate the cached data for the newly locked region. A client might choose to invalidate all of the non-modified cached data that it has for the file but the only requirement for correct operation is to invalidate all of the data in the newly locked region.

   o  Second, before releasing a write lock for a region, all modified data for that region must be flushed to the server. The modified data must also be written to stable storage.

Note that flushing data to the server and the invalidation of cached data must reflect the actual byte ranges locked or unlocked. Rounding these up or down to reflect client cache block boundaries will cause problems if not carefully done.
For example, writing a modified block when only half of that block is within an area being unlocked may cause invalid modification to the region outside the unlocked area. This, in turn, may be part of a region locked by another client. Clients can avoid this situation by synchronously performing portions of write operations that overlap that portion (initial or final) that is not a full block. Similarly, invalidating a locked area which is not an integral number of full buffer blocks would require the client to read one or two partial blocks from the server if the revalidation procedure shows that the data which the client possesses may not be valid.

The data that is written to the server as a prerequisite to the unlocking of a region must be written, at the server, to stable storage. The client may accomplish this either with synchronous writes or by following asynchronous writes with a COMMIT operation. This is required because retransmission of the modified data after a server reboot might conflict with a lock held by another client.

A client implementation may choose to accommodate applications which use record locking in non-standard ways (e.g., using a record lock as a global semaphore) by flushing to the server more data upon a LOCKU than is covered by the locked range. This may include modified data within files other than the one for which the unlocks are being done. In such cases, the client must not interfere with applications whose READs and WRITEs are being done only within the bounds of record locks which the application holds. For example, an application locks a single byte of a file and proceeds to write that single byte. A client that chose to handle a LOCKU by flushing all modified data to the server could validly write that single byte in response to an unrelated unlock.
However, it would not be valid to write the entire block in which that single written byte was located since it includes an area that is not locked and might be locked by another client. Client implementations can avoid this problem by dividing files with modified data into those for which all modifications are done to areas covered by an appropriate record lock and those for which there are modifications not covered by a record lock. Any writes done for the former class of files must not include areas not locked and thus not modified on the client.

9.3.3. Data Caching and Mandatory File Locking

Client side data caching needs to respect mandatory file locking when it is in effect. The presence of mandatory file locking for a given file is indicated when the client gets back NFS4ERR_LOCKED from a READ or WRITE on a file it has an appropriate share reservation for. When mandatory locking is in effect for a file, the client must check for an appropriate file lock for data being read or written. If a lock exists for the range being read or written, the client may satisfy the request using the client's validated cache. If an appropriate file lock is not held for the range of the read or write, the read or write request must not be satisfied by the client's cache and the request must be sent to the server for processing. When a read or write request partially overlaps a locked region, the request should be subdivided into multiple pieces with each region (locked or not) treated appropriately.

9.3.4. Data Caching and File Identity

When clients cache data, the file data needs to be organized according to the filesystem object to which the data belongs. For NFS version 3 clients, the typical practice has been to assume for the purpose of caching that distinct filehandles represent distinct filesystem objects.
The client then has the choice to organize and maintain the data cache on this basis.

In the NFS version 4 protocol, there is now the possibility to have significant deviations from a "one filehandle per object" model because a filehandle may be constructed on the basis of the object's pathname. Therefore, clients need a reliable method to determine if two filehandles designate the same filesystem object. If clients were simply to assume that all distinct filehandles denote distinct objects and proceed to do data caching on this basis, caching inconsistencies would arise between the distinct client side objects which mapped to the same server side object.

By providing a method to differentiate filehandles, the NFS version 4 protocol alleviates a potential functional regression in comparison with the NFS version 3 protocol. Without this method, caching inconsistencies within the same client could occur and this has not been present in previous versions of the NFS protocol. Note that it is possible to have such inconsistencies with applications executing on multiple clients but that is not the issue being addressed here.

For the purposes of data caching, the following steps allow an NFS version 4 client to determine whether two distinct filehandles denote the same server side object:

o If GETATTR directed to two filehandles returns different values of the fsid attribute, then the filehandles represent distinct objects.

o If GETATTR for any file with an fsid that matches the fsid of the two filehandles in question returns a unique_handles attribute with a value of TRUE, then the two objects are distinct.

o If GETATTR directed to the two filehandles does not return the fileid attribute for both of the handles, then it cannot be determined whether the two objects are the same. Therefore, operations which depend on that knowledge (e.g., client side data caching) cannot be done reliably.
o If GETATTR directed to the two filehandles returns different values for the fileid attribute, then they are distinct objects.

o Otherwise they are the same object.

9.4. Open Delegation

When a file is being OPENed, the server may delegate further handling of opens and closes for that file to the opening client. Any such delegation is recallable, since the circumstances that allowed for the delegation are subject to change. In particular, if the server receives a conflicting OPEN from another client, it must recall the delegation before deciding whether the OPEN from the other client may be granted. Making a delegation is up to the server and clients should not assume that any particular OPEN either will or will not result in an open delegation. The following is a typical set of conditions that servers might use in deciding whether OPEN should be delegated:

o The client must be able to respond to the server's callback requests. The server will use the CB_NULL procedure for a test of callback ability.

o The client must have responded properly to previous recalls.

o There must be no current open conflicting with the requested delegation.

o There should be no current delegation that conflicts with the delegation being requested.

o The probability of future conflicting open requests should be low based on the recent history of the file.

o The existence of any server-specific semantics of OPEN/CLOSE that would make the required handling incompatible with the prescribed handling that the delegated client would apply (see below).

There are two types of open delegations, read and write. A read open delegation allows a client to handle, on its own, requests to open a file for reading that do not deny read access to others. Multiple read open delegations may be outstanding simultaneously and do not conflict. A write open delegation allows the client to handle, on its own, all opens.
Only one write open delegation may exist for a given file at a given time and it is inconsistent with any read open delegations.

When a client has a read open delegation, it may not make any changes to the contents or attributes of the file but it is assured that no other client may do so. When a client has a write open delegation, it may modify the file data since no other client will be accessing the file's data. The client holding a write delegation may only affect file attributes which are intimately connected with the file data: size, time_modify, change.

When a client has an open delegation, it does not send OPENs or CLOSEs to the server but updates the appropriate status internally. For a read open delegation, opens that cannot be handled locally (opens for write or that deny read access) must be sent to the server.

When an open delegation is made, the response to the OPEN contains an open delegation structure which specifies the following:

o the type of delegation (read or write)

o space limitation information to control flushing of data on close (write open delegation only, see the section "Open Delegation and Data Caching")

o an nfsace4 specifying read and write permissions

o a stateid to represent the delegation for READ and WRITE

The delegation stateid is separate and distinct from the stateid for the OPEN proper. The standard stateid, unlike the delegation stateid, is associated with a particular lock_owner and will continue to be valid after the delegation is recalled and the file remains open.

When a request internal to the client is made to open a file and open delegation is in effect, it will be accepted or rejected solely on the basis of the following conditions. Any requirement for other checks to be made by the delegate should result in open delegation being denied so that the checks can be made by the server itself.
o The access and deny bits for the request and the file as described in the section "Share Reservations".

o The read and write permissions as determined below.

The nfsace4 passed with delegation can be used to avoid frequent ACCESS calls. The permission check should be as follows:

o If the nfsace4 indicates that the open may be done, then it should be granted without reference to the server.

o If the nfsace4 indicates that the open may not be done, then an ACCESS request must be sent to the server to obtain the definitive answer.

The server may return an nfsace4 that is more restrictive than the actual ACL of the file. This includes an nfsace4 that specifies denial of all access. Note that some common practices such as mapping the traditional user "root" to the user "nobody" may make it incorrect to return the actual ACL of the file in the delegation response.

The use of delegation together with various other forms of caching creates the possibility that no server authentication will ever be performed for a given user since all of the user's requests might be satisfied locally. Where the client is depending on the server for authentication, the client should be sure authentication occurs for each user by use of the ACCESS operation. This should be the case even if an ACCESS operation would not be required otherwise. As mentioned before, the server may enforce frequent authentication by returning an nfsace4 denying all access with every open delegation.

9.4.1. Open Delegation and Data Caching

OPEN delegation allows much of the message overhead associated with opening and closing files to be eliminated. An open when an open delegation is in effect does not require that a validation message be sent to the server. The continued endurance of the "read open delegation" provides a guarantee that no OPEN for write and thus no write has occurred.
Similarly, when closing a file opened for write and if write open delegation is in effect, the data written does not have to be flushed to the server until the open delegation is recalled. The continued endurance of the open delegation provides a guarantee that no open and thus no read or write has been done by another client.

For the purposes of open delegation, READs and WRITEs done without an OPEN are treated as the functional equivalents of a corresponding type of OPEN. This refers to the READs and WRITEs that use the special stateids consisting of all zero bits or all one bits. Therefore, READs or WRITEs with a special stateid done by another client will force the server to recall a write open delegation. A WRITE with a special stateid done by another client will force a recall of read open delegations.

With delegations, a client is able to avoid writing data to the server when the CLOSE of a file is serviced. The file close system call is the usual point at which the client is notified of a lack of stable storage for the modified file data generated by the application. At the close, file data is written to the server and through normal accounting the server is able to determine if the available filesystem space for the data has been exceeded (i.e., server returns NFS4ERR_NOSPC or NFS4ERR_DQUOT). This accounting includes quotas. The introduction of delegations requires that an alternative method be in place for the same type of communication to occur between client and server.

In the delegation response, the server provides either the limit of the size of the file or the number of modified blocks and associated block size. The server must ensure that the client will be able to flush data to the server of a size equal to that provided in the original delegation. The server must make this assurance for all outstanding delegations.
Therefore, the server must be careful in its management of available space for new or modified data taking into account available filesystem space and any applicable quotas. The server can recall delegations as a result of managing the available filesystem space. The client should abide by the server's state space limits for delegations. If the client exceeds the stated limits for the delegation, the server's behavior is undefined.

Based on server conditions, quotas or available filesystem space, the server may grant write open delegations with very restrictive space limitations. The limitations may be defined in a way that will always force modified data to be flushed to the server on close.

With respect to authentication, flushing modified data to the server after a CLOSE has occurred may be problematic. For example, the user of the application may have logged off the client and unexpired authentication credentials may not be present. In this case, the client may need to take special care to ensure that local unexpired credentials will in fact be available. This may be accomplished by tracking the expiration time of credentials and flushing data well in advance of their expiration or by making private copies of credentials to assure their availability when needed.

9.4.2. Open Delegation and File Locks

When a client holds a write open delegation, lock operations may be performed locally. This includes those required for mandatory file locking. This can be done since the delegation implies that there can be no conflicting locks. Similarly, all of the revalidations that would normally be associated with obtaining locks and the flushing of data associated with the releasing of locks need not be done.

When a client holds a read open delegation, lock operations are not performed locally.
All lock operations, including those requesting non-exclusive locks, are sent to the server for resolution.

9.4.3. Handling of CB_GETATTR

The server needs to employ special handling for a GETATTR where the target is a file that has a write open delegation in effect. The reason for this is that the client holding the write delegation may have modified the data and the server needs to reflect this change to the second client that submitted the GETATTR. Therefore, the client holding the write delegation needs to be interrogated. The server will use the CB_GETATTR operation. The only attributes that the server can reliably query via CB_GETATTR are size and change.

Since CB_GETATTR is being used to satisfy another client's GETATTR request, the server only needs to know if the client holding the delegation has a modified version of the file. If the client's copy of the delegated file is not modified (data or size), the server can satisfy the second client's GETATTR request from the attributes stored locally at the server. If the file is modified, the server only needs to know about this modified state. If the server determines that the file is currently modified, it will respond to the second client's GETATTR as if the file had been modified locally at the server.

Since the form of the change attribute is determined by the server and is opaque to the client, the client and server need to agree on a method of communicating the modified state of the file. For the size attribute, the client will report its current view of the file size. For the change attribute, the handling is more involved.

For the client, the following steps will be taken when receiving a write delegation:

o The value of the change attribute will be obtained from the server and cached. Let this value be represented by c.
o The client will create a value greater than c that will be used for communicating that modified data is held at the client. Let this value be represented by d.

o When the client is queried via CB_GETATTR for the change attribute, it checks to see if it holds modified data. If the file is modified, the value d is returned for the change attribute value. If this file is not currently modified, the client returns the value c for the change attribute.

For simplicity of implementation, the client MAY for each CB_GETATTR return the same value d. This is true even if, between successive CB_GETATTR operations, the client again modifies the file's data or metadata in its cache. The client can return the same value because the only requirement is that the client be able to indicate to the server that the client holds modified data. Therefore, the value of d may always be c + 1.

While the change attribute is opaque to the client in the sense that it has no idea what units of time, if any, the server is counting change with, it is not opaque in that the client has to treat it as an unsigned integer, and the server has to be able to see the results of the client's changes to that integer. Therefore, the server MUST encode the change attribute in network order when sending it to the client. The client MUST decode it from network order to its native order when receiving it and the client MUST encode it in network order when sending it to the server. For this reason, change is defined as an unsigned integer rather than an opaque array of octets.

For the server, the following steps will be taken when providing a write delegation:

o Upon providing a write delegation, the server will cache a copy of the change attribute in the data structure it uses to record the delegation. Let this value be represented by sc.

o When a second client sends a GETATTR operation on the same file to the server, the server obtains the change attribute from the first client. Let this value be cc.
o If the value cc is equal to sc, the file is not modified and the server returns the current values for change, time_metadata, and time_modify (for example) to the second client.

o If the value cc is NOT equal to sc, the file is currently modified at the first client and most likely will be modified at the server at a future time. The server then uses its current time to construct attribute values for time_metadata and time_modify. A new value of sc, which we will call nsc, is computed by the server, such that nsc [...]

Figure 2: Multi-party chat in a Centralized Conference (MSRP clients, each connected to a central MSRP switch)

Typically conference participants also subscribe to a conference event package to gather information about the conference roster in the form of conference state notifications. For example, participants can learn about other participants' identifiers, including their nicknames.

All messages in the chat room use the 'Message/CPIM' wrapper content type [RFC3862], so that it is possible to distinguish between private and regular messages. When a participant wants to send an instant message to the conference, it constructs an MSRP SEND request and submits it to the MSRP switch including a regular payload (e.g. a Message/CPIM message that contains text, HTML, an image, etc.). The Message/CPIM To header is set to the chat room URI. The switch then fans out the SEND request to all of the other participants using their existing MSRP sessions.
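The fan-out of a regular message can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the specification: the Session and ChatRoom classes, their fields, and the fan_out method are hypothetical names, and MSRP framing, chunking, and reports are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical stand-in for one MSRP session between a participant and the switch."""
    participant_uri: str
    outbox: list = field(default_factory=list)  # (sender URI, payload) pairs forwarded here


@dataclass
class ChatRoom:
    """Hypothetical chat room: a room URI plus the sessions of the joined participants."""
    uri: str
    sessions: list = field(default_factory=list)

    def fan_out(self, sender: Session, cpim_to: str, payload: str) -> int:
        """Copy a regular message to every session except the sender's.

        Returns the number of copies made. Only a message whose
        Message/CPIM To header equals the chat room URI is handled here.
        """
        if cpim_to != self.uri:
            raise ValueError("not a regular message: To header is not the chat room URI")
        copies = 0
        for session in self.sessions:
            if session is sender:
                continue  # the sender does not receive its own message back
            session.outbox.append((sender.participant_uri, payload))
            copies += 1
        return copies
```

The sketch compares sessions by identity rather than by URI, so a participant who has joined from two devices would still receive the message on the device that did not send it. That choice is an assumption of the sketch, not something the text above prescribes.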
A participant can also send a private instant message addressed to a participant whose identifier has been learned, e.g. via a conference event package. In this case the sender creates an MSRP SEND request with a Message/CPIM wrapper whose To header contains not the chat room URI but the recipient's URI. The MSRP switch then forwards the SEND request to that recipient. This specification supports the sending of private messages to one and only one recipient. However, if the recipient is logged in from different endpoints, the MSRP switch will distribute the private message to each endpoint from which the recipient is logged in.

We extend the current MSRP negotiation that takes place in SDP [RFC4566] to allow participants to learn whether the chat room supports and is willing to accept (e.g. due to local policy restrictions) certain MSRP functions defined in this memo, such as nicknames or private messaging.

Naturally, when a participant wishes to leave a chat room, it sends a SIP BYE request to the Focus UA and terminates the SIP dialog with the focus and MSRP sessions with the MSRP switch.

This document assumes that each chat room is allocated its own SIP URI. A user joining a chat room sends an INVITE request to that SIP URI, and as a result, a new MSRP session is established between the user and the MSRP switch. It is assumed that an MSRP session is mapped to a chat room. If a user wants to join a second chat room, he creates a different INVITE request, through a different SIP dialog, which leads to the creation of a second MSRP session between the user and the MSRP switch. Notice that these two MSRP sessions can still be multiplexed over the same TCP connection as per regular MSRP procedures. However, each chat room is associated to a unique MSRP session and a unique SIP dialog.

5. Creating, Joining, and Deleting a Chat Room

5.1.
Creating a Chat Room

Since we consider a chat room a particular type of conference having MSRP media, the methods defined by the SIP Conference Framework [RFC4353] for creating conferences are directly applicable to a chat room.

Once a chat room is created, it is identified by a SIP URI, like any other conference.

5.2. Joining a Chat Room

Participants usually join the conference by sending an INVITE request to the conference URI. As long as the conference policy allows, the INVITE request is accepted by the focus and the user is brought into the conference.

The MSRP switch needs to be aware of the URIs of the participant (SIP, Tel, or IM URIs) in order to validate messages sent from this participant prior to their forwarding. This information is known to the focus of the conference. Therefore an interface between the focus and the MSRP switch is assumed. However, the interface between the focus and the MSRP switch is outside the scope of this document.

Conference aware participants will detect that the peer is a focus due to the presence of the "isfocus" feature tag [RFC3840] in the Contact header field of the 200-class response to the INVITE request. Conference unaware participants will not notice it is a focus, and cannot apply the additional mechanisms defined in this document. Participants are also aware that the mixer is an MSRP switch due to the presence of a 'message' media type and either TCP/MSRP or TCP/TLS/MSRP as the protocol field in the media line of SDP [RFC4566].

The conference focus of a chat room MUST include support for a Message/CPIM [RFC3862] top-level wrapper for the MSRP messages by setting the 'accept-types' MSRP media line attribute in the SDP offer or answer to include 'Message/CPIM'.

Note that the 'Message/CPIM' wrapper is used to carry the sender information that would otherwise not be available to the recipient.
Additionally, the 'Message/CPIM' wrapper carries the recipient information (e.g. To and Cc: headers).

If a participant wants to remain anonymous to the rest of the participants in the conference, the participant's UA must provide an anonymous URI to the conference focus. The URI will be used in the From and To headers in the 'Message/CPIM' wrapper, and can be learned by the other participants of the conference. Notice that in order for the anonymity mechanism to work, the anonymous URI must not reveal the participant's SIP AOR. The mechanism for acquiring an anonymous URI is outside the scope of this specification.

The conference focus of a chat room MUST learn the chat room capabilities of each participant that joins the chat room. The conference focus MUST inform the MSRP switch of such support in order to prevent the MSRP switch from distributing private messages to participants who do not support private messaging. The recipient would not be able to render the message as private, and any potential reply would be sent to the whole chat room.

5.3. Deleting a Chat Room

As with creating a conference, the methods defined by the SIP Conference Framework [RFC4353] for deleting a conference are directly applicable to a chat room. The MSRP switch will terminate the MSRP sessions with all the participants.

Deleting a chat room is an action that heavily depends on the policy of the chat room. The policy can determine that the chat room is deleted when the creator leaves the conference, or with any out of band mechanism.

6. Sending and Receiving Instant Messages

6.1. Regular Messages

This section describes the conventions used to send and receive instant messages that are addressed to all the participants in the chat room. These are sent over a regular MSRP SEND request that contains a Message/CPIM wrapper [RFC3862] that in turn contains the desired payload (e.g.
text, image, video-clip, etc.).

When a chat room participant wishes to send an instant message to all the other participants in the chat room, it constructs an MSRP SEND request according to the procedures specified in RFC 4975 [RFC4975]. The sender MAY choose the desired MSRP report model (e.g., populate the Success-Report and Failure-Report MSRP header fields).

The SEND request MUST contain a top-level wrapper of type 'Message/CPIM' according to RFC 3862 [RFC3862]. The actual instant message payload MUST be included as payload of the 'Message/CPIM' wrapper and MAY be of any type negotiated in the SDP 'accept-types' attribute according to the MSRP rules.

On sending a regular message the sender MUST populate the To header of the Message/CPIM wrapper with the URI of the chat room. The sender SHOULD populate the From header of the Message/CPIM wrapper with a proper identifier by which the user is recognized in the conference. Identifiers that can be used (among others) are:

o A SIP URI [RFC3261] representing the participant's address-of-record

o A tel URI [RFC3966] representing the participant's telephone number

o An IM URI [RFC3860] representing the participant's instant messaging address

o An Anonymous URI representing the participant's anonymous address

An MSRP switch that receives a SEND request from a participant SHOULD first verify that the From header field of the Message/CPIM wrapper is correctly populated with a valid URI of a participant. This imposes a requirement for the focus of the conference to inform the MSRP switch of the URIs by which the participant is known, in order for the MSRP switch to validate messages. Section 6.3 provides further information on the actions to be taken in case this validation fails.

Then the MSRP switch should inspect the To header field of the Message/CPIM wrapper.
If the MSRP switch receives a message containing several To header fields in the Message/CPIM wrapper, the MSRP switch MUST reject the MSRP SEND request with a 403 response, as per procedures in RFC 4975 [RFC4975]. Then, if the To header field of the Message/CPIM wrapper contains the chat room URI and there are no other To header fields, the MSRP switch can generate a copy of the SEND request to each of the participants in the conference except the sender. The MSRP switch MUST NOT modify the content received in the SEND request. However, the MSRP switch MAY re-chunk any of the outbound MSRP SEND requests.

Note that the MSRP switch does not need to wait for the reception of the complete MSRP chunk or MSRP message before it starts the distribution to the rest of the participants. Instead, once the MSRP switch has received the headers of the Message/CPIM wrapper it SHOULD start the distribution process. Because the headers of the Message/CPIM wrapper appear only in the first chunk, the MSRP switch MUST track the Message-Id until the last chunk of the message has been distributed.

An MSRP endpoint that receives a SEND request from the MSRP switch containing a Message/CPIM wrapper SHOULD first inspect the To header field of the Message/CPIM wrapper. If the To header field is set to the chat room URI, it should render it as a regular message that has been distributed to all the participants in the conference. Then the MSRP endpoint SHOULD inspect the From header field of the Message/CPIM wrapper to identify the sender. The From header field will include a URI that identifies the sender. The endpoint might have also received further identifier information through a subscription to a conference event package.

6.2. Private Messages

This section describes the conventions used to send and receive private instant messages, i.e., instant messages that are addressed to one participant of the chat room rather than to all of them.
A chat room can signal support for private messages using the 'chatroom' attribute in SDP (see Section 8 for details).

When a chat room participant wishes to send a private instant message to a participant in the chat room, it follows the same procedures for creating a SEND request as for regular messages (Section 6.1). The only difference is that the MSRP endpoint MUST populate a single To header of the Message/CPIM wrapper with the identifier of the intended recipient. The identifier can be a SIP, TEL, or IM URI, typically learned from the information received in notifications of a conference event package.

As for regular messages, an MSRP switch that receives a SEND request from a participant SHOULD first verify that the From header field of the Message/CPIM wrapper is correctly populated with a valid URI (i.e., the URI is a participant of this chat room). Section 6.3 provides further information on the actions to be taken in case this validation fails.

Then the MSRP switch MUST inspect the To header field of the Message/CPIM wrapper. If the MSRP switch receives a message containing several To header fields in the Message/CPIM wrapper, the MSRP switch MUST reject the MSRP SEND request with a 403 response, as per procedures in RFC 4975 [RFC4975]. Then the MSRP switch MUST verify that the To header of the Message/CPIM wrapper matches the URI of a participant of the chat room. If this To header field does not contain the URI of a participant of the chat room or if the To header field cannot be resolved (e.g., caused by a mistyped URI), the MSRP switch MUST reject the request with a 404 response. This new 404 status code indicates a failure to resolve the recipient URI in the To header field of the Message/CPIM wrapper.

Notice the importance of the From and To headers in the Message/CPIM wrapper.
If an intermediary modifies these values, the MSRP switch might not be able to identify the source or intended destination of the message, resulting in a rejection of the message.

Finally, the MSRP switch MUST verify that the recipient supports private messages. If the recipient does not support private messages, the MSRP switch MUST reject the request with a 428 response. This new response 428 indicates that the recipient does not support private messages. Any potential REPORT request that the MSRP switch sends to the sender MUST include a Message/CPIM wrapper containing the original From header field included in the SEND request and the To header field of the original Message/CPIM wrapper. The MSRP switch MUST NOT forward private messages to a recipient that does not support private messaging.

If successful, the MSRP switch should search its mapping table to find the MSRP sessions established towards the recipient. If a match is found the MSRP switch MUST create a SEND request and MUST copy the contents of the sender's message to it.

An MSRP endpoint that receives a SEND request from the MSRP switch does the same validations as for regular messages (Section 6.1). If the To header field is different from the chat room URI, the MSRP endpoint knows that this is a private message. The endpoint should render who it is from based on the value of the From header of the Message/CPIM wrapper. The endpoint can also use the sender's nickname, possibly learned via a conference event package, to render such nickname rather than the sender's actual URI.

It is possible that a participant, identified by a SIP Address of Record or other valid URI, joins a conference of instant messages from two or more different SIP UAs. It is RECOMMENDED that the MSRP switch can map a URI to two or more MSRP sessions.
If the policy of the server allows for this, the MSRP switch MUST copy all messages intended for the recipient through each MSRP session mapped to the recipient's URI.

6.3. MSRP reports and responses

This section discusses the common procedures for regular and private messages with respect to MSRP reports and responses. Any particular procedure affecting only regular messages or only private messages is discussed in the previous Section 6.1 or Section 6.2, respectively.

MSRP switches MUST follow the success report and failure report handling described in section 7 of RFC 4975 [RFC4975], complemented with the procedures described in this section. The MSRP switch MUST act as an MSRP endpoint receiver of the request according to section 5.3 of RFC 4975 [RFC4975].

If the MSRP switch receives an MSRP SEND request that does not contain a Message/CPIM wrapper, the MSRP switch MUST reject the request with a 415 response (specified in RFC 4975 [RFC4975]).

If the MSRP switch receives an MSRP SEND request where the URI included in the From header field of the Message/CPIM wrapper is not valid (e.g., because it does not "belong" to the sender of the message or is not a valid participant of the chat room), the MSRP switch MUST reject the request with a 403 response. In non-error cases, the MSRP switch MUST construct responses according to section 7.2 of RFC 4975 [RFC4975].

When the MSRP switch forwards a SEND request, it MAY use any report model in the copies intended for the recipients. The receiver reports from the recipients MUST NOT be forwarded to the originator of the original SEND request. Forwarding them could lead to the sender receiving multiple reports for a single MSRP request.

7. Nicknames

A common characteristic of existing chat room services is that participants have the ability to present themselves with a nickname to the rest of the participants of the conference.
It is used for easy reference of participants in the chat room, and can also provide anonymous participants with a meaningful descriptive name.

A nickname is a useful construct in many use cases, of which MSRP chat is but one example. It is associated with a URI by which the participant is known to the focus. Therefore, if a user joins the chat room under the same URI from multiple devices, he or she may request the same nickname across all these devices.

A nickname is a user selectable appearance by which the participant wants to be known to the other participants. It is not a 'display-name', but it is used somewhat like a display name. A main difference is that a nickname is unique inside a chat room to allow an unambiguous reference to a participant in the chat. Nicknames may be long lived, or may be temporary. Users also need to reserve a nickname prior to its utilization.

This memo specifies the nickname as a string. The nickname string MUST be unambiguous within the scope of the chat room (conference instance). This scope is similar to having a nickname unique inside a chat room from Extensible Messaging and Presence Protocol [RFC6120]. The chat room may have policies associated with nicknames. It may not accept nickname strings at all, or it may provide a wider unambiguous scope like a domain or server, similar to Internet Relay Chat (IRC) [RFC2810].

7.1. Using Nicknames within a Conference

This memo provides a mechanism to reserve a nickname for a participant for as long as the participant is logged into the chat room. The mechanism is based on a NICKNAME MSRP method (see below) and a new "Use-Nickname" header. Note that other mechanisms may exist (for example, a web page reservation system), although they are outside the scope of this document.
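The uniqueness rule above can be illustrated with a small Python sketch. It keeps one nickname table for a single chat room and binds each nickname to the URI that reserved it. The class and method names (`NicknameRegistry`, `reserve`, `owner_of`) are hypothetical and not defined by this memo, exact string comparison is an assumption, and the 200 and 423 values are the responses this memo assigns to the NICKNAME method in Section 7.1. Chat room policy checks, authentication, and the release of an old nickname on modification are deliberately left out.

```python
class NicknameRegistry:
    """Nickname table for one chat room (hypothetical helper)."""

    def __init__(self):
        # nickname string -> URI of the participant that reserved it
        self._owner = {}

    def reserve(self, uri, nickname):
        """Return 200 if the nickname is now bound to uri, else 423."""
        owner = self._owner.get(nickname)
        if owner is not None and owner != uri:
            # Already reserved by another participant (another URI).
            return 423
        # Free, or requested again by the same URI (e.g. a second device).
        self._owner[nickname] = uri
        return 200

    def owner_of(self, nickname):
        """Return the URI bound to the nickname, or None if unreserved."""
        return self._owner.get(nickname)


registry = NicknameRegistry()
first = registry.reserve("sip:alice@example.com", "Alice the great")
second = registry.reserve("sip:bob@example.com", "Alice the great")
again = registry.reserve("sip:alice@example.com", "Alice the great")
print(first, second, again)
```

Keying the table by nickname rather than by URI makes the "unambiguous within the chat room" check a single lookup, at the cost of not tracking which nicknames a given URI currently holds.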
A conference participant who has established an MSRP session with the MSRP switch, where the MSRP switch has indicated the support and availability of nicknames with the 'nicknames' token in the 'chatroom' SDP attribute, MAY send a NICKNAME request to the MSRP switch. The NICKNAME request MUST include a new Use-Nickname header that contains the nickname string that the participant wants to reserve. MSRP NICKNAME requests MUST NOT include Success-Report or Failure-Report header fields. Niemi, et al. Expires July 26, 2012 [Page 16] Internet-Draft Multi-party Chat MSRP January 2012 An MSRP switch that receives a NICKNAME request containing a nickname in the Use-Nickname header field SHOULD first verify whether the policy of the chat room allows the nickname functionality. If not allowed, the MSRP switch MUST reject the request with a 501 response, as per RFC 4975 [RFC4975]. If the policy of the chat room allows the usage of nicknames, the MSRP switch SHOULD validate that the SIP AOR is entitled to reserve the nickname. This may include, e.g., allowing that the participant's URI may use the same nickname when the participant has joined the chat room from different devices under the same URI. The participant's authenticated identifier can be derived after a successful SIP Digest Authentication [RFC3261], be included in a trusted SIP P-Asserted-Identity header field [RFC3325], be included in a valid SIP Identity header field [RFC4474], or be derived from any other present or future SIP authentication mechanism. Once the MSRP switch has validated that the participant is entitled to reserve the requested nickname, the MSRP switch MUST answer the NICKNAME request with a 200 response as per regular MSRP procedures. The reservation of a nickname can fail, e.g. if the NICKNAME request contains a malformed or non-existent Use-Nickname header field, or if the same nickname has already been reserved by another participant (i.e., by another URI) in the chat room. 
The validation can also fail where the sender of the message is not entitled to reserve the nickname. In any of these cases the MSRP switch MUST answer the NICKNAME request with a 423 response. The semantics of the 423 response are: "Nickname usage failed; the nickname is not allocated to this user".

As indicated earlier, this specification defines a new MSRP header field: "Use-Nickname". The Use-Nickname header field carries a nickname string, and SHOULD be included in the NICKNAME requests.

The syntax of the NICKNAME method and the "Use-Nickname" header field is built upon the MSRP formal syntax [RFC4975]:

   ext-method   =/ NICKNAMEm
   NICKNAMEm    = %x4E.49.43.4B.4E.41.4D.45 ; NICKNAME in caps
   ext-header   =/ Use-Nickname
                   ; ext-header defined in RFC 4975
   Use-Nickname = "Use-Nickname:" SP nickname
   nickname     = quoted-string
                   ; quoted-string defined in RFC 4975

Once the MSRP switch has reserved a nickname and has bound it to a URI (e.g., a SIP Address-of-Record), the MSRP server MAY allow the usage of the same nickname by the same user (identified by the same URI, such as a SIP AoR) over a second MSRP session. This might be the case if the user joins the same chat room from a different SIP User Agent. In this case, the user MAY request the same or a different nickname than that used in conjunction with the first MSRP session; the MSRP server MAY accept the usage of the same nickname by the same user. The MSRP switch MUST NOT automatically assign the same nickname to more than one MSRP session established from the same URI, because this can confuse the user as to whether the same nickname is bound to the second MSRP session.

7.2. Modifying a Nickname

Typically a participant will reserve a nickname as soon as the participant joins the chat room.
But it is also possible for a participant to modify his/her own nickname and replace it with a new one at any time during the duration of the MSRP session. Modification of the nickname is not different from the initial reservation and usage of a nickname, thus the NICKNAME method is used as described in Section 7.1. If a NICKNAME request that attempts to modify the current nickname of the user for some reason fails, the current nickname stays in effect. A new nickname comes into effect and the old one is released only after a NICKNAME request is accepted with a 200 response. 7.3. Removing a Nickname If the participant no longer wants to be known by a nickname in the conference, the participant can follow the method described in Section 7.2. The nickname element of the Use-Nickname header MUST be set to an empty quoted string. 7.4. Nicknames in Conference Event Packages Typically the conference focus acts as a notifier of the conference event package. To notify subscribers of the nickname reserved for a given participant, it is RECOMMENDED that conference focus and endpoints support Conference Event Package Data Format Extension for Centralized Conferencing [I-D.ietf-xcon-event-package]. The Conference Information Data Model for Centralized Conferencing [I-D.ietf-xcon-common-data-model] extends the user element from RFC 4575 [RFC4575] with a 'nickname' attribute. 8. The SDP 'chatroom' attribute There are a handful of use cases where a participant would like to learn the chat room capabilities supported by the MSRP switch and the gt;= sc + 1. The server then returns the constructed time_metadata, time_modify, and nsc values to the requester. The server replaces sc in the delegation record with nsc. 
To prevent time_modify, time_metadata, and change from appearing to go backward (which would happen if the client holding the delegation fails to write its modified data to the server before the delegation is revoked or returned), the server SHOULD update the file's metadata record with the constructed attribute values. For reasons of reasonable performance, committing the constructed attribute values to stable storage is OPTIONAL.

As discussed earlier in this section, the client MAY return the same cc value on subsequent CB_GETATTR calls, even if the file was modified in the client's cache yet again between successive CB_GETATTR calls. Therefore, the server must assume that the file has been modified yet again, and MUST take care to ensure that the new nsc it constructs and returns is greater than the previous nsc it returned. An example implementation's delegation record would satisfy this mandate by including a boolean field (let us call it "modified") that is set to false when the delegation is granted, and an sc value set at the time of grant to the change attribute value. The modified field would be set to true the first time cc != sc, and would stay true until the delegation is returned or revoked. The processing for constructing nsc, time_modify, and time_metadata would use this pseudo code:

   if (!modified) {
       do CB_GETATTR for change and size;

       if (cc != sc)
           modified = TRUE;
   } else {
       do CB_GETATTR for size;
   }

   if (modified) {
       sc = sc + 1;
       time_modify = time_metadata = current_time;
       update sc, time_modify, time_metadata into file's metadata;
   }

   return to client (that sent GETATTR) the attributes
   it requested, but make sure size comes from what
   CB_GETATTR returned. Do not update the file's metadata
   with the client's modified size.
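The pseudo code above can be turned into a runnable Python sketch. All names here (`DelegationRecord`, `FileMetadata`, `handle_getattr`, and the `cb_getattr` callable that stands in for the CB_GETATTR callback) are hypothetical and chosen for illustration only. The sketch assumes the callback returns a `(cc, size)` pair, with `cc` set to `None` when the change attribute was not requested.

```python
import time
from dataclasses import dataclass


@dataclass
class DelegationRecord:
    sc: int                 # change value recorded when the delegation was granted
    modified: bool = False  # becomes True the first time cc != sc


@dataclass
class FileMetadata:
    change: int
    size: int
    time_modify: float = 0.0
    time_metadata: float = 0.0


def handle_getattr(rec, meta, cb_getattr, now=time.time):
    """Build the attributes returned to the client that sent GETATTR."""
    if not rec.modified:
        cc, size = cb_getattr(True)    # ask for change and size
        if cc != rec.sc:
            rec.modified = True
    else:
        _, size = cb_getattr(False)    # ask for size only
    if rec.modified:
        rec.sc += 1                    # each constructed value exceeds the last
        meta.time_modify = meta.time_metadata = now()
        meta.change = rec.sc
    # Size comes from the callback; meta.size is deliberately not updated.
    return {"change": rec.sc, "size": size,
            "time_modify": meta.time_modify,
            "time_metadata": meta.time_metadata}


rec = DelegationRecord(sc=10)
meta = FileMetadata(change=10, size=100)
callback = lambda want_change: (11 if want_change else None, 4096)
first = handle_getattr(rec, meta, callback, now=lambda: 1000.0)
second = handle_getattr(rec, meta, callback, now=lambda: 1001.0)
print(first["change"], second["change"], meta.size)
```

Once `modified` is set, every call increments `sc`, which is how the sketch keeps each returned change value greater than the previous one even when the client keeps reporting the same cc.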
o In the case that the file attribute size is different than the server's current value, the server treats this as a modification regardless of the value of the change attribute retrieved via CB_GETATTR and responds to the second client as in the last step. This methodology resolves issues of clock differences between client and server and other scenarios where the use of CB_GETATTR break down. It should be noted that the server is under no obligation to use CB_GETATTR and therefore the server MAY simply recall the delegation to avoid its use. 9.4.4. Recall of Open Delegation The following events necessitate recall of an open delegation: o Potentially conflicting OPEN request (or READ/WRITE done with "special" stateid) o SETATTR issued by another client o REMOVE request for the file o RENAME request for the file as either source or target of the RENAME Whether a RENAME of a directory in the path leading to the file results in recall of an open delegation depends on the semantics of the server filesystem. If that filesystem denies such RENAMEs when a file is open, the recall must be performed to determine whether the file in question is, in fact, open. In addition to the situations above, the server may choose to recall open delegations at any time if resource constraints make it advisable to do so. Clients should always be prepared for the possibility of recall. Shepler, et al. Standards Track [Page 109] RFC 3530 NFS version 4 Protocol April 2003 When a client receives a recall for an open delegation, it needs to update state on the server before returning the delegation. These same updates must be done whenever a client chooses to return a delegation voluntarily. The following items of state need to be dealt with: o If the file associated with the delegation is no longer open and no previous CLOSE operation has been sent to the server, a CLOSE operation must be sent to the server. 
o If a file has other open references at the client, then OPEN operations must be sent to the server. The appropriate stateids will be provided by the server for subsequent use by the client since the delegation stateid will not longer be valid. These OPEN requests are done with the claim type of CLAIM_DELEGATE_CUR. This will allow the presentation of the delegation stateid so that the client can establish the appropriate rights to perform the OPEN. (see the section "Operation 18: OPEN" for details.) o If there are granted file locks, the corresponding LOCK operations need to be performed. This applies to the write open delegation case only. o For a write open delegation, if at the time of recall the file is not open for write, all modified data for the file must be flushed to the server. If the delegation had not existed, the client would have done this data flush before the CLOSE operation. o For a write open delegation when a file is still open at the time of recall, any modified data for the file needs to be flushed to the server. o With the write open delegation in place, it is possible that the file was truncated during the duration of the delegation. For example, the truncation could have occurred as a result of an OPEN UNCHECKED with a size attribute value of zero. Therefore, if a truncation of the file has occurred and this operation has not been propagated to the server, the truncation must occur before any modified data is written to the server. In the case of write open delegation, file locking imposes some additional requirements. To precisely maintain the associated invariant, it is required to flush any modified data in any region for which a write lock was released while the write delegation was in effect. 
However, because the write open delegation implies no other locking by other clients, a simpler implementation is to flush all modified data for the file (as described just above) if any write lock has been released while the write open delegation was in effect. Shepler, et al. Standards Track [Page 110] RFC 3530 NFS version 4 Protocol April 2003 An implementation need not wait until delegation recall (or deciding to voluntarily return a delegation) to perform any of the above actions, if implementation considerations (e.g., resource availability constraints) make that desirable. Generally, however, the fact that the actual open state of the file may continue to change makes it not worthwhile to send information about opens and closes to the server, except as part of delegation return. Only in the case of closing the open that resulted in obtaining the delegation would clients be likely to do this early, since, in that case, the close once done will not be undone. Regardless of the client's choices on scheduling these actions, all must be performed before the delegation is returned, including (when applicable) the close that corresponds to the open that resulted in the delegation. These actions can be performed either in previous requests or in previous operations in the same COMPOUND request. 9.4.5. Clients that Fail to Honor Delegation Recalls A client may fail to respond to a recall for various reasons, such as a failure of the callback path from server to the client. The client may be unaware of a failure in the callback path. This lack of awareness could result in the client finding out long after the failure that its delegation has been revoked, and another client has modified the data for which the client had a delegation. This is especially a problem for the client that held a write delegation. 
The server also has a dilemma in that the client that fails to respond to the recall might also be sending other NFS requests, including those that renew the lease before the lease expires. Without returning an error for those lease renewing operations, the server leads the client to believe that the delegation it has is in force. This difficulty is solved by the following rules: o When the callback path is down, the server MUST NOT revoke the delegation if one of the following occurs: - The client has issued a RENEW operation and the server has returned an NFS4ERR_CB_PATH_DOWN error. The server MUST renew the lease for any record locks and share reservations the client has that the server has known about (as opposed to those locks and share reservations the client has established but not yet sent to the server, due to the delegation). The server SHOULD give the client a reasonable time to return its delegations to the server before revoking the client's delegations. Shepler, et al. Standards Track [Page 111] RFC 3530 NFS version 4 Protocol April 2003 - The client has not issued a RENEW operation for some period of time after the server attempted to recall the delegation. This period of time MUST NOT be less than the value of the lease_time attribute. o When the client holds a delegation, it can not rely on operations, except for RENEW, that take a stateid, to renew delegation leases across callback path failures. The client that wants to keep delegations in force across callback path failures must use RENEW to do so. 9.4.6. Delegation Revocation At the point a delegation is revoked, if there are associated opens on the client, the applications holding these opens need to be notified. This notification usually occurs by returning errors for READ/WRITE operations or when a close is attempted for the open file. If no opens exist for the file at the point the delegation is revoked, then notification of the revocation is unnecessary. 
However, if there is modified data present at the client for the file, the user of the application should be notified. Unfortunately, it may not be possible to notify the user since active applications may not be present at the client. See the section "Revocation Recovery for Write Open Delegation" for additional details. 9.5. Data Caching and Revocation When locks and delegations are revoked, the assumptions upon which successful caching depend are no longer guaranteed. For any locks or share reservations that have been revoked, the corresponding owner needs to be notified. This notification includes applications with a file open that has a corresponding delegation which has been revoked. Cached data associated with the revocation must be removed from the client. In the case of modified data existing in the client's cache, that data must be removed from the client without it being written to the server. As mentioned, the assumptions made by the client are no longer valid at the point when a lock or delegation has been revoked. For example, another client may have been granted a conflicting lock after the revocation of the lock at the first client. Therefore, the data within the lock range may have been modified by the other client. Obviously, the first client is unable to guarantee to the application what has occurred to the file in the case of revocation. Notification to a lock owner will in many cases consist of simply returning an error on the next and all subsequent READs/WRITEs to the open file or on the close. Where the methods available to a client make such notification impossible because errors for certain Shepler, et al. Standards Track [Page 112] Niemi, et al. Expires July 26, 2012 [Page 18] Internet-Draft Multi-party Chat MSRP January 2012 chat room. 
For example, a participant would like to learn if the MSRP switch supports private messaging, otherwise, the participant may send what he believes is a private instant message addressed to a participant, but since the MSRP switch does not support the functions specified in this memo, the message gets eventually distributed to all the participants of the chat room. The reverse case also exists. A participant, say Alice, whose user agent does not support the extensions defined by this document joins the chat room. The MSRP switch learns that Alice's application does not support private messaging nor nicknames. If another participant, say Bob, sends a private message to Alice, the MSRP switch does not distribute it to Alice, because Alice is not able to differentiate it from a regular message sent to the whole roster. Furthermore, if Alice replied to this message, she would do it to the whole roster. Because of this, the MSRP switch also keeps track of users who do not support the extensions defined in this document. In another scenario, the policy of a chat room may indicate that certain functions are not allowed. For example, the policy may indicate that nicknames or private messages are not allowed. In order to provide the user with a good chat room experience, we define a new 'chatroom' SDP attribute. The 'chatroom' attribute is a media-level value attribute [RFC4566] that MAY be included in conjunction with an MSRP media stream (i.e., when an m= line in SDP indicates "TCP/MSRP" or "TCP/TLS/MSRP"). The 'chatroom' attribute without further modifiers (e.g., chat-tokens) indicates that the endpoint supports the procedures described in this document for transferring MSRP messages to/from a multi-party conference. The 'chatroom' attribute can be complemented with additional modifiers that further indicate the intersection of support and chat room local policy allowance for a number of functions specified in this document. 
Specifically, we provide the means for indicating support to use nicknames and private messaging.

The 'chatroom' SDP attribute has the following Augmented BNF (ABNF) [RFC5234] syntax:

   attribute         =/ chatroom-attr
                        ; attribute defined in RFC 4566
   chatroom-attr     = chatroom-label [":" chat-token *(SP chat-token)]
   chatroom-label    = "chatroom"
   chat-token        = (nicknames-token / private-msg-token / ext-token)
   nicknames-token   = "nickname"
   private-msg-token = "private-messages"
   ext-token         = private-token / standard-token
   private-token     = toplabel "." *(domainlabel ".") token
                        ; toplabel defined in RFC 3261
                        ; domainlabel defined in RFC 3261
                        ; token defined in RFC 3261
   standard-token    = token

A given 'chat-token' value MUST NOT appear more than once in a 'chatroom' attribute.

A conference focus that includes the 'nicknames' token in the session description is signaling that the MSRP switch supports and the chat room allows the use of the procedures specified in Section 7. A conference focus that includes the 'private-messages' token in the SDP description is signaling that the MSRP switch supports and the chat room allows the use of the procedures specified in Section 6.2.

Example of the 'chatroom' attribute for an MSRP media stream that indicates the acceptance of nicknames and private messages:

   a=chatroom:nickname private-messages

An example of a 'chatroom' attribute for an MSRP media stream where the endpoint, e.g., an MSRP switch, allows neither nicknames nor private messages:

   a=chatroom

The 'chatroom' attribute allows extensibility with the addition of new tokens. No IANA registry is provided at this time, since no extensions are expected at the time of this writing. Extensions to the 'chatroom' attribute can be defined in IETF documents or as private vendor extensions.

Extensions defined in IETF documents MUST follow the 'standard-token' ABNF previously defined.
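As an informal illustration of the attribute syntax above, the following Python sketch reads a single SDP attribute line and returns the set of chat-tokens it carries. The function name `parse_chatroom` is hypothetical. The sketch only splits on single spaces and checks for repeated tokens; it does not validate each token against the RFC 3261 'token' grammar, which a real parser would need to do.

```python
def parse_chatroom(line):
    """Return the set of chat-tokens in an 'a=chatroom' line.

    Returns None when the line is not a 'chatroom' attribute and raises
    ValueError when the token list is malformed or repeats a token.
    """
    line = line.strip()
    if line == "a=chatroom":
        # Bare attribute: multi-party chat supported, no optional functions.
        return set()
    prefix = "a=chatroom:"
    if not line.startswith(prefix):
        return None
    tokens = line[len(prefix):].split(" ")
    if "" in tokens:
        # Empty list after ':' or more than one SP between tokens.
        raise ValueError("malformed chat-token list")
    if len(tokens) != len(set(tokens)):
        raise ValueError("chat-token appears more than once")
    return set(tokens)


both = parse_chatroom("a=chatroom:nickname private-messages")
bare = parse_chatroom("a=chatroom")
other = parse_chatroom("a=accept-types:message/cpim")
print(sorted(both), bare, other)
```

A caller could then test `"private-messages" in both` before offering private messaging to the user, and treat a `None` result as meaning the remote endpoint did not include the attribute at all.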
In this type of extension, care must be taken in the selection of the token to avoid a clash with any of the tokens previously defined.

Private extensions MUST follow the 'private-token' ABNF previously defined. The 'private-token' MUST include the DNS name of the vendor in reverse order in the token, in order to avoid clashes of tokens.

The following is an example of a "chat.foo" extension by vendor "example.com":

   a=chatroom:nickname private-messages com.example.chat.foo

9. Examples

9.1. Joining a chat room

Figure 3 presents a flow diagram where Alice joins a chat room by sending an INVITE request. This INVITE request contains a session description that includes the chatroom extensions defined in this document.

   Alice               Conference focus
     |                        |
     |F1: (SIP) INVITE        |
     |----------------------->|
     |F2: (SIP) 200 OK        |
     |

RFC 3530 NFS version 4 Protocol April 2003

operations may not be returned, more drastic action such as signals or process termination may be appropriate. The justification for this is that an invariant on which an application depends may be violated. Depending on how errors are typically treated for the client operating environment, further levels of notification including logging, console messages, and GUI pop-ups may be appropriate.

9.5.1. Revocation Recovery for Write Open Delegation

Revocation recovery for a write open delegation poses the special issue of modified data in the client cache while the file is not open. In this situation, any client which does not flush modified data to the server on each close must ensure that the user receives appropriate notification of the failure as a result of the revocation. Since such situations may require human action to correct problems, notification schemes in which the appropriate user or administrator is notified may be necessary. Logging and console messages are typical examples.
If there is modified data on the client, it must not be flushed normally to the server. A client may attempt to provide a copy of the file data as modified during the delegation under a different name in the filesystem name space to ease recovery. Note that when the client can determine that the file has not been modified by any other client, or when the client has a complete cached copy of file in question, such a saved copy of the client's view of the file may be of particular value for recovery. In other case, recovery using a copy of the file based partially on the client's cached data and partially on the server copy as modified by other clients, will be anything but straightforward, so clients may avoid saving file contents in these situations or mark the results specially to warn users of possible problems. Saving of such modified data in delegation revocation situations may be limited to files of a certain size or might be used only when sufficient disk space is available within the target filesystem. Such saving may also be restricted to situations when the client has sufficient buffering resources to keep the cached copy available until it is properly stored to the target filesystem. 9.6. Attribute Caching The attributes discussed in this section do not include named attributes. Individual named attributes are analogous to files and caching of the data for these needs to be handled just as data Shepler, et al. Standards Track [Page 113] RFC 3530 NFS version 4 Protocol April 2003 caching is for ordinary files. Similarly, LOOKUP results from an OPENATTR directory are to be cached on the same basis as any other pathnames and similarly for directory contents. Clients may cache file attributes obtained from the server and use them to avoid subsequent GETATTR requests. Such caching is write through in that modification to file attributes is always done by means of requests to the server and should not be done locally and cached. 
The exception to this are modifications to attributes that are intimately connected with data caching. Therefore, extending a file by writing data to the local data cache is reflected immediately in the size as seen on the client without this change being immediately reflected on the server. Normally such changes are not propagated directly to the server but when the modified data is flushed to the server, analogous attribute changes are made on the server. When open delegation is in effect, the modified attributes may be returned to the server in the response to a CB_RECALL call. The result of local caching of attributes is that the attribute caches maintained on individual clients will not be coherent. Changes made in one order on the server may be seen in a different order on one client and in a third order on a different client. The typical filesystem application programming interfaces do not provide means to atomically modify or interrogate attributes for multiple files at the same time. The following rules provide an environment where the potential incoherences mentioned above can be reasonably managed. These rules are derived from the practice of previous NFS protocols. o All attributes for a given file (per-fsid attributes excepted) are cached as a unit at the client so that no non-serializability can arise within the context of a single file. o An upper time boundary is maintained on how long a client cache entry can be kept without being refreshed from the server. o When operations are performed that change attributes at the server, the updated attribute set is requested as part of the containing RPC. This includes directory operations that update attributes indirectly. This is accomplished by following the modifying operation with a GETATTR operation and then using the results of the GETATTR to update the client's cached attributes. 
Note that if the full set of attributes to be cached is requested by READDIR, the results can be cached by the client on the same basis as attributes obtained via GETATTR. Shepler, et al. Standards Track [Page 114] RFC 3530 NFS version 4 Protocol April 2003 A client may validate its cached version of attributes for a file by fetching just both the change and time_access attributes and assuming that if the change attribute has the same value as it did when the attributes were cached, then no attributes other than time_access have changed. The reason why time_access is also fetched is because many servers operate in environments where the operation that updates change does not update time_access. For example, POSIX file semantics do not update access time when a file is modified by the write system call. Therefore, the client that wants a current time_access value should fetch it with change during the attribute cache validation processing and update its cached time_access. The client may maintain a cache of modified attributes for those attributes intimately connected with data of modified regular files (size, time_modify, and change). Other than those three attributes, the client MUST NOT maintain a cache of modified attributes. Instead, attribute changes are immediately sent to the server. In some operating environments, the equivalent to time_access is expected to be implicitly updated by each read of the content of the file object. If an NFS client is caching the content of a file object, whether it is a regular file, directory, or symbolic link, the client SHOULD NOT update the time_access attribute (via SETATTR or a small READ or READDIR request) on the server with each read that is satisfied from cache. The reason is that this can defeat the performance benefits of caching content, especially since an explicit SETATTR of time_access may alter the change attribute on the server. 
If the change attribute changes, clients that are caching the content will think the content has changed, and will re-read unmodified data from the server. Nor is the client encouraged to maintain a modified version of time_access in its cache, since this would mean that the client will either eventually have to write the access time to the server with bad performance effects, or it would never update the server's time_access, thereby resulting in a situation where an application that caches access time between a close and open of the same file observes the access time oscillating between the past and present. The time_access attribute always means the time of last access to a file by a read that was satisfied by the server. This way clients will tend to see only time_access changes that go forward in time. 9.7. Data and Metadata Caching and Memory Mapped Files Some operating environments include the capability for an application to map a file's content into the application's address space. Each time the application accesses a memory location that corresponds to a block that has not been loaded into the address space, a page fault occurs and the file is read (or if the block does not exist in the Shepler, et al. Standards Track [Page 115] RFC 3530 NFS version 4 Protocol April 2003 file, the block is allocated and then instantiated in the application's address space). As long as each memory mapped access to the file requires a page fault, the relevant attributes of the file that are used to detect access and modification (time_access, time_metadata, time_modify, and change) will be updated. However, in many operating environments, when page faults are not required these attributes will not be updated on reads or updates to the file via memory access (regardless whether the file is local file or is being access remotely). A client or server MAY fail to update attributes of a file that is being accessed via memory mapped I/O. 
This has several implications:

o  If there is an application on the server that has memory mapped a file that a client is also accessing, the client may not be able to get a consistent value of the change attribute to determine whether its cache is stale or not. A server that knows that the file is memory mapped could always pessimistically return updated values for change so as to force the application to always get the most up to date data and metadata for the file. However, due to the negative performance implications of this, such behavior is OPTIONAL.

o  If the memory mapped file is not being modified on the server, and instead is just being read by an application via the memory mapped interface, the client will not see an updated time_access attribute. However, in many operating environments, neither will any process running on the server. Thus NFS clients are at no disadvantage with respect to local processes.

o  If there is another client that is memory mapping the file, and if that client is holding a write delegation, the same set of issues as discussed in the previous two bullet items apply. So, when a server does a CB_GETATTR to a file that the client has modified in its cache, the response from CB_GETATTR will not necessarily be accurate. As discussed earlier, the client's obligation is to report that the file has been modified since the delegation was granted, not whether it has been modified again between successive CB_GETATTR calls, and the server MUST assume that any file the client has modified in cache has been modified again between successive CB_GETATTR calls. Depending on the nature of the client's memory management system, this weak obligation may not be possible. A client MAY return stale information in CB_GETATTR whenever the file is memory mapped.

o  The mixture of memory mapping and file locking on the same file is problematic. Consider the following scenario, where the page size on each client is 8192 bytes.
   -  Client A memory maps first page (8192 bytes) of file X

   -  Client B memory maps first page (8192 bytes) of file X

   -  Client A write locks first 4096 bytes

   -  Client B write locks second 4096 bytes

   -  Client A, via a STORE instruction modifies part of its locked region.

   -  Simultaneous to client A, client B issues a STORE on part of its locked region.

   Here the challenge is for each client to resynchronize to get a correct view of the first page. In many operating environments, the virtual memory management systems on each client only know a page is modified, not that a subset of the page corresponding to the respective lock regions has been modified. So it is not possible for each client to do the right thing, which is to only write to the server that portion of the page that is locked. For example, if client A simply writes out the page, and then client B writes out the page, client A's data is lost.

   Moreover, if mandatory locking is enabled on the file, then we have a different problem. When clients A and B issue the STORE instructions, the resulting page faults require a record lock on the entire page. Each client then tries to extend their locked range to the entire page, which results in a deadlock.

   Communicating the NFS4ERR_DEADLOCK error to a STORE instruction is difficult at best.

   If a client is locking the entire memory mapped file, there is no problem with advisory or mandatory record locking, at least until the client unlocks a region in the middle of the file.

Given the above issues the following are permitted:

-  Clients and servers MAY deny memory mapping a file they know there are record locks for.

-  Clients and servers MAY deny a record lock on a file they know is memory mapped.

-  A client MAY deny memory mapping a file that it knows requires mandatory locking for I/O.
If mandatory locking is enabled after the file is opened and mapped, the client MAY deny the application further access to its mapped file. 9.8. Name Caching The results of LOOKUP and READDIR operations may be cached to avoid the cost of subsequent LOOKUP operations. Just as in the case of attribute caching, inconsistencies may arise among the various client caches. To mitigate the effects of these inconsistencies and given the context of typical filesystem APIs, an upper time boundary is maintained on how long a client name cache entry can be kept without verifying that the entry has not been made invalid by a directory change operation performed by another client. When a client is not making changes to a directory for which there exist name cache entries, the client needs to periodically fetch attributes for that directory to ensure that it is not being modified. After determining that no modification has occurred, the expiration time for the associated name cache entries may be updated to be the current time plus the name cache staleness bound. When a client is making changes to a given directory, it needs to determine whether there have been changes made to the directory by other clients. It does this by using the change attribute as reported before and after the directory operation in the associated change_info4 value returned for the operation. The server is able to communicate to the client whether the change_info4 data is provided atomically with respect to the directory operation. If the change values are provided atomically, the client is then able to compare the pre-operation change value with the change value in the client's name cache. If the comparison indicates that the directory was updated by another client, the name cache associated with the modified directory is purged from the client. If the comparison indicates no modification, the name cache can be updated on the client to reflect the directory operation and the associated timeout extended. 
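The name cache handling described above for a client that is itself changing a directory can be sketched as follows. This is only an illustrative sketch in Python, not part of the protocol: the class names, the `apply_directory_op` helper, and the 30-second staleness bound are hypothetical. Only the `atomic`, `before`, and `after` fields mirror the change_info4 structure. Treating a non-atomic change_info4 as grounds for a purge is a conservative assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ChangeInfo4:
    """Mirrors the change_info4 fields returned by a directory operation."""
    atomic: bool
    before: int
    after: int


@dataclass
class DirNameCache:
    """Hypothetical per-directory name cache kept by a client."""
    change: int                      # change value the cached names are based on
    expires_at: float                # time after which entries must be revalidated
    entries: Dict[str, str] = field(default_factory=dict)


STALENESS_BOUND = 30.0  # assumed name cache staleness bound, in seconds


def apply_directory_op(cache: DirNameCache, cinfo: ChangeInfo4, name: str,
                       filehandle: Optional[str], now: float,
                       bound: float = STALENESS_BOUND) -> bool:
    """Update the cache after this client's own directory operation.

    Returns True if the cache had to be purged.
    """
    if not cinfo.atomic or cinfo.before != cache.change:
        # Another client may have modified the directory: purge the names.
        cache.entries.clear()
        purged = True
    else:
        # No outside modification: reflect our own operation in the cache.
        if filehandle is None:
            cache.entries.pop(name, None)      # e.g. REMOVE
        else:
            cache.entries[name] = filehandle   # e.g. CREATE or LINK
        cache.expires_at = now + bound         # extend the timeout
        purged = False
    # Save the post-operation value as the basis for future comparisons.
    cache.change = cinfo.after
    return purged
```

In this sketch a `filehandle` of `None` stands for an operation that removes the name; a real client would carry richer per-entry state.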
The post-operation change value needs to be saved as the basis for future change_info4 comparisons.

As demonstrated by the scenario above, name caching requires that the client revalidate name cache data by inspecting the change attribute of a directory at the point when the name cache item was cached. This requires that the server update the change attribute for directories when the contents of the corresponding directory are modified. For a client to use the change_info4 information appropriately and correctly, the server must report the pre- and post-operation change attribute values atomically. When the server is unable to report the before and after values atomically with respect to the directory operation, the server must indicate that fact in the change_info4 return value. When the information is not atomically reported, the client should not assume that other clients have not changed the directory.

9.9. Directory Caching

The results of READDIR operations may be used to avoid subsequent READDIR operations. Just as in the cases of attribute and name caching, inconsistencies may arise among the various client caches. To mitigate the effects of these inconsistencies, and given the context of typical filesystem APIs, the following rules should be followed:

o  Cached READDIR information for a directory which is not obtained in a single READDIR operation must always be a consistent snapshot of directory contents. This is determined by using a GETATTR before the first READDIR and after the last READDIR that contributes to the cache.

o  An upper time boundary is maintained to indicate the length of time a directory cache entry is considered valid before the client must revalidate the cached information.

The revalidation technique parallels that discussed in the case of name caching.
When the client is not changing the directory in question, checking the change attribute of the directory with GETATTR is adequate. The lifetime of the cache entry can be extended at these checkpoints. When a client is modifying the directory, the client needs to use the change_info4 data to determine whether there are other clients modifying the directory. If it is determined that no other client modifications are occurring, the client may update its directory cache to reflect its own changes.

As demonstrated previously, directory caching requires that the client revalidate directory cache data by inspecting the change attribute of a directory at the point when the directory was cached. This requires that the server update the change attribute for directories when the contents of the corresponding directory are modified. For a client to use the change_info4 information appropriately and correctly, the server must report the pre- and post-operation change attribute values atomically. When the server is unable to report the before and after values atomically with respect to the directory operation, the server must indicate that fact in the change_info4 return value. When the information is not atomically reported, the client should not assume that other clients have not changed the directory.

10. Minor Versioning

To address the requirement of an NFS protocol that can evolve as the need arises, the NFS version 4 protocol contains the rules and framework to allow for future minor changes or versioning.

The base assumption with respect to minor versioning is that any future accepted minor version must follow the IETF process and be documented in a standards track RFC. Therefore, each minor version number will correspond to an RFC. Minor version zero of the NFS version 4 protocol is represented by this RFC.
The COMPOUND procedure will support the encoding of the minor version being requested by the client.

The following items represent the basic rules for the development of minor versions. Note that a future minor version may decide to modify or add to the following rules as part of the minor version definition.

1.  Procedures are not added or deleted

    To maintain the general RPC model, NFS version 4 minor versions will not add to or delete procedures from the NFS program.

2.  Minor versions may add operations to the COMPOUND and CB_COMPOUND procedures.

    The addition of operations to the COMPOUND and CB_COMPOUND procedures does not affect the RPC model.

    2.1  Minor versions may append attributes to GETATTR4args, bitmap4, and GETATTR4res.

         This allows for the expansion of the attribute model to allow for future growth or adaptation.

    2.2  Minor version X must append any new attributes after the last documented attribute.

         Since attribute results are specified as an opaque array of per-attribute XDR encoded results, the complexity of adding new attributes in the midst of the current definitions will be too burdensome.

3.  Minor versions must not modify the structure of an existing operation's arguments or results.

    Again the complexity of handling multiple structure definitions for a single operation is too burdensome. New operations should be added instead of modifying existing structures for a minor version.

    This rule does not preclude the following adaptations in a minor version.

    o  adding bits to flag fields such as new attributes to GETATTR's bitmap4 data type

    o  adding bits to existing attributes like ACLs that have flag words

    o  extending enumerated types (including NFS4ERR_*) with new values

4.  Minor versions may not modify the structure of existing attributes.

5.  Minor versions may not delete operations.
    This prevents the potential reuse of a particular operation "slot" in a future minor version.

6.  Minor versions may not delete attributes.

7.  Minor versions may not delete flag bits or enumeration values.

8.  Minor versions may declare an operation as mandatory to NOT implement.

    Specifying an operation as "mandatory to not implement" is equivalent to obsoleting an operation. For the client, it means that the operation should not be sent to the server. For the server, an NFS error can be returned as opposed to "dropping" the request as an XDR decode error. This approach allows for the obsolescence of an operation while maintaining its structure so that a future minor version can reintroduce the operation.

    8.1  Minor versions may declare attributes mandatory to NOT implement.

    8.2  Minor versions may declare flag bits or enumeration values as mandatory to NOT implement.

9.  Minor versions may downgrade features from mandatory to recommended, or recommended to optional.

10. Minor versions may upgrade features from optional to recommended or recommended to mandatory.

11. A client and server that support minor version X must support minor versions 0 (zero) through X-1 as well.

12. No new features may be introduced as mandatory in a minor version.

    This rule allows for the introduction of new functionality and forces the use of implementation experience before designating a feature as mandatory.

13. A client MUST NOT attempt to use a stateid, filehandle, or similar returned object from the COMPOUND procedure with minor version X for another COMPOUND procedure with minor version Y, where X != Y.

11. Internationalization

The primary issue in which NFS version 4 needs to deal with internationalization, or I18N, is with respect to file names and other strings as used within the protocol.
The choice of string representation must allow reasonable name/string access to clients which use various languages. The UTF-8 encoding of the UCS as defined by [ISO10646] allows for this type of access and follows the policy described in "IETF Policy on Character Sets and Languages", [RFC2277].

[RFC3454], otherwise known as "stringprep", documents a framework for using Unicode/UTF-8 in networking protocols, so as "to increase the likelihood that string input and string comparison work in ways that make sense for typical users throughout the world." A protocol must define a profile of stringprep "in order to fully specify the processing options." The remainder of this Internationalization section defines the NFS version 4 stringprep profiles. Much of the terminology used for the remainder of this section comes from stringprep.

There are three UTF-8 string types defined for NFS version 4: utf8str_cs, utf8str_cis, and utf8str_mixed. Separate profiles are defined for each. Each profile defines the following, as required by stringprep:

o  The intended applicability of the profile

o  The character repertoire that is the input and output to stringprep (which is Unicode 3.2 for the referenced version of stringprep)

o  The mapping tables from stringprep used (as described in section 3 of stringprep)

o  Any additional mapping tables specific to the profile

o  The Unicode normalization used, if any (as described in section 4 of stringprep)

o  The tables from stringprep listing of characters that are prohibited as output (as described in section 5 of stringprep)

o  The bidirectional string testing used, if any (as described in section 6 of stringprep)

o  Any additional characters that are prohibited as output specific to the profile

Stringprep discusses Unicode characters, whereas NFS version 4 renders UTF-8 characters.
Since there is a one to one mapping from UTF-8 to Unicode, wherever the remainder of this document refers to Unicode, the reader should assume UTF-8.

Much of the text for the profiles comes from [RFC3454].

11.1. Stringprep profile for the utf8str_cs type

Every use of the utf8str_cs type definition in the NFS version 4 protocol specification follows the profile named nfs4_cs_prep.

11.1.1. Intended applicability of the nfs4_cs_prep profile

The utf8str_cs type is a case sensitive string of UTF-8 characters. Its primary use in NFS Version 4 is for naming components and pathnames. Components and pathnames are stored on the server

     |<-----------------------|
     |F3: (SIP) ACK           |
     |----------------------->|
     |                        |

Figure 3: Flow diagram of a user joining a chat room

F1: Alice constructs an SDP description that includes an MSRP media stream. She also indicates her support for the chatroom extensions defined in this document. She sends the INVITE request to the chat room server.

   INVITE sip:chatroom22@chat.example.com SIP/2.0
   Via: SIP/2.0/TCP client.atlanta.example.com:5060;branch=z9hG4bK74bf9
   Max-Forwards: 70
   From: Alice <sip:alice@atlanta.example.com>;tag=9fxced76sl
   To: Chatroom 22 <sip:chatroom22@chat.example.com>
   Call-ID: 3848276298220188511@atlanta.example.com
   CSeq: 1 INVITE
   Contact: <sip:alice@client.atlanta.example.com;transport=tcp>
   Content-Type: application/sdp
   Content-Length: 290

   v=0
   o=alice 2890844526 2890844526 IN IP4 client.atlanta.example.com
   s=-
   c=IN IP4 client.atlanta.example.com
   m=message 7654 TCP/MSRP *
   a=accept-types:message/cpim text/plain text/html
   a=path:msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   a=chatroom:nickname private-messages

F2: The chat room server accepts the session establishment. It includes the 'isfocus' and other relevant feature tags in the Contact header field of the response.
The chat room server also builds an SDP answer that forces the reception of messages wrapped in Message/CPIM wrappers. It also includes the 'chatroom' attribute with the allowed extensions.

   SIP/2.0 200 OK
   Via: SIP/2.0/TCP client.atlanta.example.com:5060;branch=z9hG4bK74bf9
    ;received=192.0.2.101
   From: Alice <sip:alice@atlanta.example.com>;tag=9fxced76sl
   To: Chatroom 22 <sip:chatroom22@chat.example.com>;tag=8321234356
   Call-ID: 3848276298220188511@atlanta.example.com
   CSeq: 1 INVITE
   Contact: <sip:chatroom22@chat.example.com;transport=tcp> \
        ;methods="INVITE,BYE,OPTIONS,ACK,CANCEL,SUBSCRIBE,NOTIFY" \
        ;automata;isfocus;message;event="conference"
   Content-Type: application/sdp
   Content-Length: 290

   v=0
   o=chat 2890844527 2890844527 IN IP4 chat.example.com
   s=-
   c=IN IP4 chat.example.com
   m=message 12763 TCP/MSRP *
   a=accept-types:message/cpim
   a=accept-wrapped-types:text/plain text/html *
   a=path:msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   a=chatroom:nickname private-messages

F3: The session established is acknowledged (details not shown).

9.2. Setting up a nickname

Figure 4 shows an example of Alice setting up a nickname using the conference as provider. Her first proposal is not accepted because that proposed nickname is already in use. Then, she makes a second proposal with a new nickname. This second proposal is accepted.

   Alice                  MSRP switch
     |                        |
     |F1: (MSRP) NICKNAME     |
     |----------------------->|
     |F2: (MSRP) 423          |
     |<-----------------------|
     |F3: (MSRP) NICKNAME     |
     |----------------------->|
     |F4: (MSRP) 200          |
     |<-----------------------|
     |                        |

Figure 4: Flow diagram of a user setting up her nickname

F1: Alice sends an MSRP NICKNAME request that contains her proposed nickname in the Use-Nickname header field.
   MSRP d93kswow NICKNAME
   To-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   From-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   Use-Nickname: "Alice the great"
   -------d93kswow$

F2: The MSRP switch analyzes the existing allocation of nicknames and detects that the nickname "Alice the great" is already provided to another participant in the chat room. The MSRP switch answers with a 423 response.

   MSRP d93kswow 423 Nickname usage failed
   To-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   From-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   -------d93kswow$

F3: Alice receives the response. She proposes a new nickname in a second NICKNAME request.

   MSRP 09swk2d NICKNAME
   To-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   From-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   Use-Nickname: "Alice in Wonderland"
   -------09swk2d$

F4: The MSRP switch accepts the nickname proposal and answers with a 200 response.

   MSRP 09swk2d 200 OK
   To-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   From-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   -------09swk2d$

9.3. Sending a regular message to the chat room

Figure 5 depicts a flow diagram where Alice is sending a regular message addressed to the chat room. The MSRP switch distributes the message to the rest of the participants.

   Alice               MSRP switch                 Bob     Charlie
     |                      |                        |       |
     | F1: (MSRP) SEND      |                        |       |
     |--------------------->| F3: (MSRP) SEND        |       |
     | F2: (MSRP) 200       |----------------------->|       |
     |<---------------------| F4: (MSRP) SEND        |       |
     |                      |------------------------------->|
     |                      | F5: (MSRP) 200 OK      |       |
     |                      |<-----------------------|       |
     |                      | F6: (MSRP) 200 OK      |       |
     |                      |<------------------------------ |
     |                      |                        |       |
     |                      |                        |       |

Figure 5: Sending a regular message to the chat room

F1: Alice builds a text message and wraps it in a Message/CPIM wrapper. She addresses the message to the chat room.
She encloses the resulting Message/CPIM wrapper in an MSRP SEND request and sends it to the MSRP switch via the existing TCP connection.

   MSRP 3490visdm SEND
   To-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   From-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   Message-ID: 99s9s2
   Byte-Range: 1-*/*
   Content-Type: message/cpim

   To: <sip:chatroom22@chat.example.com;transport=tcp>
   From: <sip:alice@atlanta.example.com>
   DateTime: 2009-03-02T15:02:31-03:00
   Content-Type: text/plain

   Hello guys, how are you today?
   -------3490visdm$

F2: The MSRP switch acknowledges the reception of the SEND request with a 200 (OK) response.

   MSRP 3490visdm 200 OK
   To-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   From-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   Message-ID: 99s9s2
   -------3490visdm$

F3: The MSRP switch creates a new MSRP SEND request that contains the received Message/CPIM wrapper and sends it to Bob.

   MSRP 490ej23 SEND
   To-Path: msrp://client.biloxi.example.com:4923/49dufdje2;tcp
   From-Path: msrp://chat.example.com:5678/jofofo3;tcp
   Message-ID: 304sse2
   Byte-Range: 1-*/*
   Content-Type: message/cpim

   To: <sip:chatroom22@chat.example.com;transport=tcp>
   From: <sip:alice@atlanta.example.com>
   DateTime: 2009-03-02T15:02:31-03:00
   Content-Type: text/plain

   Hello guys, how are you today?
   -------490ej23$

Since the received message is addressed to the chat room URI in the To header of the Message/CPIM wrapper, Bob knows that this is a regular message distributed to all participants in the chat room, rather than a private message addressed to him.

The rest of the message flows are analogous to the previous. They are not shown here.

9.4. Sending a private message to a participant

Figure 6 depicts a flow diagram where Alice is sending a private message addressed to Bob's SIP AOR. The MSRP switch distributes the message only to Bob.

   Alice               MSRP switch                 Bob
     |                      |                        |
     | F1: (MSRP) SEND      |                        |
     |--------------------->| F3: (MSRP) SEND        |
     | F2: (MSRP) 200       |----------------------->|
     |<---------------------| F4: (MSRP) 200         |
     |                      |<-----------------------|
     |                      |                        |

Figure 6: Sending a private message to Bob

F1: Alice builds a text message and wraps it in a Message/CPIM wrapper. She addresses the message to Bob's URI, which she learned from a notification in the conference event package. She encloses the resulting Message/CPIM wrapper in an MSRP SEND request and sends it to the MSRP switch via the existing TCP connection.

   MSRP 6959ssdf SEND
   To-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   From-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   Message-ID: okj3kw
   Byte-Range: 1-*/*
   Content-Type: message/cpim

   To: <sip:bob@example.com>
   From: <sip:alice@example.com>
   DateTime: 2009-03-02T15:02:31-03:00
   Content-Type: text/plain

   Hello Bob.
   -------6959ssdf$

F2: The MSRP switch acknowledges the reception of the SEND request with a 200 (OK) response. The response carries the same transaction identifier as the request.

   MSRP 6959ssdf 200 OK
   To-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   From-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   Message-ID: okj3kw
   -------6959ssdf$

F3: The MSRP switch creates a new MSRP SEND request that contains the received Message/CPIM wrapper and sends it only to Bob. Bob can distinguish the sender in the From header of the Message/CPIM wrapper. He also identifies this as a private message due to the presence of his own SIP AOR in the To header field of the Message/CPIM wrapper.
   MSRP 9v9s2 SEND
   To-Path: msrp://client.biloxi.example.com:4923/49dufdje2;tcp
   From-Path: msrp://chat.example.com:5678/jofofo3;tcp
   Message-ID: d9fghe982
   Byte-Range: 1-*/*
   Content-Type: message/cpim

   To: <sip:bob@example.com>
   From: <sip:alice@atlanta.example.com>
   DateTime: 2009-03-02T15:02:31-03:00
   Content-Type: text/plain

   Hello Bob.
   -------9v9s2$

F4: Bob acknowledges the reception of the SEND request with a 200 (OK) response.

   MSRP 9v9s2 200 OK
   To-Path: msrp://chat.example.com:5678/jofofo3;tcp
   From-Path: msrp://client.biloxi.example.com:4923/49dufdje2;tcp
   Message-ID: d9fghe982
   -------9v9s2$

9.5. Chunked private message

The MSRP message below depicts the example of the same private message described in Section 9.4, but now the message is split in two chunks. The MSRP switch must wait for the complete set of Message/CPIM headers before distributing the messages.

's filesystem. Two valid distinct UTF-8 strings might be the same after processing via the utf8str_cs profile. If the strings are two names inside a directory, the NFS version 4 server will need to either:

o  disallow the creation of a second name if its post-processed form collides with that of an existing name, or

o  allow the creation of the second name, but arrange so that after post-processing, the second name is different than the post-processed form of the first name.

11.1.2. Character repertoire of nfs4_cs_prep

The nfs4_cs_prep profile uses Unicode 3.2, as defined in stringprep's Appendix A.1.

11.1.3. Mapping used by nfs4_cs_prep

The nfs4_cs_prep profile specifies mapping using the following tables from stringprep:

   Table B.1

Table B.2 is normally not part of the nfs4_cs_prep profile as it is primarily for dealing with case-insensitive comparisons.
However, if the NFS version 4 file server supports the case_insensitive filesystem attribute, and if case_insensitive is true, the NFS version 4 server MUST use Table B.2 (in addition to Table B.1) when processing utf8str_cs strings, and the NFS version 4 client MUST assume Table B.2 (in addition to Table B.1) is being used.

If the case_preserving attribute is present and set to false, then the NFS version 4 server MUST use Table B.2 to map case when processing utf8str_cs strings. Whether the server maps from lower to upper case or from upper to lower case is an implementation dependency.

11.1.4. Normalization used by nfs4_cs_prep

The nfs4_cs_prep profile does not specify a normalization form. A later revision of this specification may specify a particular normalization form. Therefore, the server and client can expect that they may receive unnormalized characters within protocol requests and responses. If the operating environment requires normalization, then the implementation must normalize utf8str_cs strings within the protocol before presenting the information to an application (at the client) or local filesystem (at the server).

11.1.5. Prohibited output for nfs4_cs_prep

The nfs4_cs_prep profile specifies prohibiting using the following tables from stringprep:

   Table C.3
   Table C.4
   Table C.5
   Table C.6
   Table C.7
   Table C.8
   Table C.9

11.1.6. Bidirectional output for nfs4_cs_prep

The nfs4_cs_prep profile does not specify any checking of bidirectional strings.

11.2. Stringprep profile for the utf8str_cis type

Every use of the utf8str_cis type definition in the NFS version 4 protocol specification follows the profile named nfs4_cis_prep.

11.2.1. Intended applicability of the nfs4_cis_prep profile

The utf8str_cis type is a case insensitive string of UTF-8 characters. Its primary use in NFS Version 4 is for naming NFS servers.

11.2.2. Character repertoire of nfs4_cis_prep

The nfs4_cis_prep profile uses Unicode 3.2, as defined in stringprep's Appendix A.1.

11.2.3. Mapping used by nfs4_cis_prep

The nfs4_cis_prep profile specifies mapping using the following tables from stringprep:

   Table B.1
   Table B.2

11.2.4. Normalization used by nfs4_cis_prep

The nfs4_cis_prep profile specifies using Unicode normalization form KC, as described in stringprep.

11.2.5. Prohibited output for nfs4_cis_prep

The nfs4_cis_prep profile specifies prohibiting using the following tables from stringprep:

   Table C.1.2
   Table C.2.2
   Table C.3
   Table C.4
   Table C.5
   Table C.6
   Table C.7
   Table C.8
   Table C.9

11.2.6. Bidirectional output for nfs4_cis_prep

The nfs4_cis_prep profile specifies checking bidirectional strings as described in stringprep's section 6.

11.3. Stringprep profile for the utf8str_mixed type

Every use of the utf8str_mixed type definition in the NFS version 4 protocol specification follows the profile named nfs4_mixed_prep.

11.3.1. Intended applicability of the nfs4_mixed_prep profile

The utf8str_mixed type is a string of UTF-8 characters, with a prefix that is case sensitive, a separator equal to '@', and a suffix that is a fully qualified domain name. Its primary use in NFS Version 4 is for naming principals identified in an Access Control Entry.

11.3.2. Character repertoire of nfs4_mixed_prep

The nfs4_mixed_prep profile uses Unicode 3.2, as defined in stringprep's Appendix A.1.

11.3.3. Mapping used by nfs4_mixed_prep

For the prefix and the separator of a utf8str_mixed string, the nfs4_mixed_prep profile specifies mapping using the following table from stringprep:

   Table B.1

For the suffix of a utf8str_mixed string, the nfs4_mixed_prep profile specifies mapping using the following tables from stringprep:

   Table B.1
   Table B.2

11.3.4. Normalization used by nfs4_mixed_prep

The nfs4_mixed_prep profile specifies using Unicode normalization form KC, as described in stringprep.

11.3.5. Prohibited output for nfs4_mixed_prep

The nfs4_mixed_prep profile specifies prohibiting using the following tables from stringprep:

   Table C.1.2
   Table C.2.2
   Table C.3
   Table C.4
   Table C.5
   Table C.6
   Table C.7
   Table C.8
   Table C.9

11.3.6. Bidirectional output for nfs4_mixed_prep

The nfs4_mixed_prep profile specifies checking bidirectional strings as described in stringprep's section 6.

11.4. UTF-8 Related Errors

Where the client sends an invalid UTF-8 string, the server should return an NFS4ERR_INVAL error. This includes cases in which inappropriate prefixes are detected and where the count includes trailing bytes that do not constitute a full UCS character.

Where the client supplied string is valid UTF-8 but contains characters that are not supported by the server as a value for that string (e.g., names containing characters that have more than two octets on a filesystem that supports Unicode characters only), the server should return an NFS4ERR_BADCHAR error.

Where a UTF-8 string is used as a file name, and the filesystem, while supporting all of the characters within the name, does not allow that particular name to be used, the server should return the error NFS4ERR_BADNAME. This includes situations in which the server filesystem imposes a normalization constraint on name strings, but will also include such situations as filesystem prohibitions of "." and ".." as file names for certain operations, and other such constraints.

12. Error Definitions

NFS error numbers are assigned to failed operations within a compound request. A compound request contains a number of NFS operations that have their results encoded in sequence in a compound reply.
The results of successful operations will consist of an NFS4_OK status followed by the encoded results of the operation. If an NFS operation fails, an error status will be entered in the reply and the compound request will be terminated.

A description of each defined error follows:

NFS4_OK  Indicates the operation completed successfully.

NFS4ERR_ACCESS  Permission denied. The caller does not have the correct permission to perform the requested operation. Contrast this with NFS4ERR_PERM, which restricts itself to owner or privileged user permission failures.

NFS4ERR_ATTRNOTSUPP  An attribute specified is not supported by the server. Does not apply to the GETATTR operation.

NFS4ERR_ADMIN_REVOKED  Due to administrator intervention, the lockowner's record locks, share reservations, and delegations have been revoked by the server.

NFS4ERR_BADCHAR  A UTF-8 string contains a character which is not supported by the server in the context in which it is being used.

NFS4ERR_BAD_COOKIE  READDIR cookie is stale.

NFS4ERR_BADHANDLE  Illegal NFS filehandle. The filehandle failed internal consistency checks.

NFS4ERR_BADNAME  A name string in a request consists of valid UTF-8 characters supported by the server but the name is not supported by the server as a valid name for the current operation.
   MSRP 7443ruls SEND
   To-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   From-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   Message-ID: aft4to
   Byte-Range: 1-*/174
   Content-Type: message/cpim

   To: <sip:bob@example.com>
   From: <sip:alice@example.com>
   -------7443ruls$

   MSRP 7443ruls SEND
   To-Path: msrp://chat.example.com:12763/kjhd37s2s20w2a;tcp
   From-Path: msrp://client.atlanta.example.com:7654/jshA7weztas;tcp
   Message-ID: aft4to
   Byte-Range: 68-174/174
   Content-Type: message/cpim

   DateTime: 2009-03-02T15:02:31-03:00
   Content-Type: text/plain

   Hello Bob
   -------7443ruls$

9.6. Nickname in a conference information document

Figure 7 depicts two user elements in a conference information document, each carrying a nickname element with a nickname string.

   <?xml version="1.0" encoding="UTF-8"?>
   <conference-info xmlns="urn:ietf:params:xml:ns:conference-info"
        entity="sip:chatroom22@chat.example.com"
        state="full" version="1">
   <!-- CONFERENCE INFO -->
     <conference-description>
       <subject>MSRP nickname example</subject>
     </conference-description>
   <!-- CONFERENCE STATE -->
     <conference-state>
       <user-count>2</user-count>
     </conference-state>
   <!-- USERS -->
     <users>
       <user entity="sip:bob@example.com" state="full">
         <nickname>Dopey Donkey</nickname>
       </user>
   <!-- USER -->
       <user entity="sip:alice@atlanta.example.com" state="full">
         <nickname>Alice the great</nickname>
       </user>
     </users>
   </conference-info>

   Figure 7: Nickname in a conference information document

10. IANA Considerations

10.1. New MSRP Method

This specification defines a new MSRP method to be added to the Methods sub-registry of the Message Session Relay Protocol (MSRP) Parameters registry:

   NICKNAME

See Section 7 for details.
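
As an illustration of how a client might consume a document such as the one in Figure 7, the following minimal Python sketch extracts the nickname of each user. It assumes, as Figure 7 is written, that the nickname element is in the same default namespace as the rest of the document. The function name nicknames_by_entity and the constant FIGURE_7 are hypothetical and are not defined by this specification.

```python
import xml.etree.ElementTree as ET

# Default namespace used by the conference-info document in Figure 7.
CONF_NS = "urn:ietf:params:xml:ns:conference-info"

# The document of Figure 7, reproduced so the sketch is self-contained.
FIGURE_7 = """<?xml version="1.0" encoding="UTF-8"?>
<conference-info xmlns="urn:ietf:params:xml:ns:conference-info"
    entity="sip:chatroom22@chat.example.com" state="full" version="1">
  <conference-description>
    <subject>MSRP nickname example</subject>
  </conference-description>
  <conference-state>
    <user-count>2</user-count>
  </conference-state>
  <users>
    <user entity="sip:bob@example.com" state="full">
      <nickname>Dopey Donkey</nickname>
    </user>
    <user entity="sip:alice@atlanta.example.com" state="full">
      <nickname>Alice the great</nickname>
    </user>
  </users>
</conference-info>
"""


def nicknames_by_entity(document: str) -> dict:
    """Map each user's entity URI to the nickname carried in the document.

    Assumption of this sketch: <nickname> is a direct child of <user>
    and lives in the conference-info default namespace, as in Figure 7.
    """
    root = ET.fromstring(document.encode("utf-8"))
    ns = {"ci": CONF_NS}
    result = {}
    for user in root.findall("ci:users/ci:user", ns):
        nickname = user.find("ci:nickname", ns)
        if nickname is not None and nickname.text:
            result[user.get("entity")] = nickname.text.strip()
    return result
```

A user element without a nickname is simply skipped, so the returned mapping only contains participants that currently have a nickname.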
10.2. New MSRP Header

This specification defines a new MSRP header to be added to the Header Field sub-registry of the Message Session Relay Protocol (MSRP) Parameters registry:

   Use-Nickname

See Section 7 for details.

10.3. New MSRP Status Codes

This specification defines three new MSRP status codes to be added to the Status-Code sub-registry of the Message Session Relay Protocol (MSRP) Parameters registry.

The 404 status code indicates the failure to resolve the recipient URI in the To header field of the Message/CPIM wrapper in the SEND request, e.g., due to an unknown recipient. See Section 6.2 for details.

The 423 response indicates a failure in allocating the requested NICKNAME. This can be caused by a malformed NICKNAME request (e.g., no Use-Nickname header field), an already allocated nickname, or a policy that prevents the sender from using nicknames. See Section 7 for details.

The 428 status code indicates that the recipient of a SEND request does not support private messages. See Section 6.2 for details.

Table 1 summarizes the IANA registration data with respect to new MSRP status codes:

   +-------+---------------------------------------+-----------+
   | Value | Description                           | Reference |
   +-------+---------------------------------------+-----------+
   | 404   | Failure to resolve recipient's URI    | RFC XXXX  |
   | 423   | Unable to allocate requested nickname | RFC XXXX  |
   | 428   | Private messages not supported        | RFC XXXX  |
   +-------+---------------------------------------+-----------+

   Table 1: New status codes

10.4. New SDP Attribute

This specification defines a new media-level attribute in the Session Description Protocol (SDP) Parameters registry. The registration data is as follows:
   Contact: Miguel Garcia <miguel.a.garcia@ericsson.com>
   Phone: +34 91 339 1000
   Attribute name: chatroom
   Long-form attribute name: Chat Room
   Type of attribute: media level only
   This attribute is not subject to the charset attribute
   Description: This attribute identifies support and local policy allowance for a number of chat room related functions
   Specification: RFC XXXX

See Section 8 for details.

11. Security Considerations

This document proposes extensions to the Message Session Relay Protocol [RFC4975]. Therefore, the security considerations of that document apply to this document as well.

If the participant's SIP user agent doesn't understand the "isfocus" feature tag [RFC3840], it will not know that it is connected to a conference instance. The participant might not be notified that the participant's MSRP client will try to send messages to the MSRP switch having potentially multiple recipients.

If the participant's MSRP client doesn't support the extensions of this specification, it is unlikely that it will try to send a message using the 'Message/CPIM' wrapper content type [RFC3862], and the MSRP switch will reject the request with a 415 response [RFC4975]. Still, if a participant's MSRP client does create a message with a valid 'Message/CPIM' wrapper content type [RFC3862] having the To header set to the URI of the chat room and the From header set to the URI by which the participant is known to the conference, the participant might be unaware that the message can be forwarded to multiple recipients. Equally, if the To header is set to a valid URI of a recipient known to the conference, the message can be forwarded as a private message without the participant knowing.

If a participant wants to avoid eavesdropping, the participant

NFS4ERR_BADOWNER  An owner, owner_group, or ACL attribute value can not be translated to local representation.

NFS4ERR_BADTYPE  An attempt was made to create an object of a type not supported by the server.

NFS4ERR_BAD_RANGE  The range for a LOCK, LOCKT, or LOCKU operation is not appropriate to the allowable range of offsets for the server.

NFS4ERR_BAD_SEQID  The sequence number in a locking request is neither the next expected number nor the last number processed.

NFS4ERR_BAD_STATEID  A stateid generated by the current server instance, but which does not designate any locking state (either current or superseded) for a current lockowner-file pair, was used.

NFS4ERR_BADXDR  The server encountered an XDR decoding error while processing an operation.

NFS4ERR_CLID_INUSE  The SETCLIENTID operation has found that a client id is already in use by another client.

NFS4ERR_DEADLOCK  The server has been able to determine a file locking deadlock condition for a blocking lock request.

NFS4ERR_DELAY  The server initiated the request, but was not able to complete it in a timely fashion. The client should wait and then try the request with a new RPC transaction ID. For example, this error should be returned from a server that supports hierarchical storage and receives a request to process a file that has been migrated. In this case, the server should start the immigration process and respond to the client with this error. This error may also occur when a necessary delegation recall makes processing a request in a timely fashion impossible.

NFS4ERR_DENIED  An attempt to lock a file is denied. Since this may be a temporary condition, the client is encouraged to retry the lock request until the lock is accepted.

NFS4ERR_DQUOT  Resource (quota) hard limit exceeded. The user's resource limit on the server has been exceeded.

NFS4ERR_EXIST  File exists. The file specified already exists.

NFS4ERR_EXPIRED  A lease has expired that is being used in the current operation.

NFS4ERR_FBIG  File too large.
The operation would have caused a file to grow beyond the server's limit.

NFS4ERR_FHEXPIRED  The filehandle provided is volatile and has expired at the server.

NFS4ERR_FILE_OPEN  The operation can not be successfully processed because a file involved in the operation is currently open.

NFS4ERR_GRACE  The server is in its recovery or grace period which should match the lease period of the server.

NFS4ERR_INVAL  Invalid argument or unsupported argument for an operation. Two examples are attempting a READLINK on an object other than a symbolic link or specifying a value for an enum field that is not defined in the protocol (e.g., nfs_ftype4).

NFS4ERR_IO  I/O error. A hard error (for example, a disk error) occurred while processing the requested operation.

NFS4ERR_ISDIR  Is a directory. The caller specified a directory in a non-directory operation.

NFS4ERR_LEASE_MOVED  A lease being renewed is associated with a filesystem that has been migrated to a new server.

NFS4ERR_LOCKED  A read or write operation was attempted on a locked file.

NFS4ERR_LOCK_NOTSUPP  Server does not support atomic upgrade or downgrade of locks.

NFS4ERR_LOCK_RANGE  A lock request is operating on a sub-range of a current lock for the lock owner and the server does not support this type of request.

NFS4ERR_LOCKS_HELD  A CLOSE was attempted and file locks would exist after the CLOSE.

NFS4ERR_MINOR_VERS_MISMATCH  The server has received a request that specifies an unsupported minor version. The server must return a COMPOUND4res with a zero length operations result array.

NFS4ERR_MLINK  Too many hard links.

NFS4ERR_MOVED  The filesystem which contains the current filehandle object has been relocated or migrated to another server. The client may obtain the new filesystem location by obtaining the "fs_locations" attribute for the current filehandle. For further discussion, refer to the section "Filesystem Migration or Relocation".

NFS4ERR_NAMETOOLONG  The filename in an operation was too long.

NFS4ERR_NOENT  No such file or directory. The file or directory name specified does not exist.

NFS4ERR_NOFILEHANDLE  The logical current filehandle value (or, in the case of RESTOREFH, the saved filehandle value) has not been set properly. This may be a result of a malformed COMPOUND operation (i.e., no PUTFH or PUTROOTFH before an operation that requires the current filehandle be set).

NFS4ERR_NO_GRACE  A reclaim of client state has fallen outside of the grace period of the server. As a result, the server can not guarantee that conflicting state has not been provided to another client.

NFS4ERR_NOSPC  No space left on device. The operation would have caused the server's filesystem to exceed its limit.

NFS4ERR_NOTDIR  Not a directory. The caller specified a non-directory in a directory operation.

NFS4ERR_NOTEMPTY  An attempt was made to remove a directory that was not empty.

NFS4ERR_NOTSUPP  Operation is not supported.

NFS4ERR_NOT_SAME  This error is returned by the VERIFY operation to signify that the attributes compared were not the same as provided in the client's request.

NFS4ERR_NXIO  I/O error. No such device or address.

NFS4ERR_OLD_STATEID  A stateid which designates the locking state for a lockowner-file at an earlier time was used.

NFS4ERR_OPENMODE  The client attempted a READ, WRITE, LOCK or SETATTR operation not sanctioned by the stateid passed (e.g., writing to a file opened only for read).

NFS4ERR_OP_ILLEGAL  An illegal operation value has been specified in the argop field of a COMPOUND or CB_COMPOUND procedure.

NFS4ERR_PERM  Not owner. The operation was not allowed because the caller is either not a privileged user (root) or not the owner of the target of the operation.

NFS4ERR_RECLAIM_BAD  The reclaim provided by the client does not match any of the server's state consistency checks and is bad.

NFS4ERR_RECLAIM_CONFLICT  The reclaim provided by the client has encountered a conflict and can not be provided. Potentially indicates a misbehaving client.

NFS4ERR_RESOURCE  For the processing of the COMPOUND procedure, the server may exhaust available resources and can not continue processing operations within the COMPOUND procedure. This error will be returned from the server in those instances of resource exhaustion related to the processing of the COMPOUND procedure.

NFS4ERR_RESTOREFH  The RESTOREFH operation does not have a saved filehandle (identified by SAVEFH) to operate upon.

NFS4ERR_ROFS  Read-only filesystem. A modifying operation was attempted on a read-only filesystem.

NFS4ERR_SAME  This error is returned by the NVERIFY operation to signify that the attributes compared were the same as provided in the client's request.

NFS4ERR_SERVERFAULT  An error occurred on the server which does not map to any of the legal NFS version 4 protocol error values. The client should translate this into an appropriate error. UNIX clients may choose to translate this to EIO.

NFS4ERR_SHARE_DENIED  An attempt to OPEN a file with a share reservation has failed because of a share conflict.

NFS4ERR_STALE  Invalid filehandle. The filehandle given in the arguments was invalid. The file referred to by that filehandle no longer exists or access to it has been revoked.

NFS4ERR_STALE_CLIENTID  A clientid not recognized by the server was used in a locking or SETCLIENTID_CONFIRM request.

NFS4ERR_STALE_STATEID  A stateid generated by an earlier server instance was used.

NFS4ERR_SYMLINK  The current filehandle provided for a LOOKUP is not a directory but a symbolic link. Also used if the final component of the OPEN path is a symbolic link.

NFS4ERR_TOOSMALL  The encoded response to a READDIR request exceeds the size limit set by the initial request.

NFS4ERR_WRONGSEC  The security mechanism being used by the client for the operation does not match the server's security policy. The client should change the security mechanism being used and retry the operation.

NFS4ERR_XDEV  Attempt to do an operation between different fsids.

13. NFS version 4 Requests

For the NFS version 4 RPC program, there are two traditional RPC procedures: NULL and COMPOUND. All other functionality is defined as a set of operations and these operations are defined in normal XDR/RPC syntax and semantics. However, these operations are encapsulated within the COMPOUND procedure. This requires that the client combine one or more of the NFS version 4 operations into a single request.

The NFS4_CALLBACK program is used to provide server to client signaling and is constructed in a similar fashion as the NFS version 4 program. The procedures CB_NULL and CB_COMPOUND are defined in the same way as NULL and COMPOUND are within the NFS program. The CB_COMPOUND request also encapsulates the remaining operations of the NFS4_CALLBACK program. There is no predefined RPC program number for the NFS4_CALLBACK program. It is up to the client to specify a program number in the "transient" program range. The program and port number of the NFS4_CALLBACK program are provided by the client as part of the SETCLIENTID/SETCLIENTID_CONFIRM sequence. The program and port can be changed by another SETCLIENTID/SETCLIENTID_CONFIRM sequence, and it is possible to use the sequence to change them within a client incarnation without removing relevant leased client state.

13.1. Compound Procedure

The COMPOUND procedure provides the opportunity for better performance within high latency networks. The client can avoid cumulative latency of multiple RPCs by combining multiple dependent operations into a single COMPOUND procedure.
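
To make the evaluation rule concrete, the following Python sketch models a COMPOUND request as an ordered list of operations whose results are collected in sequence, with processing terminated at the first failed operation as described in the Error Definitions section. It is an illustration only: evaluate_compound, HANDLERS, and the toy handlers are hypothetical names, status codes are represented as plain strings, no XDR encoding is attempted, and reporting the status of the last evaluated operation as the overall status is an assumption of this sketch.

```python
from typing import Callable, Dict, List, Tuple

NFS4_OK = "NFS4_OK"

# A handler receives the operation arguments and returns (status, result).
Handler = Callable[[dict], Tuple[str, object]]


def evaluate_compound(
    ops: List[Tuple[str, dict]],
    handlers: Dict[str, Handler],
) -> Tuple[str, List[Tuple[str, str, object]]]:
    """Evaluate operations in order and stop at the first non-OK status.

    Returns the status of the last operation evaluated together with
    the per-operation results, in the order they were produced.
    """
    results = []
    status = NFS4_OK
    for name, args in ops:
        status, value = handlers[name](args)
        results.append((name, status, value))
        if status != NFS4_OK:
            break  # remaining operations are not evaluated
    return status, results


# Toy stand-ins for real operations (hypothetical behavior).
def _lookup(args: dict) -> Tuple[str, object]:
    if args.get("name") == "missing":
        return "NFS4ERR_NOENT", None
    return NFS4_OK, None


def _getattr(args: dict) -> Tuple[str, object]:
    return NFS4_OK, {"size": 0}


HANDLERS: Dict[str, Handler] = {"LOOKUP": _lookup, "GETATTR": _getattr}
```

With these handlers, a request of LOOKUP followed by GETATTR yields two results when the name exists, and a single NFS4ERR_NOENT result when it does not, because GETATTR is never evaluated after the failed LOOKUP.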
A compound operation may provide for protocol simplification by allowing the client to combine basic procedures into a single request that is customized for the client's environment.

The CB_COMPOUND procedure precisely parallels the features of COMPOUND as described above.

The basic structure of the COMPOUND procedure is:

   +-----+--------------+--------+-----------+-----------+-----------+--
   | tag | minorversion | numops | op + args | op + args | op + args |
   +-----+--------------+--------+-----------+-----------+-----------+--

's MSRP client can send the messages over a TLS [RFC5246] transport connection, as allowed by MSRP. It is up to the policy of the MSRP switch whether the messages are forwarded to the other participants in the chat room using TLS [RFC5246] transport.

Nicknames will be used to show the appearances of the participants of the conference. A successful takeover of a nickname from a participant might lead to private messages being sent to the wrong destination. The recipient's URI will be different from the URI associated with the original owner of the nickname, but the sender might not notice this. To avoid takeovers, the MSRP switch MUST make sure that a nickname is unique inside a chat room. Also, the security considerations for any authenticated identity mechanisms used to validate the SIP AOR apply to this document as well. Whether a nickname can be reserved if it has previously been used by another participant in the chat room is up to the policy of the chat room.

12. Contributors

This work would never have been possible without the fruitful discussions on the SIMPLE WG mailing list, especially with Brian Rosen (Neustar) and Paul Kyzivat (Cisco), who provided extensive review and improvements throughout the document.

13. Acknowledgments

The authors want to thank Eva Leppanen, Adamu Haruna, Adam Roach, Matt Lepinski, Mary Barnes, Ben Campbell, Paul Kyzivat, Adrian Georgescu, Nancy Greene, and Flemming Andreasen for providing comments.

14. References

14.1. Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[RFC3840]  Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Indicating User Agent Capabilities in the Session Initiation Protocol (SIP)", RFC 3840, August 2004.

[RFC3860]  Peterson, J., "Common Profile for Instant Messaging (CPIM)", RFC 3860, August 2004.

[RFC3862]  Klyne, G. and D. Atkins, "Common Presence and Instant Messaging (CPIM): Message Format", RFC 3862, August 2004.

[RFC4353]  Rosenberg, J., "A Framework for Conferencing with the Session Initiation Protocol (SIP)", RFC 4353, February 2006.

[RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[RFC4575]  Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session Initiation Protocol (SIP) Event Package for Conference State", RFC 4575, August 2006.

[RFC4975]  Campbell, B., Mahy, R., and C. Jennings, "The Message Session Relay Protocol (MSRP)", RFC 4975, September 2007.

[RFC4976]  Jennings, C., Mahy, R., and A. Roach, "Relay Extensions for the Message Sessions Relay Protocol (MSRP)", RFC 4976, September 2007.

[RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.

[RFC5239]  Barnes, M., Boulton, C., and O. Levin, "A Framework for Centralized Conferencing", RFC 5239, June 2008.

[RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, August 2008.

[I-D.ietf-xcon-common-data-model]  Novo, O., Camarillo, G., Morgan, D., and J. Urpalainen, "Conference Information Data Model for Centralized Conferencing (XCON)", draft-ietf-xcon-common-data-model-32 (work in progress), September 2011.

[I-D.ietf-xcon-event-package]  Camarillo, G., Srinivasan, S., Even, R., and J. Urpalainen, "Conference Event Package Data Format Extension for Centralized Conferencing (XCON)", draft-ietf-xcon-event-package-01 (work in progress), September 2008.

14.2. Informative References

[RFC2810]  Kalt, C., "Internet Relay Chat: Architecture", RFC 2810, April 2000.

[RFC3325]  Jennings, C., Peterson, J., and M. Watson, "Private Extensions to the Session Initiation Protocol (SIP) for Asserted Identity within Trusted Networks", RFC 3325, November 2002.

[RFC3966]  Schulzrinne, H., "The tel URI for Telephone Numbers", RFC 3966, December 2004.

[RFC4474]  Peterson, J. and C. Jennings, "Enhancements for Authenticated Identity Management in the Session Initiation Protocol (SIP)", RFC 4474, August 2006.

[RFC6120]  Saint-Andre, P., "Extensible Messaging and Presence Protocol (XMPP): Core", RFC 6120, March 2011.

Authors' Addresses

   Aki Niemi
   Nokia
   P.O. Box 407
   NOKIA GROUP, FIN 00045
   Finland

   Phone: +358 50 389 1644
   Email: aki.niemi@nokia.com

   Miguel A. Garcia-Martin
   Ericsson
   Calle Via de los Poblados 13
   Madrid, ES 28033
   Spain

   Email: miguel.a.garcia@ericsson.com

   Geir A. Sandbakken (editor)
   Cisco Systems
   Philip Pedersens vei 20
   N-1366 Lysaker
   Norway

   Phone: +47 67 125 125
   Email: geirsand@cisco.com
   URI: http://www.cisco.com