NFSv4 migration: Implementation Experience and Specification Issues
draft-ietf-nfsv4-migration-issues-12

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft whose latest revision state is "Expired".
Authors David Noveck, Piyush Shivam, Chuck Lever, Bill Baker
Last updated 2017-03-30 (Latest revision 2017-02-08)
RFC stream Internet Engineering Task Force (IETF)
Stream WG state WG Document
Document shepherd (None)
IESG IESG state AD is watching
Consensus boilerplate Unknown
Telechat date (None)
Responsible AD Spencer Dawkins
Send notices to (None)
NFSv4                                                     D. Noveck, Ed.
Internet-Draft                                                    NetApp
Intended status: Informational                                 P. Shivam
Expires: September 30, 2017                                     C. Lever
                                                                B. Baker
                                                                  ORACLE
                                                          March 29, 2017

  NFSv4 migration: Implementation Experience and Specification Issues
                  draft-ietf-nfsv4-migration-issues-12

Abstract

   The migration feature of NFSv4 provides for moving responsibility for
   a single filesystem from one server to another, without disruption to
   clients.  A number of problems in the specification of this feature
   in NFSv4.0 were resolved by the publication of RFC 7931.  In
   addition, there are specification issues to be resolved with regard
   to the NFSv4.1 version of this feature which are discussed in this
   document.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 30, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents

Noveck, et al.         Expires September 30, 2017               [Page 1]
Internet-Draft              nfsv4-migr-issues                 March 2017

   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  NFSv4.0 Issues and Their Resolution
     3.1.  NFSv4.0 Issues
     3.2.  Resolution of NFSv4.0 Protocol Difficulties
   4.  Issues for NFSv4.1
     4.1.  Issues to Address for NFSv4.1
       4.1.1.  Addressing state merger in NFSv4.1
       4.1.2.  Addressing pNFS relationship with migration
       4.1.3.  Addressing server_owner changes in NFSv4.1
       4.1.4.  Addressing Confirmation Status of Migrated
               Client IDs in NFSv4.1
       4.1.5.  Addressing Session Migration in NFSv4.1
     4.2.  Possible Resolutions for NFSv4.1 Issues
       4.2.1.  Server Responsibilities in Effecting Transparent
               State Migration
       4.2.2.  Determining Initial Migration Status in NFSv4.1
       4.2.3.  Client Response to Migration in NFSv4.1
       4.2.4.  Dealing with Multiple Location Entries
       4.2.5.  Client Recovery from Migration Events
       4.2.6.  The Migration Discovery Process
       4.2.7.  Synchronizing Session Transfer
       4.2.8.  Migration and pNFS
   5.  Security Considerations
   6.  IANA Considerations
   7.  Normative References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction

   This document, which deals with existing issues/problems in
   standards-track documents, is in the informational category, and
   while the facts it reports may have normative implications, any such
   normative significance reflects the readers' preferences.  For
   example, we may report that the existing definition of migration for
   NFSv4.1 does not properly describe how migrating state is to be
   merged with existing state for the destination server.  While it is
   to be expected that client and server implementers will judge this to
   be a situation that is best avoided, the judgment as to how pressing
   this issue should be considered is a judgment for the reader, and
   eventually the nfsv4 working group, to make.

   We do explore possible ways in which such issues can be avoided, with
   minimal negative effects, given that the working group has decided to
   address these issues, but the choice of exactly how to address these
   is best given effect in one or more standards-track documents and/or
   errata.

   This document focuses on NFSv4.1, since the analogous issues for
   NFSv4.0 have already been addressed by the publication of [RFC7931].
   Nevertheless, the history of these issues in NFSv4.0 is presented,
   since understanding the similarities and differences between these
   protocols may be helpful in deciding how best to address remaining
   issues.

2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   In the context of this informational document, these normative
   keywords will always occur in the context of a quotation, most often
   direct but sometimes indirect.  The context will make it clear
   whether the quotation is from:

   o  The previously current definitive definition of the NFSv4.0
      protocol [RFC7530].

   o  The current definitive definition of the NFSv4.1 protocol
      [RFC5661].

   o  A proposed or possible text to serve as a replacement for the
      current or previous definitive document text.  Sometimes, a number
      of possible alternative texts may be listed and benefits and
      detriments of each examined in turn.

3.  NFSv4.0 Issues and Their Resolution

3.1.  NFSv4.0 Issues

   Many of the problems seen with Transparent State Migration derived
   from the inability of servers to determine whether two client IDs,
   issued on different servers, corresponded to the same client.  This
   difficulty derived in turn from the common practice, recommended by
   [RFC7530], in which each client presented different client
   identification strings to different servers, rather than presenting
   the same identification string to all servers.

   This practice, later referred to as the "non-uniform" client string
   approach, derived from concern that, since NFSv4.0 provided no means
   to determine whether two IP addresses correspond to the same server,
   a single client connected to both might be confused by the fact that
   state changes made via one IP address might unexpectedly affect the
   state maintained with respect to the second IP address, thought of as
   a separate server.

   To avoid this unexpected behavior, clients used the non-uniform
   client id string approach.  By doing so, a client connected to two
   different servers (or to two IP addresses connected to the same
   server) appeared to be two different clients.  Since the server is
   under the impression that two different clients are involved, state
   changes made on each distinct IP address cannot be reflected on
   another.

   However, by doing things this way, state migrated from server to
   server cannot be referred to the actual client which generated it,
   leading to confusion.

   In addition to this core problem, the following issues with regard to
   Transparent State Migration needed to be addressed:

   o  Clarification regarding the ability to merge state from different
      leases even though their expiration times might not be precisely
      synchronized.

   o  Clarifying the treatment of client IDs since it is not always
      clear when clientid4 and when nfs_client_id4 was intended.

   o  Clarifying the logic of returning NFS4ERR_LEASE_MOVED.

   o  Clarifying the handling of NFS4ERR_CLID_INUSE.

3.2.  Resolution of NFSv4.0 Protocol Difficulties

   The client string identification issue was addressed in [RFC7931] as
   follows:

   o  Defining both the uniform and non-uniform client id string
      approaches as valid choices but indicating that the latter posed
      difficulties for Transparent State Migration.

   o  Providing a way that clients could use to determine whether two IP
      addresses are connected to the same server.

   o  Allowing clients using the uniform approach to avoid negative
      consequences due to otherwise unexpected behavior since behavior
      that is a consequence of known trunking relationships is not
      unexpected.

   o  As a result, servers migrating state are aware of the fact that
      the same client is associated with two different items of state
      even when that state was originally created on two different
      servers.

   Since all of the other issues noted in Section 3.1 were also
   addressed, publication of [RFC7931] updating [RFC7530] addressed all
   known issues with Transparent State Migration in NFSv4.0.

4.  Issues for NFSv4.1

4.1.  Issues to Address for NFSv4.1

   Because NFSv4.1 embraces the uniform client-string approach, as
   advised by section 2.4 of [RFC5661], addressing migration issues is
   simpler, in that a shift in client id string models is not required.
   Instead, NFSv4.1 returns information in the EXCHANGE_ID response to
   enable trunking relationships to be determined by the client.

   The other necessary part of addressing migration issues, providing
   for the server's merger of leases that relate to the same client, is
   not currently addressed by [RFC5661] and changes need to be made to
   make it clear that state needs to be appropriately merged as part of
   migration, to avoid multiple client IDs between a client-server pair.

   In addition, there are a number of new features within NFSv4.1 whose
   relationship with migration needs to be clarified.  Some examples:

   o  The interaction of trunking with migration and other aspects of
      multi-server namespace needs to be clarified.

   o  There needs to be some clarification of how migration, and
      particularly Transparent State Migration, should interact with
      pNFS layouts.

   o  The current discussion (in [RFC5661]) of the possibility of
      server_owner changes is incomplete and confusing.

   o  The expected confirmation status of client IDs transferred by
      Transparent State Migration needs to be clarified.

   o  There are a number of issues related to the migration of sessions
      that need to be addressed.

   Discussion of how to resolve these issues will appear in the sections
   below.

4.1.1.  Addressing state merger in NFSv4.1

   The existing treatment of state transfer in [RFC5661] has similar
   problems to that in [RFC7530] in that it assumes that the state for
   multiple filesystems formerly on different servers will not be merged
   so that it appears under a single common client ID.  We've already
   seen the reasons that this is a problem with regard to NFSv4.0.

   Although we don't have the problems stemming from the non-uniform
   client-string approach, there are a number of complexities in the
   existing treatment of state management in the section entitled "Lock
   State and File System Transitions" in [RFC5661] that make this non-
   trivial to address:

   o  Migration is currently treated together with other sorts of
      filesystem transitions including transitioning between replicas
      without any NFS4ERR_MOVED errors.

   o  There is separate handling and discussion of the cases of matching
      and non-matching server scopes.

   o  In the case of matching server scopes, the text calls for an
      unrealistic degree of transparency, suggesting that the source and
      destination servers need to cooperate in stateid assignment.

   o  In the case of non-matching server scopes, the text does not
      mention the possibility of the transparent migration of state at
      all, resulting in a functional regression from NFSv4.0.

   o  The potential interaction between migration and trunking has not
      been addressed.

   o  There is insufficient attention to the question of how clients can
      deal with the complexities of recovering from migration.  As part
      of this, the implications of lease migration notification shifting
      from an error (NFS4ERR_LEASE_MOVED in NFSv4.0) to a status bit
      (SEQ4_STATUS_LEASE_MOVED in NFSv4.1) need to be explored.

   To summarize, there is a need for an NFSv4.1 treatment of Transparent
   State Migration that is an extension of that in [RFC7931] and that
   includes appropriate handling for NFSv4.1 features such as trunking.

4.1.2.  Addressing pNFS relationship with migration

   This is made difficult because, within the pNFS framework, migration
   might mean any of several things:

   o  Transfer of the MDS, leaving DS's as they are.

      This would be minimally disruptive to those using layouts but
      would require the pNFS control protocol being used to support the
      DS being directed to a new MDS.

   o  Transfer of a DS, leaving everything else in place.

      Such a transfer can be handled without using migration at all.
      The server can recall/revoke layouts, and issue new ones, as
      appropriate.

   o  Transfer of the filesystem to a new filesystem with both MDS and
      DS's moving.

      In such a transfer, an entirely different set of DS's will be at
      the target location.  There may even be no pNFS support on the
      destination filesystem at all.

   Migration needs to support both the first and last of these models.

4.1.3.  Addressing server_owner changes in NFSv4.1

   Section 2.10.5 of [RFC5661] states the following:

      The client should be prepared for the possibility that
      eir_server_owner values may be different on subsequent EXCHANGE_ID
      requests made to the same network address, as a result of various
      sorts of reconfiguration events.  When this happens and the
      changes result in the invalidation of previously valid forms of
      trunking, the client should cease to use those forms, either by
      dropping connections or by adding sessions.  For a discussion of
      lock reclaim as it relates to such reconfiguration events, see
      Section 8.4.2.1.

   While this paragraph is literally true in that such reconfiguration
   events can happen and clients have to deal with them, it is confusing
   in that it can be read as suggesting that clients have to deal with
   them without disruption, which in general is impossible.

   A clearer alternative would be:

      It is always possible that, as a result of various sorts of
      reconfiguration events, eir_server_scope and eir_server_owner
      values may be different on subsequent EXCHANGE_ID requests made to
      the same network address.

      In most cases such reconfiguration events will be disruptive and
      indicate that an IP address formerly connected to one server is
      now connected to an entirely different one.

      Some guidelines on client handling of such situations follow:

      o  When eir_server_scope changes, the client has no assurance that
         any IDs it obtained previously (e.g., file handles) can be
         validly used on the new server, and, even if the new server
         accepts them, there is no assurance that this is not due to
         accident.  Thus it is best to treat all such state as lost/
         stale although a client may assume that the probability of
         inadvertent acceptance is low and treat this situation as
         within the next case.

      o  When eir_server_scope remains the same and
         eir_server_owner.so_major_id changes, the client can use
         filehandles it has and attempt reclaims.  It may find that
         these are now stale but if NFS4ERR_STALE is not received, it
         can proceed to reclaim its opens.

      o  When eir_server_scope and eir_server_owner.so_major_id remain
         the same, the client has to use the now-current values of
         eir_server_owner.so_minor_id in deciding on appropriate forms
         of trunking.
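
   The three guidelines above amount to a small decision procedure.
   The following is a minimal sketch of it, not text from any
   specification.  The ServerIdentity and Reaction types and the
   function name are hypothetical; only the eir_server_scope,
   so_major_id, and so_minor_id field names come from [RFC5661].

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class ServerIdentity:
    """Hypothetical holder for the identity fields of an EXCHANGE_ID reply."""
    server_scope: bytes  # eir_server_scope
    so_major_id: bytes   # eir_server_owner.so_major_id
    so_minor_id: int     # eir_server_owner.so_minor_id


class Reaction(Enum):
    TREAT_STATE_AS_LOST = "treat filehandles and state as lost/stale"
    ATTEMPT_RECLAIM = "keep filehandles and attempt reclaims"
    REEVALUATE_TRUNKING = "recompute trunking from so_minor_id"


def react_to_identity_change(old: ServerIdentity,
                             new: ServerIdentity) -> Reaction:
    """Map a change seen on one network address to a client reaction."""
    if new.server_scope != old.server_scope:
        # Previously obtained IDs carry no assurance on the new server.
        return Reaction.TREAT_STATE_AS_LOST
    if new.so_major_id != old.so_major_id:
        # Same scope: filehandles may still be usable, so try reclaims.
        return Reaction.ATTEMPT_RECLAIM
    # Same scope and major id: only the trunking decision can change.
    return Reaction.REEVALUATE_TRUNKING
```

   The sketch takes the conservative reading of the first guideline.
   A client that assumes inadvertent acceptance is unlikely could
   instead fall through to the reclaim case.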

4.1.4.  Addressing Confirmation Status of Migrated Client IDs in NFSv4.1

   When a client ID is transferred between systems as a part of
   migration, it is not always clear whether it should be considered
   confirmed or unconfirmed on the target server.  In the case in which
   an associated session is transferred together with the client ID, it
   is clear that the transferred client ID needs to be considered
   confirmed, as the existence of an associated session is incompatible
   with an unconfirmed client ID.

   The case in which a client ID is transferred without an associated
   session is less clear-cut and there needs to be a choice between two
   possibilities:

   o  Consider it unconfirmed, because of the lack of an associated
      session.  This makes it simpler for the client to determine
      whether there is an associated session transferred at the same
      time.  However, it is inconsistent with the fact that there are
      stateids which have been transferred with the client ID.

   o  Consider it confirmed, because it was confirmed on the source
      server and the transfer is not considered to have affected that.
      Although this makes it simpler for the client to determine whether
      there is an associated session transferred at the same time, an
      alternative is discussed in Section 4.1.5.

   A related issue concerns the potential use of the SEQ4_STATUS flags
   to determine whether all or some of the state present on the source
   has been transferred to the destination server.  This could be done
   using either of the alternatives above but it is more in the spirit
   of the second alternative.  One potential use of these flags is
   discussed in more detail in Section 4.2.2.

4.1.5.  Addressing Session Migration in NFSv4.1

   Some issues that need to be addressed regard the migration of
   sessions, in addition to client IDs and stateids:

   o  It needs to be made clearer how the client can deal with the
      possibility that sessions might or might not be transferred as
      part of Transparent State Migration.

   o  Rules need to be clarified regarding possible transfer of sessions
      when either the source session is being used to access other file
      systems on the source server or there is already a session
      connecting the client to the destination server.

   o  There needs to be more detail regarding how the protocol avoids
      situations in which the same session is subject to concurrent
      changes on two different servers at the same time.

4.2.  Possible Resolutions for NFSv4.1 Issues

   The subsections below explore some ways of dealing with the issues
   discussed in Section 4.1.

   First we introduce some terminology we will be using in these
   sections:

   o  Location attributes include the fs_locations and fs_locations_info
      attributes.

   o  Location entries are the individual file system locations in the
      location attributes.

   o  Location elements are derived from location entries.  If a
      location entry specifies an IP address there is only a single
      corresponding location element.  Location entries that contain a
      host name are resolved using DNS and may result in one or more
      location elements.  All location elements consist of a location
      address which is the IP address of an interface to a server and an
      fs name which is the location of the file system within the
      server's pseudo-fs.  The fs name is empty if the server has no
      pseudo-fs and only a single exported file system at the root
      filehandle.

   o  Two location elements are trunkable if they specify the same fs
      name and the location addresses are such that trunking of the
      location addresses can be used as shown by the server_owner values
      returned.
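
   The relationship between these terms can be sketched as follows.
   This is an illustration only: the class and function names are
   hypothetical, DNS resolution is supplied by the caller rather than
   performed here, and the check of whether two addresses trunk (which
   in practice would compare server_owner values obtained via
   EXCHANGE_ID) is passed in as an assumed callable.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class LocationEntry:
    server: str   # host name or literal IP address from a location attribute
    fs_name: str  # location of the file system in the server's pseudo-fs


@dataclass(frozen=True)
class LocationElement:
    address: str  # IP address of one interface to a server
    fs_name: str


def expand_entry(entry: LocationEntry,
                 resolve: Callable[[str], List[str]]) -> List[LocationElement]:
    """Derive location elements from one location entry."""
    try:
        # A literal IP address yields exactly one location element.
        ipaddress.ip_address(entry.server)
        addresses = [entry.server]
    except ValueError:
        # A host name may resolve to one or more addresses.
        addresses = resolve(entry.server)
    return [LocationElement(addr, entry.fs_name) for addr in addresses]


def trunkable(a: LocationElement, b: LocationElement,
              addresses_trunk: Callable[[str, str], bool]) -> bool:
    """Two elements are trunkable if fs names match and addresses trunk."""
    return a.fs_name == b.fs_name and addresses_trunk(a.address, b.address)
```

   Passing the resolver and the trunking check as parameters keeps the
   sketch independent of any particular DNS library or client
   implementation.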

4.2.1.  Server Responsibilities in Effecting Transparent State Migration

   The basic responsibility of the source server in effecting
   Transparent State Migration is to make available to the destination
   server a description of each piece of locking state associated with
   the file system being migrated.  In addition to client id string and
   verifier, the source server needs to provide, for each stateid:

   o  The stateid including the current sequence value.

   o  The associated client ID.

   o  The handle of the associated file.

   o  The type of the lock, such as open, byte-range lock, delegation,
      layout.

   o  For locks such as opens and byte-range locks, there will be
      information about the owner(s) of the lock.

   o  For recallable/revocable lock types, the current recall status
      needs to be included.

   o  For each lock type there will be type-specific information, such
      as share and deny modes for opens and type and byte ranges for
      byte-range locks and layouts.
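
   As a rough sketch, the per-stateid description listed above could be
   carried in a record like the one below.  The record layout, the
   field names, and the validate() checks are assumptions made for
   illustration; this document does not define a transfer format.  The
   stateid shape (a sequence value plus 12 opaque bytes) follows the
   stateid4 definition in [RFC5661].

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional, Tuple


class LockType(Enum):
    OPEN = "open"
    BYTE_RANGE = "byte-range lock"
    DELEGATION = "delegation"
    LAYOUT = "layout"


@dataclass(frozen=True)
class Stateid:
    seqid: int    # current sequence value
    other: bytes  # 12 opaque bytes, as in stateid4


@dataclass
class MigratedLockState:
    """Hypothetical description of one piece of locking state."""
    stateid: Stateid
    clientid: int                                   # associated client ID
    filehandle: bytes                               # handle of the file
    lock_type: LockType
    owners: Tuple[bytes, ...] = ()                  # for opens, byte-range locks
    recall_in_progress: Optional[bool] = None       # for recallable types
    details: Dict[str, Any] = field(default_factory=dict)  # type-specific

    def validate(self) -> None:
        """Check that the items this lock type requires are present."""
        if len(self.stateid.other) != 12:
            raise ValueError("stateid 'other' field must be 12 bytes")
        if self.lock_type in (LockType.OPEN, LockType.BYTE_RANGE):
            if not self.owners:
                raise ValueError("owner information is required")
        if self.lock_type in (LockType.DELEGATION, LockType.LAYOUT):
            if self.recall_in_progress is None:
                raise ValueError("recall status is required")
```

   The details mapping would hold items such as share and deny modes
   for opens, or type and byte ranges for byte-range locks and layouts.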

   A further server responsibility concerns locks that are revoked or
   otherwise lost during the process of file system migration.  Because
   locks that appear to be lost during the process of migration will be
   reclaimed by the client, the servers have to take steps to ensure
   that locks revoked soon before or soon after migration are not
   inadvertently allowed to be reclaimed in situations in which the
   continuity of lock possession cannot be assured.

   o  For locks lost on the source but whose loss has not yet been
      acknowledged by the client (by using FREE_STATEID), the
      destination must be aware of this loss so that it can deny a
      request to reclaim them.

   o  For locks lost on the destination after the state transfer but
      before the client's RECLAIM_COMPLETE is done, the destination
      server should note these and not allow them to be reclaimed.

   A further responsibility of the servers concerns situations in which
   a stateid cannot be transferred transparently because it conflicts
   with an existing stateid held by the client and associated with a
   different file system.  In this case there are two valid choices:

   o  Treat the transfer, as in NFSv4.0, as one without Transparent
      State Migration.  In this case, conflicting locks cannot be
      granted until the client does a RECLAIM_COMPLETE, after reclaiming
      the lock it had, with the exception of reclaims denied because
      they were attempts to reclaim locks that had been lost.

   o  Implement Transparent State Migration, except for the lock with
      the conflicting stateid.  In this case, the client will be aware
      of a lost lock (through the SEQ4_STATUS flags) and be allowed to
      reclaim it.

4.2.2.  Determining Initial Migration Status in NFSv4.1

   This section proposes a way in which a client which receives
   NFS4ERR_MOVED can determine:

   o  Whether the NFS4ERR_MOVED indicates migration has occurred, or
      whether it indicates another sort of file system transition as
      discussed in Section 4.2.4.

   o  In the case of migration, whether Transparent State Migration has
      occurred.

   o  Whether any state has been lost during the process of Transparent
      State Migration.

   o  Whether sessions have been transferred as part of Transparent
      State Migration.

   This is written assuming that the second option regarding client ID
   confirmation status after migration (as discussed in Section 4.1.4)
   is adopted.  However, that choice is not essential to the procedure
   and could be changed.

   The process begins by the client examining the location entries using
   either of the location attributes.  For those whose fs name matches
   that currently being used, an EXCHANGE_ID is directed at the location
   address and the server_owner and scope returned are used to determine
   if the entry is trunkable with that previously being used to access
   the file system (i.e., that it represents another path to the same
   file system and can share locking state with it).  If it is, then
   this should be treated as a transition from one set of paths to
   another, as described in Section 4.2.4, rather than a migration
   event.

   Otherwise, if one or more of the EXCHANGE_ID operations above has
   encountered a distinct server, then migration has occurred and the
   procedure continues.  If there were no location entries with a
   matching fs name, then one with another fs name is selected, an
   EXCHANGE_ID is done, and the procedure continues using the result of
   that operation.

   The determination of whether Transparent State Migration has occurred
   is driven by the client ID returned and its confirmation status.

   o  If the client ID is an unconfirmed client ID not previously known
      to the client, then Transparent State Migration has not occurred.

   o  If the client ID is a confirmed client ID previously known to the
      client, then any transferred state would have been merged with an
      existing client ID representing the client to the destination
      server.  In this state merger case, Transparent State Migration
      might or might not have occurred.

   o  If the client ID is a confirmed client ID not previously known to
      the client, then the client can conclude that the client ID was
      transferred as part of Transparent State Migration.  In this
      transferred client ID case, Transparent State Migration has
      occurred although some state may have been lost.
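
   The three cases above can be sketched as a small classifier.  This
   is a sketch under stated assumptions rather than a definitive
   procedure: the enum and function names are hypothetical, the
   "previously known" determination is left to the caller, and reading
   confirmation status from the EXCHGID4_FLAG_CONFIRMED_R bit of
   eir_flags is an assumption about how a client would obtain it.  The
   combination of an unconfirmed but previously known client ID is not
   covered by the list above and is reported as such.

```python
from enum import Enum

# eir_flags bit defined in RFC 5661.
EXCHGID4_FLAG_CONFIRMED_R = 0x80000000


class MigrationStatus(Enum):
    NO_TSM = "Transparent State Migration has not occurred"
    STATE_MERGER = "state merger: TSM might or might not have occurred"
    CLIENT_ID_TRANSFERRED = "client ID transferred: TSM has occurred"
    NOT_COVERED = "combination not covered by the cases listed"


def classify_client_id(eir_flags: int,
                       previously_known: bool) -> MigrationStatus:
    """Classify the client ID returned by EXCHANGE_ID at the destination."""
    confirmed = bool(eir_flags & EXCHGID4_FLAG_CONFIRMED_R)
    if confirmed and previously_known:
        return MigrationStatus.STATE_MERGER
    if confirmed:
        return MigrationStatus.CLIENT_ID_TRANSFERRED
    if not previously_known:
        return MigrationStatus.NO_TSM
    return MigrationStatus.NOT_COVERED
```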

   In the state merger case, it is possible that the server has not
   attempted Transparent State Migration, in which case state may have
   been lost without it being reflected in the SEQ4_STATUS bits.  To
   determine whether this has happened, the client can use TEST_STATEID
   to check whether the stateids created on the source server are still
   accessible on the destination server.  Once a single stateid is found
   to have been successfully transferred, the client can conclude that
   Transparent State Migration was begun and any failure to transfer
   all of the stateids will be reflected in the SEQ4_STATUS bits.

   In any of the cases in which Transparent State Migration has
   occurred, it is possible that a session was transferred as well.  To
   deal with that possibility, clients can, after doing the EXCHANGE_ID,
   issue a BIND_CONN_TO_SESSION to connect the transferred session to a
   connection to the new server.  If that fails, it is an indication
   that the session was not transferred and that a new session needs to
   be created to take its place.
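
   The session probe described above might look like the following
   sketch.  The two callables stand in for whatever the client
   implementation uses to issue BIND_CONN_TO_SESSION and
   CREATE_SESSION.  They, and the function name, are hypothetical.

```python
from typing import Callable, Tuple


def establish_session(old_sessionid: bytes,
                      bind_conn_to_session: Callable[[bytes], bool],
                      create_session: Callable[[], bytes]
                      ) -> Tuple[bytes, bool]:
    """Return (session id to use, whether the old session was transferred).

    bind_conn_to_session(sessionid) is assumed to return True on success.
    create_session() is assumed to return the id of a new session.
    """
    if bind_conn_to_session(old_sessionid):
        # The destination accepted the session: it was transferred.
        return old_sessionid, True
    # Failure indicates the session was not transferred; replace it.
    return create_session(), False
```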

4.2.3.  Client Response to Migration in NFSv4.1

   Once the client has determined the initial migration status, it needs
   to re-establish its lock state, if possible.  To enable this to
   happen without loss of the guarantees normally provided by locking,
   the destination server needs to implement a per-fs grace period in
   all cases in which lock state was lost, including those in which
   Transparent State Migration was not implemented.

   The following cases need to be dealt with:

   o  In a case in which Transparent State Migration has not occurred,
      the client can use the per-fs grace period provided by the
      destination server to reclaim locks that were held on the source
      server.

   o  In a case in which Transparent State Migration has occurred, and
      no lock state was lost (as shown by SEQ4_STATUS flags), no lock
      reclaim is necessary.

   o  In a case in which Transparent State Migration has occurred, and
      some lock state was lost (as shown by SEQ4_STATUS flags), existing
      stateids need to be checked for validity using TEST_STATEID, and
      reclaim used to re-establish any that were not transferred.

   For all of the cases above, RECLAIM_COMPLETE with an rca_one_fs value
   of true should be done before normal use of the file system including
   obtaining new locks for the file system.  This applies even if no
   locks were lost and needed to be reclaimed.
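
   A minimal sketch of the resulting client procedure follows.  The
   function name and the step descriptions are illustrative only; the
   two inputs are assumed to come from the status determination of
   Section 4.2.2 and from the SEQ4_STATUS flags.

```python
from typing import List


def recovery_steps(tsm_occurred: bool, state_lost: bool) -> List[str]:
    """List, in order, what the client does for the migrated file system."""
    steps: List[str] = []
    if not tsm_occurred:
        # No state was transferred: reclaim everything in the grace period.
        steps.append("reclaim locks held on the source server")
    elif state_lost:
        # Some state was transferred: find out which stateids survived.
        steps.append("check existing stateids with TEST_STATEID")
        steps.append("reclaim locks whose stateids were not transferred")
    # Done in every case, even when nothing needed to be reclaimed.
    steps.append("RECLAIM_COMPLETE with rca_one_fs set to true")
    return steps
```

   Note that when Transparent State Migration has occurred with no
   loss, the only step left is the RECLAIM_COMPLETE.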

4.2.4.  Dealing with Multiple Location Entries

   The possibility that more than one server address may be present in
   location attributes requires further clarification.  This is
   particularly the case, given the potential role of trunking for
   NFSv4.1, whose connection to migration needs to be clarified.

   The description of the location attributes in [RFC5661], while it
   indicates that multiple address entries in these attributes may be
   used to indicate alternate paths to the file system, does so mainly
   in the context of replication and does so without mentioning
   trunking.  The discussion of migration does not discuss the
   possibility of multiple location entries or trunking, which we will
   explore here.

   We will cover cases in which multiple addresses appear directly in
   the attributes as well as those in which the multiple addresses
   result because a single location entry is expanded into multiple
   location elements using addresses provided by DNS.

   When the set of valid location elements by which a file system may be
   accessed changes, migration need not be involved.  Some cases to
   consider:

   o  When the set of location elements expands, migration is not
      involved.  In the case in which the additional elements are not
      trunkable with ones previously being used, the new elements serve
      as additional access locations, available in case of the failure
      of server addresses being used.  When additional elements are
      trunkable with those currently being used, the client may use the
      additional addresses just as it might have if they had been
      available when use of the file system began.

      There is no current mechanism by which the client can be notified
      of a change in the set of available locations for an fs.  Given
      that the client has at least one IP address available to access
      the file system in question, periodic polling is an adequate
      mechanism for the client to find additional server addresses to
      use to access the file system.

   o  When the set of location elements contracts but none of the
      elements no longer usable were in fact being used by the client,
      then no migration is involved.  Only if the client were to start
      using one of the unavailable elements will the client be notified
      (via NFS4ERR_MOVED) of the need to not use those elements and to
      use others provided by a location attribute.
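
   As an illustration of the polling approach, the following Python
   sketch compares the location elements obtained by a new poll with
   those previously known.  The function name, the dictionary keys, and
   the representation of location elements as address strings are
   assumptions made for this example.

```python
def compare_location_sets(known, polled, in_use):
    """Compare polled location elements with those previously known.

    known:  elements returned by the previous fetch of the attribute.
    polled: elements returned by the current fetch.
    in_use: elements the client is actually using.
    """
    known, polled, in_use = set(known), set(polled), set(in_use)
    removed = known - polled
    return {
        # New elements: extra access paths, no migration involved.
        "added": polled - known,
        # Elements that disappeared from the attribute.
        "removed": removed,
        # Only these can lead to NFS4ERR_MOVED for this client.
        "in_use_removed": removed & in_use,
    }


result = compare_location_sets(
    known={"192.0.2.1", "192.0.2.2"},
    polled={"192.0.2.1", "192.0.2.3"},
    in_use={"192.0.2.1"},
)
```

   In this example one element was added and one removed, but since the
   removed element was not in use, the client sees no migration event.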

   When a specific server address being used becomes unavailable to
   service a particular file system, NFS4ERR_MOVED will be returned, and
   the client will respond based on the available locations.  Whether
   continuity of locking state will be available depends on a number of
   factors:

   o  If there are still elements in use trunkable with the element that
      has become unavailable, there will still be a continuity of
      locking state, even though Transparent State Migration per se has
      not occurred.  If the in-use addresses are session-trunkable with
      the address becoming unavailable, only one connection is lost and
      all existing sessions will remain available.  If, on the other
      hand, the in-use addresses are only clientid-trunkable with the
      address becoming unavailable, a session can be lost.  However,
      that session can be made available on those other nodes, just as
      it would have been if Transparent State Migration were in
      effect, even though no migration has occurred.

   o  Otherwise, if there are available addresses trunkable with the one
      that has become unavailable, the client has access to existing
      locking state once it establishes a connection with the new
      addresses, using a new or existing session depending on the type
      of trunking in effect.  This is also similar to the case in which
      Transparent State Migration has occurred, even though there is no
      migration, with the state remaining on the existing server.

      Note that this case, as well as the previous one, can be expected
      in the case in which the server seeks to direct traffic with
      regard to particular file systems to chosen addresses, in the
      interest of load balancing, to adjust to hardware availability
      constraints, or for other reasons.

   o  In other cases, migration has occurred and the client can use the
      procedure described in Section 4.2.2 to determine whether
      Transparent State Migration occurred and whether any locking state
      was lost during the transfer.

   One should note the following differences between migration with
   Transparent State Migration and the similar cases in which there is a
   continuity of locking state with no change in the server.

   o  When locks are lost (as indicated when using them or via the
      SEQ4_STATUS flags) and migration has not been done, they are not
      to be reclaimed.  Instead, such losses are treated as lock
      revocations and acknowledged using FREE_STATEID.

   o  When migration has not been done, there is no need for a
      RECLAIM_COMPLETE (with rca_one_fs set to true).
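
   The distinctions drawn in this section can be sketched as a
   classification of what happens when one server address becomes
   unavailable.  This Python sketch is only one possible reading of the
   text: the Trunking enumeration, the function name, and the result
   keys are all assumptions made for this example.

```python
from enum import Enum


class Trunking(Enum):
    NONE = 0       # not trunkable with the lost address
    CLIENTID = 1   # clientid-trunkable only
    SESSION = 2    # session-trunkable


def classify_address_loss(in_use, available):
    """Classify the loss of one server address for a file system.

    in_use:    strongest trunking relation between the lost address
               and any address the client is still using.
    available: the same, for listed addresses not yet in use.
    """
    if in_use is not Trunking.NONE:
        # State stays on the same server; at most a session is lost.
        session = "kept" if in_use is Trunking.SESSION else "may be lost"
        migrated = False
    elif available is not Trunking.NONE:
        # Connect to a trunkable address to regain existing state.
        session = "existing" if available is Trunking.SESSION else "new"
        migrated = False
    else:
        # Migration has occurred: use the migration recovery procedure.
        session = "determined by migration recovery"
        migrated = True
    return {
        "migrated": migrated,
        "session": session,
        # Without migration, lost locks are treated as revocations.
        "lost_locks": "reclaim" if migrated else "FREE_STATEID",
        "reclaim_complete_needed": migrated,
    }
```

   Only the last branch involves migration, and only in that branch do
   lock reclaim and RECLAIM_COMPLETE apply.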

4.2.5.  Client Recovery from Migration Events

   When a file system is migrated, there are a number of
   migration-related status indications with which clients need to deal:

   o  If an attempt is made to use or return a filehandle within a file
      system that has been migrated away from the server on which it was
      previously available, the error NFS4ERR_MOVED is returned.

      This condition continues on subsequent attempts to access the file
      system in question.  The only way the client can avoid the error
      is to cease accessing the filesystem in question at its old server
      location and access it instead on the server to which it has been
      migrated.

   o  Whenever a SEQUENCE operation is sent by a client to a server
      which generated state held on that client which is associated with
      a file system that has been migrated away from the server on which
      it was previously available, the status bit
      SEQ4_STATUS_LEASE_MOVED is set in the response.

      This condition continues until the client acknowledges the
      notification by fetching a location attribute for the migrated
      file system.  When there are multiple migrated file systems, a
      location attribute for each such migrated file system needs to be
      fetched, in order to clear the condition.  Even after the
      condition is cleared, the client needs to respond by using the
      location information to access the destination server to ensure
      that leases are not needlessly expired.

   Unlike the case of NFSv4.0 in which the corresponding conditions are
   both errors, in NFSv4.1 the client can, and often will, receive both
   indications on the same request.  As a result, the question of how to
   co-ordinate the necessary recovery actions when both indications
   arrive simultaneously must be resolved.  It should be noted that when
   the server decides whether SEQ4_STATUS_LEASE_MOVED is to be set, it
   has no way of knowing which file system will be referenced or whether
   NFS4ERR_MOVED will be returned.

   While it is true that, when only a single migrated file system is
   involved, a single set of actions will clear both indications, the
   possibility of multiple migrated file systems calls for an approach
   in which there are separate recovery actions for each indication.  In
   general, the response to neither indication can be subsumed within
   the other since:

   o  If the client were to respond only to the MOVED indication, there
      would be no effective client response to a situation in which a
      file system was not being actively accessed at the time migration
      occurred.  As a result, leases on the destination server might be
      needlessly expired.

   o  If the client were to respond only to the LEASE_MOVED indication,
      recovery for migrated file systems in active use could be deferred
      in order to accomplish recovery for others not being actively
      accessed.  The consequences of this choice can pose particular
      problems when there are a large number of file systems supported
      by a particular server, or when it happens that some servers,
      after receiving migrated file systems have periods of
      unavailability, such as occur as a result of server reboot.  This
      can result in recovery for actively accessed migrated file systems
      being unnecessarily delayed for long periods of time.

   Similar considerations apply to other arrangements in which one of
   the indications, while not ignored per se, is subsumed within a
   single recovery process focused on recovery for the other indication.

   Generally speaking, client recovery for these indications should have
   the following characteristics:

   o  All instances of the MOVED indication should be dealt with
      promptly, either by doing the necessary recovery directly,
      providing that it be done asynchronously, or ensuring that it is
      already under way.

   o  All instances of the LEASE_MOVED indication should be dealt with
      asynchronously, in a migration discovery thread whose job is to
      clear that indication by fetching the appropriate location
      attribute.  Because this thread will only be fetching a location
      attribute and the fs_status attribute for the file systems
      referenced by the client, it cannot receive MOVED indications.
      Some useful guidance regarding possible implementation of the
      migration discovery thread can be found in Section 4.2.6.

   o  When a migration discovery thread happens upon a migrated file
      system (i.e. not present and not a referral), the thread is likely
      to have cleared one (out of an unknown number) of file systems
      whose migration needs to be responded to.  The discovery thread
      needs to schedule the appropriate migration recovery (as described
      in Section 4.2.3).  This is necessary to ensure that migrated file
      systems will be referenced on the destination server in order to
      avoid lease expiration.

      For many of the migrated file systems discovered in this way, the
      client has not received any MOVED indication.  In such cases,
      lease recovery needs to be scheduled but it should not interfere
      with continuation of the migration discovery function.

   o  When a migration discovery thread receives a LEASE_MOVED
      indication, it takes no special action but continues its normal
      operation.  On the other hand, if a LEASE_MOVED indication is not
      received, it indicates that the thread has completed its work
      successfully.
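
   The separation of the two recovery paths might be sketched as
   follows.  The numeric values are those assigned to NFS4ERR_MOVED and
   SEQ4_STATUS_LEASE_MOVED in the protocol's XDR definitions; the
   function name, its parameters, and the action labels are assumptions
   made for this example.

```python
NFS4ERR_MOVED = 10019                  # error code from the NFSv4 XDR
SEQ4_STATUS_LEASE_MOVED = 0x00000080   # SEQUENCE status flag [RFC5661]


def actions_for_reply(op_status, sr_status_flags, in_discovery_thread=False):
    """Return the recovery actions triggered by one COMPOUND reply.

    Both indications may arrive on the same reply, and each is handled
    by its own action rather than being subsumed within the other.
    """
    actions = []
    if op_status == NFS4ERR_MOVED:
        # Prompt recovery for the file system actually being accessed.
        actions.append("recover_accessed_fs")
    if sr_status_flags & SEQ4_STATUS_LEASE_MOVED and not in_discovery_thread:
        # Asynchronous: make sure a migration discovery thread is
        # running for the server that returned the indication.
        actions.append("ensure_discovery_thread")
    return actions
```

   A reply carrying both indications yields both actions, while a
   LEASE_MOVED indication seen by the discovery thread itself yields
   none, since continuing that thread's work addresses it.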

4.2.6.  The Migration Discovery Process

   As noted above, LEASE_MOVED indications are best dealt with in a
   migration discovery thread.  Because of this structure,

   o  No action needs to be taken for such indications received by the
      migration discovery thread, since continuation of that thread's
      work will address the issue.

   o  For such indications received in other contexts, the generally
      appropriate response is to initiate or otherwise provide for the
      execution of a migration discovery thread for file systems
      associated with the server IP address returning the indication.

   o  In all cases in which the appropriate migration discovery thread
      is running, nothing further need be done to respond to LEASE_MOVED
      indications.

   This leaves a potential difficulty in situations in which the
   migration discovery thread is near to completion but is still
   operating.  One should not ignore a LEASE_MOVED indication if the
   discovery thread is not able to respond to a migrated file system
   without additional aid.  A further difficulty in addressing such
   situations is that a LEASE_MOVED indication may reflect the server's
   state at the time the SEQUENCE operation was processed, which may be
   different from that in effect at the time the response is received.

   A useful approach to this issue involves the use of separate
   externally-visible discovery thread states representing non-
   operation, normal operation, and completion/verification of migration
   discovery processing.

   Within that framework, discovery thread processing would proceed as
   follows.

   o  While in the normal-operation state, the thread would fetch, for
      successive file systems known to the client on the server being
      worked on, a location attribute plus the fs_status attribute.

   o  If the fs_status attribute indicates that the file system is a
      migrated one (i.e. fss_absent is true and fss_type !=
      STATUS4_REFERRAL), then it is likely that the fetch of the
      location attribute has cleared one of the file systems
      contributing to the LEASE_MOVED indication.

   o  In cases in which that happened, the thread cannot know whether
      the LEASE_MOVED indication has been cleared and so it enters the
      completion/verification state and proceeds to issue a COMPOUND to
      see if the LEASE_MOVED indication has been cleared.

   o  When the discovery thread is in the completion/verification state,
      if others get a LEASE_MOVED indication they note this fact and it
      is used when the request completes, as described below.

   When the request used in the completion/verification state completes:

   o  If a LEASE_MOVED indication is returned, the discovery thread
      resumes its normal work.

   o  Otherwise, if there is any record that other requests saw a
      LEASE_MOVED indication, that record is cleared and the
      verification request retried.  The discovery thread remains in
      completion/verification state.

   o  If there has been no LEASE_MOVED indication, the work of the
      discovery thread is considered completed and it enters the non-
      operating state.
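
   The three-state approach described above can be sketched as a small
   state machine.  This is a minimal Python sketch under stated
   assumptions: the class, method, and attribute names are invented for
   this example, and locking between threads is omitted for brevity.

```python
from enum import Enum


class DiscoveryState(Enum):
    NON_OPERATING = "non-operating"
    NORMAL = "normal-operation"
    VERIFYING = "completion/verification"


class MigrationDiscovery:
    def __init__(self):
        self.state = DiscoveryState.NON_OPERATING
        self.others_saw_lease_moved = False

    def on_lease_moved_elsewhere(self):
        """LEASE_MOVED seen by a request outside the discovery thread."""
        if self.state is DiscoveryState.NON_OPERATING:
            self.state = DiscoveryState.NORMAL      # start discovery
        elif self.state is DiscoveryState.VERIFYING:
            self.others_saw_lease_moved = True      # note for later
        # In the normal-operation state nothing further is needed.

    def on_fs_checked(self, fss_absent, is_referral):
        """Result of fetching location and fs_status for one fs."""
        if (self.state is DiscoveryState.NORMAL
                and fss_absent and not is_referral):
            # A migrated fs was found; check whether LEASE_MOVED cleared.
            self.others_saw_lease_moved = False
            self.state = DiscoveryState.VERIFYING

    def on_verification_reply(self, lease_moved):
        """Handle the verification reply; True means retry it."""
        if self.state is not DiscoveryState.VERIFYING:
            return False
        if lease_moved:
            self.state = DiscoveryState.NORMAL      # more work remains
            return False
        if self.others_saw_lease_moved:
            self.others_saw_lease_moved = False     # clear the record
            return True                             # and verify again
        self.state = DiscoveryState.NON_OPERATING   # discovery is done
        return False
```

   The record of indications seen by other requests is what guards
   against a LEASE_MOVED indication that reflects server state older
   than the verification reply.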

4.2.7.  Synchronizing Session Transfer

   When transferring state between the source and destination, the
   issues discussed in Section 7.2 of [RFC7931] must still be attended
   to.  In this case, the use of NFS4ERR_DELAY is still necessary in
   NFSv4.1, as it was in NFSv4.0, to prevent locking state changing
   while it is being transferred.

   There are a number of important differences in the NFSv4.1 context:

   o  The absence of RELEASE_LOCKOWNER means that the one case in which
      an operation could not be deferred by use of NFS4ERR_DELAY no
      longer exists.

   o  Sequencing of operations is no longer done using owner-based
      operation sequence numbers.  Instead, sequencing is
      session-based.

   As a result, when sessions are not transferred, the techniques
   discussed in [RFC7931] are adequate and will not be further
   discussed.

   When sessions are transferred, there are a number of issues that pose
   challenges, since:

   o  A single session may be used to access multiple file systems, not
      all of which are being transferred.

   o  Requests made on a session, even if rejected, may affect the state
      of the session by advancing the sequence number associated with
      the slot used.

   As a result, when the file system state might otherwise be
   considered unmodifiable, the client might have any number of
   in-flight requests, each of which is capable of changing session
   state.  These requests may be of a number of types:

   1.  Those requests that were processed on the migrating file system,
       before migration began.

   2.  Those requests which got the error NFS4ERR_DELAY because the file
       system being accessed was in the process of being migrated.

   3.  Those requests which got the error NFS4ERR_MOVED because the file
       system being accessed had been migrated.

   4.  Those requests that accessed the migrating file system, in order
       to obtain location or status information.

   5.  Those requests that did not reference the migrating file system.

   It should be noted that the history of any particular slot is likely
   to include a number of these request classes.  In the case in which a
   session which is migrated is used by filesystems other than the one
   migrated, requests of class 5 may be common and be the last request
   processed, for many slots.

   Since session state can change even after the locking state has been
   fixed as part of the migration process, the session state known to
   the client could be different from that on the destination server,
   which necessarily reflects the session state on the source server, at
   an earlier time.  In deciding how to deal with this situation, it is
   helpful to distinguish between two sorts of behavioral consequences
   of the choice of initial sequence ID values.

   o  The error NFS4ERR_SEQ_MISORDERED is returned when the sequence ID
      in a request is neither equal to the last one seen for the current
      slot nor the next greater one.

      In view of the difficulty of arriving at a mutually acceptable
      value for the correct last sequence at the point of migration, it
      may be necessary for the server to show some degree of
      forbearance, when the sequence ID is one that would be considered
      unacceptable if session migration were not involved.

   o  Returning the cached reply for a previously executed request when
      the sequence ID in the request matches the last value recorded for
      the slot.

      In the cases in which an error is returned and there is no
      possibility of any non-idempotent operation having been executed,
      it may not be necessary to adhere to this as strictly as might be
      proper if session migration were not involved.  For example, the
      fact that the error NFS4ERR_DELAY was returned may not assist the
      client in any material way, while the fact that NFS4ERR_MOVED was
      returned by the source server may not be relevant when the request
      was reissued, directed to the destination server.

   One part of adapting to these sorts of issues would be to suspend
   enforcement of normal slot sequence semantics until the client
   itself, by issuing a request using a particular slot on the
   destination server, establishes the new starting sequence for that
   slot on the migrated session.

   An important issue is that the specification needs to take note of
   all potential COMPOUNDs, even if they might be unlikely in practice.
   For example, a COMPOUND is allowed to access multiple file systems
   and might perform non-idempotent operations in some of them before
   accessing a file system being migrated.  Also, a COMPOUND may return
   considerable data in the response, before being rejected with
   NFS4ERR_DELAY or NFS4ERR_MOVED, and may in addition be marked as
   sa_cachethis.

   Some possibilities that need to be considered to address the issues:

   o  Do not enforce any sequencing semantics for a particular slot
      until the client has established the starting sequence for that
      slot on the destination server.

   o  For each slot, do not return a cached reply returning
      NFS4ERR_DELAY or NFS4ERR_MOVED until the client has established
      the starting sequence for that slot on the destination server.

   o  Until the client has established the starting sequence for a
      particular slot on the destination server, do not report
      NFS4ERR_SEQ_MISORDERED or return a cached reply returning
      NFS4ERR_DELAY or NFS4ERR_MOVED, where the reply consists solely of
      a series of operations where the response is NFS4_OK until the
      final error.
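
   To make the first of these possibilities concrete, the following
   Python sketch models a single slot of a migrated session on the
   destination server.  It is a sketch under stated assumptions, not a
   proposed implementation: the class name, its attributes, and the use
   of a callable to stand for request execution are all invented for
   this example.

```python
SEQ_MOD = 2 ** 32  # slot sequence IDs are 32-bit values and wrap around


class MigratedSlot:
    def __init__(self, transferred_seqid, cached_reply=None):
        # State as transferred from the source server; it may be older
        # than the client's own view of the slot.
        self.last_seqid = transferred_seqid
        self.cached_reply = cached_reply
        self.established = False  # client has not yet used the slot here

    def handle(self, seqid, execute):
        """Process one request; `execute` is a callable giving the reply."""
        if self.established:
            # Normal slot semantics apply once the client has set the
            # starting sequence on the destination server.
            if seqid == self.last_seqid:
                return self.cached_reply
            if seqid != (self.last_seqid + 1) % SEQ_MOD:
                return "NFS4ERR_SEQ_MISORDERED"
        # Either the normal next request or the first request on this
        # slot after migration, whose sequence ID is accepted as given.
        reply = execute()
        self.last_seqid = seqid
        self.cached_reply = reply
        self.established = True
        return reply
```

   Because the first request on each slot is executed regardless of its
   sequence ID, this variant gives up replay detection for that request,
   which is one reason the narrower alternatives listed above also need
   to be considered.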

4.2.8.  Migration and pNFS

   When pNFS is involved, migration is capable of supporting:

   o  Migration of the MDS, leaving DS's in place.

   o  Migration of the file system as a whole, including the MDS and
      associated DS's.

   o  Replacement of one DS by another.

   o  Migration of a pNFS file system to one in which pNFS is not used.

   o  Migration of a file system not using pNFS to one in which layouts
      are available.

   Migration of the MDS function is directly supported by Transparent
   State Migration.  Layout state will normally be transparently
   transferred, just as other state is.  As a result, Transparent State
   Migration provides a framework in which, given appropriate inter-MDS
   data transfer, one MDS can be substituted for another.

   Migration of the file system function can be accomplished by
   recalling all layouts as part of the initial phase of the migration
   process.  As a result, IO will be done through the MDS during the
   migration process, and new layouts can be granted once the client is
   interacting with the new MDS.  An MDS can also effect this sort of
   transition by revoking all layouts as part of Transparent State
   Migration, as long as the client is notified about the loss of state.

   In order to allow migration to a file system on which pNFS is not
   supported, clients need to be prepared for a situation in which
   layouts are not available or supported on the destination file system
   and be prepared to direct IO requests to the destination server,
   rather than depending on layouts being available.

   Replacement of one DS by another is not addressed by migration as
   such but can be effected by an MDS recalling layouts for the DS to be
   replaced and issuing new ones to be served by the successor DS.

   Migration may transfer a file system from a server which does not
   support pNFS to one which does.  In order to properly adapt to this
   situation, clients which support pNFS, but function adequately in its
   absence, should check for pNFS support when a file system is migrated
   and be prepared to use pNFS when support is available.
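
   The client-side check implied by the last two cases might look like
   the following Python sketch.  The function name and return labels are
   assumptions made for this example, and the list of layout types is
   assumed to have been obtained from the destination file system (for
   instance, from the fs_layout_type attribute of [RFC5661]).

```python
def choose_io_path(client_supports_pnfs, dest_layout_types):
    """Pick the IO path to use after a file system has been migrated.

    dest_layout_types: layout types offered by the destination file
    system; empty if pNFS is not supported there.
    """
    if client_supports_pnfs and dest_layout_types:
        # Layouts are available on the destination: pNFS can be used,
        # even if the source file system did not support it.
        return "use_layouts"
    # Otherwise IO requests are directed to the destination server.
    return "io_through_server"
```

   The check is repeated at each migration rather than cached, since
   pNFS support may either appear or disappear when the file system
   moves.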

5.  Security Considerations

   With regard to NFSv4.0, the Security Considerations section of
   [RFC7530] encourages clients to protect the integrity of the SECINFO
   operation and any GETATTR operation for the fs_locations attribute.
   A needed change is to include the operations SETCLIENTID/
   SETCLIENTID_CONFIRM among those for which integrity protection is
   recommended.  A migration recovery event can use any or all of these
   operations.

   With regard to NFSv4.1, the Security Considerations section of
   [RFC5661] takes proper care of migration-related issues.  No change
   is needed.

6.  IANA Considerations

   This document does not require actions by IANA.

7.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <http://www.rfc-editor.org/info/rfc2119>.

   [RFC5661]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
              "Network File System (NFS) Version 4 Minor Version 1
              Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010,
              <http://www.rfc-editor.org/info/rfc5661>.

   [RFC7530]  Haynes, T., Ed. and D. Noveck, Ed., "Network File System
              (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
              March 2015, <http://www.rfc-editor.org/info/rfc7530>.

   [RFC7931]  Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker,
              "NFSv4.0 Migration: Specification Update", RFC 7931,
              DOI 10.17487/RFC7931, July 2016,
              <http://www.rfc-editor.org/info/rfc7931>.

Appendix A.  Acknowledgements

   The editor and authors of this document gratefully acknowledge the
   contributions of Trond Myklebust of NetApp and Robert Thurlow of
   Oracle.  We also thank Tom Haynes of Primary Data and Spencer Shepler
   of Microsoft for their guidance and suggestions.

   Special thanks go to members of the Oracle Solaris NFS team,
   especially Rick Mesta and James Wahlig, for their work implementing
   an NFSv4.0 migration prototype and identifying many of the issues
   documented here.

Authors' Addresses

   David Noveck (editor)
   NetApp
   26 Locust Avenue
   Lexington, MA  02421
   US

   Phone: +1 781 572 8038
   Email: davenoveck@gmail.com

   Piyush Shivam
   Oracle Corporation
   5300 Riata Park Ct.
   Austin, TX  78727
   US

   Phone: +1 512 401 1019
   Email: piyush.shivam@oracle.com

   Charles Lever
   Oracle Corporation
   1015 Granger Avenue
   Ann Arbor, MI  48104
   US

   Phone: +1 248 614 5091
   Email: chuck.lever@oracle.com

   Bill Baker
   Oracle Corporation
   5300 Riata Park Ct.
   Austin, TX  78727
   US

   Phone: +1 512 401 1081
   Email: bill.baker@oracle.com