Draft RSVP Reservation Aggregation June 1999
Aggregation of RSVP for IPv4 and IPv6 Reservations
draft-baker-rsvp-aggregation-01.txt
This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC 2026. Internet Drafts
are working documents of the Internet Engineering Task Force
(IETF), its Areas, and its Working Groups. Note that other
groups may also distribute working documents as Internet
Drafts.
Internet Drafts are valid for a maximum of six months and may
be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet Drafts as reference
material or to cite them other than as a "work in progress".
Comments should be made to the authors and the rsvp@isi.edu
list.
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
A key problem in the design of RSVP version 1 is, as noted in
its applicability statement, that it lacks facilities for
aggregation of individual reserved sessions into a common
class. The use of such aggregation is required for
scalability.
This document describes the use of a single RSVP reservation
to aggregate other RSVP reservations across a transit routing
region, in a manner conceptually similar to the use of Virtual
Paths in an ATM network. It proposes a way to dynamically
create the aggregate reservation, classify the traffic for
which the aggregate reservation applies, determine how much
bandwidth is needed to achieve the requirement, and recover
the bandwidth when the sub-reservations are no longer
required. It also contains recommendations concerning
algorithms and policies for predictive reservations.
Baker et al. Expiration: December 1999 [Page 1]
1. Introduction
A key problem in the design of RSVP version 1 [RSVP] is, as
noted in its applicability statement, that it lacks facilities
for aggregation of individual reserved sessions into a common
class. The use of such aggregation is recommended in [CSZ],
and required for scalability.
The problem of aggregation may be addressed in a variety of
ways. For example, it may sometimes be sufficient simply to
mark reserved traffic with a suitable DSCP (e.g. EF), thus
enabling aggregation of scheduling and classification state.
It may also be desirable to install one or more aggregate
reservations from ingress to egress of an "aggregation region"
(defined below) where each aggregate reservation carries
similarly marked packets from a large number of flows. This is
to provide high levels of assurance that the end-to-end
requirements of reserved flows will be met, while at the same
time enabling reservation state to be aggregated.
Throughout, we will talk about "Aggregator" and
"Deaggregator", referring to the routers at the ingress and
egress edges of an aggregation region. Exactly how a router
determines whether it should perform the role of aggregator or
deaggregator is described below.
We will refer to the individual reserved sessions (the
sessions we are attempting to aggregate) as "end-to-end"
reservations ("E2E" for short), and to their respective
Path/Resv messages as E2E Path/Resv messages. We refer to the
larger reservation (that which represents many E2E
reservations) as an "aggregate" reservation, and its
respective Path/Resv messages as "aggregate Path/Resv
messages".
1.1. Problem Statement: Aggregation Of E2E Reservations
The problem of many small reservations has been extensively
discussed, and may be summarized in the observation that each
reservation requires a non-trivial amount of message exchange,
computation, and memory resources in each router along the
way. It is desirable to reduce this to a more manageable
level where the load is heaviest and aggregation is possible.
Aggregation, however, brings its own challenges. In
particular, it reduces the level of isolation between
individual flows, implying that one flow may suffer delay from
the bursts of another. Synchronization of bursts from
different flows may occur. However, there is evidence [CSZ] to
suggest that aggregation of flows has no negative effect on
the mean delay of the flows, and actually leads to a reduction
of delay in the "tail" of the delay distribution (e.g. the
99th percentile delay) for the flows. These benefits of
aggregation to some extent offset the loss of strict
isolation.
1.2. Proposed Solution
The solution we propose involves the aggregation of several
E2E reservations that cross an "aggregation region" and share
common ingress and egress routers into one larger reservation
from ingress to egress. We define an "aggregation region" as a
contiguous set of systems capable of performing RSVP
aggregation (as defined following) along any possible route
through this contiguous set.
Communication interfaces fall into two categories with respect
to an aggregation region; they are "exterior" to an
aggregation region, or they are "interior" to it. Routers that
have at least one interface in the region fall into three
categories with respect to a given RSVP session; they
aggregate, they deaggregate, or they are between an aggregator
and a deaggregator.
Aggregation depends on being able to hide E2E RSVP messages
from routers inside the aggregation region. To achieve this
end, the IP Protocol Number in the E2E reservation's PATH,
PATH-TEAR, and RESV-CONFIRM messages is changed from RSVP (46)
to RSVP-E2E-IGNORE (a new value, to be assigned) upon entering
the aggregation region, and restored to RSVP at the
deaggregator point. These messages are ignored (no state is
stored and the message is forwarded as a normal IP datagram)
by each router within the aggregation region whenever they are
forwarded to an inside interface. Since the deaggregating
router perceives the previous hop on such messages to be the
aggregating router, RESV and other messages do not require
this modification; they are unicast from system to system
anyway.
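
As an illustration only, the protocol number swap can be
sketched in Python as follows. The Datagram type and the
function names are hypothetical, and the value 253 is merely a
stand-in for RSVP-E2E-IGNORE, which this document leaves to be
assigned.

```python
from dataclasses import dataclass, replace

IPPROTO_RSVP = 46        # RSVP's IP Protocol Number, as stated above
RSVP_E2E_IGNORE = 253    # PLACEHOLDER: the real value is not yet assigned

# E2E message types whose protocol number is rewritten in the region
SWAPPED_TYPES = {"PATH", "PATH-TEAR", "RESV-CONFIRM"}


@dataclass(frozen=True)
class Datagram:
    """Hypothetical, minimal model of an IP datagram carrying RSVP."""
    protocol: int
    msg_type: str


def on_enter_region(d: Datagram) -> Datagram:
    """Aggregator: hide an E2E message from interior routers."""
    if d.protocol == IPPROTO_RSVP and d.msg_type in SWAPPED_TYPES:
        return replace(d, protocol=RSVP_E2E_IGNORE)
    return d  # RESV and other messages are left untouched


def on_leave_region(d: Datagram) -> Datagram:
    """Deaggregator: restore the normal RSVP protocol number."""
    if d.protocol == RSVP_E2E_IGNORE:
        return replace(d, protocol=IPPROTO_RSVP)
    return d
```

Interior routers need no function of their own in this sketch:
they forward anything carrying RSVP_E2E_IGNORE unchanged.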
The token buckets (SENDER_TSPECs and FLOWSPECs) of E2E
reservations are summed into the corresponding information
elements in aggregate PATH and RESV messages. Aggregate PATH
messages are sent from the aggregator to the deaggregator(s)
using RSVP's normal IP Protocol Number. Aggregate RESV
messages are sent back from the deaggregator to the
aggregator, thus establishing an aggregate reservation on
behalf of the set of E2E flows that use this aggregator and
deaggregator. There may be several such aggregate reservations
between the same two routers, representing different classes
of traffic; the aggregate reservation is therefore for the
traffic marked with a particular DSCP Group.
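
The summation of token buckets might be sketched as below.
This is a simplified model, not a definition: the TokenBucket
type is hypothetical and carries only a rate and a depth,
whereas real SENDER_TSPEC and FLOWSPEC objects carry further
parameters that are omitted here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TokenBucket:
    """Hypothetical two-parameter token bucket."""
    rate: float    # token rate, bytes per second
    depth: float   # bucket depth (burst size), bytes


def aggregate(buckets: Iterable[TokenBucket]) -> TokenBucket:
    """Sum E2E token buckets into one aggregate token bucket."""
    items = list(buckets)
    return TokenBucket(
        rate=sum(b.rate for b in items),
        depth=sum(b.depth for b in items),
    )
```

For example, flows of (1000, 500) and (2500, 1500) would be
carried by an aggregate of (3500, 2000).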
1.3. Definitions
We define an "aggregation region" as a set of RSVP-capable
routers for which E2E RSVP messages arriving on an outside
interface of one router in the set would traverse one or more
inside interfaces (of this and/or other routers in the set)
before finally traversing an outside interface.
Such an E2E RSVP message is said to have crossed the
aggregation region.
We define the "aggregating" router for this E2E flow as the
first router that processes the E2E Path message as it enters
the aggregation region (i.e., the one which forwards the
message from an outside interface to an inside interface).
We define the "deaggregating" router for this E2E flow as the
last router to process the E2E Path as it leaves the
aggregation region (i.e., the one which forwards the message
from an inside interface to an outside interface).
We define an "interior" router for this E2E flow as any router
in the aggregation region which receives this message on an
inside interface and forwards it to another inside interface.
Interior routers perform neither aggregation nor deaggregation
for this flow.
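
These definitions reduce to a decision on the pair of
interfaces an E2E Path message uses at a router. A minimal
sketch follows; the Role enumeration and function name are
illustrative only, and the outside-to-outside case is included
just for completeness.

```python
from enum import Enum


class Role(Enum):
    AGGREGATOR = "aggregator"
    DEAGGREGATOR = "deaggregator"
    INTERIOR = "interior"
    NONE = "none"   # message never touches an inside interface here


def role_for_path(arrived_inside: bool, forwarded_inside: bool) -> Role:
    """Classify a router's role for one E2E Path message."""
    if not arrived_inside and forwarded_inside:
        return Role.AGGREGATOR       # outside -> inside
    if arrived_inside and not forwarded_inside:
        return Role.DEAGGREGATOR     # inside -> outside
    if arrived_inside and forwarded_inside:
        return Role.INTERIOR         # inside -> inside
    return Role.NONE                 # outside -> outside
```

Note that the role is per flow: the same router may aggregate
one flow and deaggregate another.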
1.4. Detailed Aspects of Proposed Solution
A number of issues jump to mind in considering this model.
1.4.1. Traffic Classification Within The Aggregation Region
One of the reasons that RSVP Version 1 did not identify a way
to aggregate sessions was that there was not a clear way to
classify the aggregate. With the development of the
Differentiated Services architecture, this is at least
partially resolved; traffic of a particular class can be
marked with a given DSCP and so classified. We presume this
model.
We presume that on each link en route, a queue, WDM color, or
similar management component is set aside for all aggregated
traffic of the same class, and that sufficient bandwidth is
made available to carry the traffic that has been assigned to
it. This bandwidth may be adjusted based on the total amount
of aggregated reservation traffic assigned to the same class.
There are numerous options for exactly which Diff-serv PHBs
might be used for different classes of traffic as it crosses
the aggregation region. This is the "service mapping" problem
described in [ISDS], and is applicable to situations broader
than those described in this document. Arguments can be made
for using either EF or one or more AF PHBs for aggregated
traffic.
Independent of which PHB is used, care needs to be taken in an
environment where provisioned Diff-Serv and aggregated RSVP
are used in the same network, to ensure that the total offered
load for a single PHB does not exceed the link capacity
allocated to that PHB. One solution to this is to reserve one
of the four AF classes strictly for the aggregated reservation
traffic while using other AF classes for provisioned Diff-
Serv.
Therefore, while some RSVP reservation state per aggregate
reservation is maintained inside the aggregation region, a
single classification and scheduling state (e.g., a DSCP used
for classifying traffic) is maintained per aggregate
reservation class (rather than per aggregate reservation)
inside the aggregation region. For example, if Guaranteed
Service is represented by the EF DSCP throughout the
aggregation region, there may be a reservation for each
aggregator/deaggregator pair in each router, but only the EF
DSCP need be inspected at each inside interface.
1.4.2. Deaggregator Determination
The first question is "How do we know which aggregate
reservation a particular E2E flow should aggregate into?" To
know that, we must know three things: its aggregating router,
its deaggregating router, and (assuming DSCPs are used to
differentiate among various reservations between the same two
routers), the relevant DSCP.
The aggregator is trivial: we know that an E2E flow has
arrived at an aggregator when its PATH message arrives at a
router on an outside interface and must be forwarded on an
inside interface.
The DSCP is equally easy, or at least it is in concept. The
DSCP is chosen for an aggregate reservation based on locally
configured policy, perhaps taking into account such factors as
the intserv service class requested for the flow. (Some
details in the exact point at which the DSCP can be determined
are discussed below.)
The deaggregator is more involved. If an SPF routing protocol,
such as OSPF or IS-IS, is in use, and if it has been extended
to advertise information on Deaggregation roles, it can tell
us the set of routers from which the deaggregator will be
chosen. In principle, if the aggregator and deaggregator are
in the same area, then the identity of the deaggregator could
be determined from the link state database. However, this
approach would not work in multi-area environments or for
distance vector protocols.
One method for Deaggregator determination is manual
configuration. With this method the network operator would
configure the Aggregator and the Deaggregator with the
necessary information.
Another method allows automatic Deaggregator determination and
corresponding Aggregator notification. When the E2E RSVP PATH
message transits from an inside interface to an outside
interface, the deaggregating router must advise the
aggregating router of the correlation between itself and the
flow. This has the nice attribute of not being specific to the
routing protocol. It also has the property of automatically
adjusting to route changes. For instance, if because of a
topology change, another Deaggregator is now on the shortest
path, this method will automatically identify the new
Deaggregator and swap to it.
1.4.3. Size of Aggregate Reservations
A range of options exists for determining the size of the
aggregate reservation, presenting a tradeoff between
simplicity and scalability. Simplistically, the size of the
aggregate reservation needs to be greater than or equal to the
sum of the bandwidth of the E2E reservations it aggregates,
and its burst capacity must be greater than or equal to the
sum of their burst capacities. However, if followed
religiously, this leads us to change the bandwidth of the
aggregate reservation each time an underlying E2E reservation
changes, which loses one of the key benefits of aggregation,
the reduction of message processing cost in the aggregation
region.
We assume, therefore, that some policy governs the size of
the aggregate reservation. That policy is not defined in this
specification, although sample policies with the necessary
characteristics are suggested. This policy
maintains the amount of bandwidth on a given aggregate
reservation at an amount greater than or equal to the sum of
the bandwidths of its underlying E2E reservations, while
endeavoring to change it infrequently. This may require some
level of trend analysis. If there is a significant probability
that in the next interval of time the current aggregate
reservation will be exhausted, the router must predict the
necessary bandwidth and request it. If the router has a
significant amount of bandwidth reserved but has very little
probability of using it, the policy may be to predict the
amount of bandwidth required and release the excess.
This policy is likely to benefit from introduction of some
hysteresis (i.e. ensure that the trigger condition for
aggregate reservation size increase is sufficiently different
from the trigger condition for aggregate reservation size
decrease) to avoid oscillation in stable conditions.
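
One way to picture such hysteresis is the sketch below. It is
not a recommended policy: it substitutes simple utilization
thresholds for the trend analysis discussed above, and the
parameter names and default values (grow_at, shrink_at,
target) are assumptions chosen for illustration.

```python
def resize(reserved: float, in_use: float, *,
           grow_at: float = 0.9,
           shrink_at: float = 0.5,
           target: float = 0.7) -> float:
    """Return the new size of the aggregate reservation.

    reserved: bandwidth currently reserved for the aggregate
    in_use:   sum of the bandwidths of the underlying E2E reservations
    The thresholds are fractions of `reserved`; keeping grow_at well
    above shrink_at is what provides the hysteresis.
    """
    too_full = in_use > grow_at * reserved
    too_empty = in_use < shrink_at * reserved
    if reserved <= 0 or too_full or too_empty:
        # Re-size so that in_use sits at `target` utilization; since
        # target < 1 the result is never below in_use.
        return in_use / target
    return reserved   # inside the dead band: send no new request
```

With the defaults, an aggregate of 100 units carrying 70 is
left alone, while one carrying 95 grows and one carrying 40
shrinks; in both cases the new size falls inside the dead band,
so the next evaluation triggers no further change.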
Clearly, the definition and operation of such policies are as
much business issues as they are technical, and are out of the
scope of this document.
1.4.4. Intra-domain Routes
RSVP directly handles route changes, in that reservations
follow the routes that their data follow. This follows from
the property that PATH messages contain the same IP source and
destination address as the data flow for which a reservation
is to be established. However, since we are now making
aggregate reservations by sending a PATH message from an
aggregating to a deaggregating router, the reserved (E2E) data
packets no longer carry the same IP addresses as the relevant
(aggregate) PATH message. The issue becomes one of making sure
that data packets for reserved flows follow the same path as
the PATH message that established Path state for the aggregate
reservation. Several approaches are viable.
First, the data may be tunneled from aggregator to
deaggregator, using technologies such as IP-in-IP tunnels, GRE
tunnels, MPLS labeled tunnels, and so on. These each have
particular advantages, especially MPLS, which allows traffic
engineering. They each also have some cost in link overhead
and configuration complexity.
If data is not tunneled, then we are depending on a
characteristic of IP best metric routing, which is that if
the route from A to Z includes the path from H to L, and the
best metric route was chosen all along the way, then the best
metric route was chosen from H to L. Therefore, an aggregate
path message which crosses a given aggregator and deaggregator
will of necessity use the best path between them.
If this is a single path, the problem is solved. If it is a
multi-path route, then we are forced to determine, perhaps by
measurement, what proportion of the traffic for a given E2E
reservation is passing along each of the paths, and assure
ourselves of sufficient bandwidth for the present use. A
simple, though inelegant, way of doing this is to reserve the
total capacity of the aggregate route down each path.
For this reason, we believe it is advantageous to use one of
the above-mentioned tunneling mechanisms in cases where
multi-path routes may exist.
1.4.5. Inter-domain Routes
The case of inter-domain routes differs somewhat from the
intra-domain case just described. Specifically, best-path
considerations do not apply, as routing is by a combination of
routing policy and shortest AS path rather than simple best
metric.
In the case of inter-domain routes, data traffic belonging to
different E2E sessions (but the same aggregate session) may
not enter an aggregation region via the same aggregator
interface, and/or may not leave via the same deaggregator
interface. It is possible that we could identify this
occurrence in some central system which sees the reservation
information for both of the apparent sessions, but it is not
clear that we could determine a priori how much traffic went
one way or the other apart from measurement.
We simply note that this problem can occur and needs to be
allowed for in the implementation. We recommend that each such
E2E reservation be summed into its appropriate aggregate
reservation, even though this involves over-reservation.
1.4.6. Reservations for Multicast Sessions
Aggregating reservations for multicast sessions is
significantly more complex than for unicast sessions. The
first challenge is to construct a multicast tree for
distribution of the aggregate Path messages which follows the
same path as will be followed by the data packets for which
the aggregate reservation is to be made. This is complicated
by the fact that the path which is followed by a data packet
may depend on many factors such as its source address, the
choice of shared trees or source-specific trees, and the
location of a rendezvous point for the tree.
Once the problem of distributing aggregate Path messages is
solved, there are considerable problems in determining the
correct amount of resources to reserve at each link along the
multicast tree. Because of the amount of heterogeneity that
may exist in an aggregate multicast reservation, it appears
that it would be necessary to retain information about
individual E2E reservations within the aggregation region to
allocate resources correctly. Thus, we may end up with a
complex set of procedures for forming aggregate reservations
that do not actually reduce the amount of stored state
significantly for multicast sessions. [BERSON] describes
possible ways to reduce this state by using measurement-based
admission control.
As noted above, there are several aspects to RSVP state, and
our approach for unicast aggregates all forms of state:
classification, scheduling, and reservation state. One
possible approach to multicast is to focus only on aggregation
of classification and scheduling state, which are arguably the
most important because of their impact on the fast path. That
approach is the one described in the current draft.
1.4.7. Multi-level aggregation
Ideally, an aggregation scheme should be able to accommodate
recursive aggregation, with aggregate reservations being
themselves aggregated. Multi-level aggregation can be
accomplished using the procedures described here and a simple
extension to the protocol number swapping process.
We can consider E2E RSVP reservations to be at aggregation
level 0. When we aggregate these reservations, we produce
reservations at aggregation level 1. In general, level n
reservations may be aggregated to form reservations at level
n+1.
When an aggregating router receives an E2E Path, it swaps the
protocol number from RSVP to RSVP-E2E-IGNORE. In addition, it
should write the aggregation level (1, in this case) in the 2
byte field that is present (and currently unused) in the
router alert option. In general, a router which aggregates
reservations at level n to create reservations at level n+1
will write the number n+1 in the router alert field. A router
which deaggregates level n+1 reservations will examine all
messages with IP protocol number RSVP-E2E-IGNORE but will
process the message and swap the protocol number back to RSVP
only in the case where the router alert field carries the
number n+1. For any other value, the message is forwarded
unchanged. Interior routers ignore all messages with IP
protocol number RSVP-E2E-IGNORE.
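
The level check can be sketched as follows. A message is
modelled as a (protocol, alert_value) pair, which is a
simplification; 253 is again only a placeholder for the
unassigned RSVP-E2E-IGNORE value. This document does not say
what the router alert field should hold after deaggregation,
so the sketch leaves it unchanged as an assumption.

```python
from typing import Tuple

IPPROTO_RSVP = 46
RSVP_E2E_IGNORE = 253    # PLACEHOLDER for the unassigned value

Message = Tuple[int, int]   # (IP protocol number, router alert value)


def aggregate_at(msg: Message, level: int) -> Message:
    """Router creating level-`level` reservations from level-1 below it."""
    protocol, _alert = msg
    if protocol == IPPROTO_RSVP:
        return (RSVP_E2E_IGNORE, level)   # hide and stamp the level
    return msg                            # already hidden by a lower level


def deaggregate_at(msg: Message, level: int) -> Message:
    """Router terminating level-`level` reservations."""
    protocol, alert = msg
    if protocol == RSVP_E2E_IGNORE and alert == level:
        return (IPPROTO_RSVP, alert)      # alert value kept: an assumption
    return msg                            # any other value: forward unchanged
```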
1.4.8. Reliability Issues
There are a variety of issues that arise in the context of
aggregation that would benefit from some form of explicit
acknowledgment mechanism for RSVP messages. For example, it
is possible to configure a set of routers such that an E2E
PATH of protocol type RSVP-E2E-IGNORE would be effectively
"black-holed", if it never reached a router which was
appropriately configured to act as a deaggregator. It could
then travel all the way to its destination where it would
probably be ignored due to its non-standard protocol number.
This situation is not easy to detect. The aggregator can be
sure this problem has not occurred if a Path Error message is
received from the deaggregator (as described in detail below).
It can also be sure there is no problem if an E2E Resv is
received. The absence of both events is ambiguous, however: it
may only mean that no receiver wishes to reserve resources for
this session, or it may mean that the PATH was black-holed. If
a neighbor-to-neighbor acknowledgment
mechanism existed, the aggregator would expect to receive an
acknowledgment from the deaggregator, and would interpret the
lack of a response as an indication that a problem of
configuration existed. It could then refrain from aggregating
this particular session. We note that such a reliability
mechanism has been proposed for RSVP in [REFRESH] and propose
that it be used here.
2. Elements of Procedure
To implement aggregation, we define a number of elements of
procedure.
2.1. Receipt of E2E Path Message By Aggregating Router
The very first event is the arrival of the E2E PATH message at
an outside interface of an aggregator. Standard RSVP
procedures [RSVP] are followed for this, including
consideration of what set of interfaces it needs to be
forwarded onto. These interfaces comprise zero or more outside
interfaces and zero or more inside interfaces.
Service on outside interfaces is handled as defined in [RSVP].
Service on inside interfaces is complicated by the fact that
the message needs to be included in some number of aggregate
reservations, but at this point it is not known which one,
because the deaggregator is not known. Therefore, the E2E PATH
message is forwarded on the inside interface(s) using the IP
Protocol number RSVP-E2E-IGNORE, but in every other respect
identically to the way it would be sent by an RSVP router that
was not performing aggregation.
2.2. Handling Of E2E Path Message By Interior Routers
At this point, the E2E Path message traverses zero or more
interior routers. Interior routers receive the E2E Path
message on an inside interface and forward it on another
inside interface. The Router Alert IP Option alerts interior
routers to check internally, but they find that the IP
Protocol is RSVP-E2E-IGNORE and the next hop interface is
inside. As such, they simply forward it as a normal IP
datagram.
2.3. Receipt of E2E Path Message By Deaggregating Router
The E2E PATH message finally arrives at a deaggregating
router, which receives it on an inside interface and forwards
it on an outside interface. Again, the Router Alert IP Option
alerts it to intercept the message, but this time the IP
Protocol is RSVP-E2E-IGNORE and the next hop interface is an
outside interface.
At this point, the deaggregating router associates the flow
with an aggregate reservation. This selection is done on the
basis of policy, and may take into account not only the
aggregating router (whose IP Address may be found in the RSVP
Hop Object) but other information about the flow. If no such
aggregate reservation exists and the router is so configured,
it may generate a PATH ERROR with code NEW-AGGREGATE-NEEDED
back to the aggregating router. This should not result in any
reservation being taken down, but may result in the
aggregating router initiating the necessary aggregate path
message, as described in the following section.
The deaggregating router changes the E2E Path message's IP
Protocol from RSVP-E2E-IGNORE to RSVP, updates the ADSPEC of
the E2E Path using information accumulated by the aggregate
Path ADSPEC (if an aggregate Path has been received), and
forwards the E2E PATH message towards its intended
destination. To enable correct updating of the
ADSPEC, a deaggregating router may wait for the arrival of an
aggregate Path before forwarding the E2E Path.
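
The deaggregator's decision in this step might be sketched as
below. The set of known aggregates, keyed by (aggregator
address, DSCP), and the function name are hypothetical, and
ADSPEC updating is deliberately left out.

```python
from typing import Optional, Set, Tuple

IPPROTO_RSVP = 46
NEW_AGGREGATE_NEEDED = "NEW-AGGREGATE-NEEDED"

AggregateKey = Tuple[str, int]   # (aggregator address, DSCP)


def deaggregate_e2e_path(
    aggregates: Set[AggregateKey],
    aggregator: str,
    dscp: int,
    may_request: bool = True,
) -> Tuple[int, Optional[Tuple[str, str]]]:
    """Return (protocol to forward with, optional PATH ERROR to send).

    aggregator:  taken from the RSVP Hop object of the E2E Path
    dscp:        chosen by local policy for this flow
    may_request: whether this router is configured to ask for aggregates
    """
    path_error = None
    if (aggregator, dscp) not in aggregates and may_request:
        # Addressed back to the aggregating router; tears nothing down.
        path_error = (aggregator, NEW_AGGREGATE_NEEDED)
    # In every case the E2E Path continues with the normal protocol number.
    return IPPROTO_RSVP, path_error
```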
2.4. Initiation of New Aggregate Path Message By Aggregating
Router
The aggregating router is responsible for including the
SENDER_TSPEC information from individual E2E Path messages in
the SENDER_TSPEC of the aggregate Path message it sends to its
deaggregating router. The aggregating router may know that an
E2E session is associated with a given deaggregator when one
of two events occurs: it receives a PATH ERROR message with
the error code NEW-AGGREGATE-NEEDED from the deaggregator, or
it receives an E2E RESV message from the deaggregator. In the
latter case, the RESV contains a DCLASS object [DCLASS]
indicating which DSCP the deaggregator believes that the E2E
flow belongs in. In the former case, the aggregator must make
its own determination of a suitable DSCP based on the
information in the E2E Path message(s) being aggregated and
using locally available policy information. The identity of
the deaggregator itself is found in either the ERROR
SPECIFICATION of the Path Error message or the RSVP HOP object
of the RESV.
On receipt of either message, if no corresponding aggregate
path state exists from the aggregator to the deaggregator for
a session with the appropriate DSCP, and the aggregator is
configured to do so, the aggregator should generate an
aggregate PATH message for the aggregate reservation. The
destination address of the aggregate PATH message is the
address of the deaggregating router, and the message is sent
with IP protocol number RSVP.
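
A rough sketch of this step follows. Messages are modelled as
plain dictionaries whose keys (error_spec_node, rsvp_hop,
dclass) are invented for the example, and the aggregate path
state is a simple set of (deaggregator, DSCP) pairs.

```python
from typing import Dict, Set, Tuple


def locate(msg: Dict, policy_dscp: int) -> Tuple[str, int]:
    """Extract (deaggregator address, DSCP) from a triggering message."""
    if msg["type"] == "PATH-ERROR":
        # Deaggregator named in the ERROR SPECIFICATION; the aggregator
        # picks the DSCP itself from local policy.
        return msg["error_spec_node"], policy_dscp
    if msg["type"] == "RESV":
        # Deaggregator is the RSVP HOP; the DSCP arrives in the DCLASS.
        return msg["rsvp_hop"], msg["dclass"]
    raise ValueError("not a message that identifies a deaggregator")


def needs_aggregate_path(path_state: Set[Tuple[str, int]],
                         deaggregator: str, dscp: int,
                         configured: bool = True) -> bool:
    """Record new aggregate path state; True if a PATH must be sent."""
    key = (deaggregator, dscp)
    if key in path_state or not configured:
        return False
    path_state.add(key)
    return True
```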
2.5. Handling of E2E RESV Message by Deaggregating Router
Having sent the E2E PATH message on toward the destination,
the deaggregator must now expect to receive an E2E RESV for
the session. On receipt, its responsibility is to assure
itself that there is sufficient bandwidth reserved within the
aggregation region to support the new E2E reservation, and if
there is, then to forward the E2E RESV to the aggregating
router.
If there is insufficient bandwidth reserved, it should follow
the normal RSVP procedures [RSVP] for a reservation being
placed with insufficient bandwidth to support the reservation.
It may also immediately attempt to increase the aggregate
reservation that is supplying bandwidth by increasing the size
of the flowspec that it includes in the aggregate RESV that it
sends upstream.
When sufficient bandwidth is available, it may simply send the
E2E RESV message with IP Protocol RSVP to the aggregating
router. This message should, in addition to other data,
contain the DCLASS object to indicate which DSCP the
deaggregating router expects the aggregator to use. The choice
of DSCP may be made based on a combination of information in
the received E2E RESV and local policy. An example policy
might dictate a certain DSCP for Guaranteed Service and
another DSCP for Controlled Load. The deaggregator will also
add the token bucket from the FLOWSPEC object into its
internal understanding of how much of that reservation is in
use.
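
The bookkeeping described here can be sketched as a small
admission check. The class and attribute names are
hypothetical, and only the bandwidth term of the token bucket
is tracked to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class AggregateReservation:
    """Deaggregator's view of one aggregate reservation."""
    reserved_rate: float          # bandwidth granted to the aggregate
    allocated_rate: float = 0.0   # portion already handed to E2E flows

    def admit(self, e2e_rate: float) -> bool:
        """Account for a new E2E RESV if the aggregate can carry it."""
        if self.allocated_rate + e2e_rate > self.reserved_rate:
            # Insufficient bandwidth: normal RSVP failure handling
            # applies, and the aggregate may be increased separately.
            return False
        self.allocated_rate += e2e_rate
        return True   # caller forwards the E2E RESV to the aggregator
```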
2.6. Initiation of New Aggregate RESV Message By
Deaggregating Router
Upon receiving an E2E RESV message on an outside interface,
and having determined the appropriate DSCP for the session,
the deaggregator looks for corresponding path state for a
session with the chosen DSCP. If aggregate PATH state exists,
but no aggregate RESV state exists, the deaggregator creates
an aggregate RESV and sets its initial request to a value not
smaller than the requirement of the E2E reservation it is
supporting.
If no aggregate PATH state exists for the appropriate DSCP,
this may be because the aggregator has not yet responded to
the arrival of the E2E RESV sent in the preceding step. To
avoid deadlock while waiting for a response, it would be
desirable to use the acknowledgment mechanisms described in
[REFRESH].
Once the deaggregator has the aggregate PATH message, then it
sends an aggregate RESV message toward the aggregator (i.e.,
to the previous hop), using the AGGREGATED-RSVP session and
filter specifications. Since the DSCP is in the SESSION
object, the DCLASS is unnecessary. The message should be
reliably delivered using the mechanisms in [REFRESH] or,
alternatively, the CONFIRM object may be used, to assure that
the aggregate RESV does indeed arrive and is granted. This
enables the deaggregator to determine that the requested
bandwidth is available to allocate to the E2E flows it
supports.
2.7. Handling of Aggregate RESV Message by Interior Routers
The aggregate RESV message is handled in essentially the same
way as defined in [RSVP]. The Session object contains the
address of the deaggregating router (or the group address for
the session in the case of multicast) and the DSCP that has
been chosen for the session. The Filterspec object identifies
the aggregating router. Interior routers perform admission
control and resource allocation as usual and send the
aggregate RESV on towards the aggregator.
2.8. Handling of E2E RESV Message by Aggregating Router
The E2E RESV message is the final confirmation to the
aggregating router that a proportion of a given aggregate's
bandwidth has been reserved. At this point, it should assure
itself that the E2E reservation is associated with an
appropriate aggregate, that the aggregator and deaggregator
expectations synchronize, and that all things are in place. In
particular, it needs to ensure that the DCLASS carried in the
E2E RESV matches the DSCP for an aggregate session to that
deaggregator; if not, it needs to create a new aggregate Path
for the appropriate DSCP and send it to the deaggregator. It
should also assure itself that the SENDER_TSPEC from the E2E
PATH message has been accumulated into the appropriate
aggregate PATH message. Under normal circumstances, this is
the only way it will be informed of this association. It
should now forward the E2E RESV to its previous hop, following
normal RSVP processing rules [RSVP].
2.9. Removal of E2E Reservation
E2E reservations are removed in the usual way via PATH TEAR,
RESV TEAR, timeout, or as the result of an error condition.
When they are removed, their FLOWSPEC information must also be
removed from the allocated portion of the aggregate
reservation. This same bandwidth may be re-used for other
traffic in the near future. When E2E PATH messages are
removed, their SENDER_TSPEC information must also be removed
from the aggregate PATH.
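
The bookkeeping implied here can be sketched as below. As a
simplifying assumption, each FLOWSPEC is reduced to a single
bandwidth figure in bits per second; the class and method names
are hypothetical.

```python
class AggregateAccount:
    """Tracks how much of an aggregate reservation is allocated."""

    def __init__(self, reserved_bps):
        self.reserved_bps = reserved_bps  # size of the aggregate reservation
        self.flows = {}                   # E2E flow id -> allocated bandwidth

    @property
    def allocated_bps(self):
        return sum(self.flows.values())

    def admit(self, flow_id, bps):
        """Allocate bandwidth to an E2E flow if the aggregate has room."""
        others = self.allocated_bps - self.flows.get(flow_id, 0)
        if others + bps > self.reserved_bps:
            return False
        self.flows[flow_id] = bps
        return True

    def remove(self, flow_id):
        """Release an E2E flow's share; returns the bandwidth freed."""
        return self.flows.pop(flow_id, 0)
```

Bandwidth released by remove() is immediately available to a
later admit() call, which mirrors the re-use described above.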
2.10. Removal of Aggregate Reservation
Should an aggregate reservation go away (presumably due to a
configuration change, route change, or policy event), the E2E
reservations it supports are no longer active. They must be
treated accordingly.
2.11. Handling of Data On Reserved E2E Flow by Aggregating
Router
Until a given E2E flow has been established as part of a
given aggregate, the flow's data should be treated as general
best effort traffic by whatever policies prevail for such
traffic. Generally, this will mean being given the same
throughput behavior as non-essential traffic. Once that
association is established, however, the aggregating router
is responsible for marking any related traffic with the
correct DSCP and forwarding it in the manner appropriate to
traffic on that reservation. This may
imply forwarding it to a given IP next hop, or piping it down
a given link layer circuit, tunnel, or MPLS label switched
path.
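
A minimal sketch of that per-packet decision follows. The
function name, the mapping argument, and the two treatment
labels are assumptions for illustration; how the packet is then
forwarded (next hop, circuit, tunnel, or label switched path) is
not modeled.

```python
def classify_packet(flow_id, established):
    """Return (treatment, dscp) for a packet of the given E2E flow.

    `established` maps a flow id to the DSCP of the aggregate the
    flow has been associated with.
    """
    dscp = established.get(flow_id)
    if dscp is None:
        # Not yet part of any aggregate: general best effort handling.
        return ("best-effort", None)
    # Part of an aggregate: mark with that aggregate's DSCP.
    return ("aggregate", dscp)
```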
2.12. Procedures for Multicast Sessions
Because of the difficulties of aggregating multicast sessions
described above, we focus on the aggregation of scheduling and
classification state in the multicast case. The main
difference between the multicast and unicast cases is that
rather than sending an aggregate Path message to the unicast
address of a single deaggregating router, in the multicast
case we send the "aggregate" Path message to the same group
address as the E2E session. This ensures that the aggregate
Path message follows the same route as the E2E Path. This
difference between unicast and multicast is reflected in the
Session objects defined below. A consequence of this approach
is that we continue to have reservation state per multicast
session inside the aggregation region.
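
The choice of destination for the aggregate Path message can be
sketched as follows. The function name is hypothetical; the
multicast test relies on the standard Python ipaddress module.

```python
import ipaddress


def aggregate_session_destination(e2e_destination, deaggregator):
    """Pick the destination address of the aggregate session."""
    if ipaddress.ip_address(e2e_destination).is_multicast:
        # Multicast: use the same group address as the E2E session,
        # so the aggregate Path follows the same route as the E2E Path.
        return e2e_destination
    # Unicast: address the deaggregating router directly.
    return deaggregator
```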
3. Protocol Elements
3.1. IP Protocol RSVP-E2E-IGNORE
This specification presumes the assignment of a protocol type
RSVP-E2E-IGNORE, whose number is at this point TBD. This is
used only on messages which require a router alert (PATH, PATH
TEAR, and RESV CONFIRM), and signifies that the message must
be treated one way when copied to an inside interface, and
another way when copied to an outside interface.
3.2. Path Error Code
A PATH ERROR code NEW-AGGREGATE-NEEDED is presumed. This value
does not signify that a terminal error has occurred, but that
an action is required of the aggregating router to avoid an
error condition in the near future.
3.3. SESSION Object
The SESSION object contains two values: the IP Address of the
aggregate session destination, and the DSCP that it will use
on the E2E data the reservation contains. For unicast
sessions, the session destination address is the address of the
deaggregating router. For multicast sessions, the session
destination is the multicast address of the E2E session (or
sessions) being aggregated. The inclusion of the DSCP in the
session allows for multiple sessions toward the same address
to be distinguished by their DSCP and queued separately. It
also provides the means for aggregating scheduling and
classification state. In the case where a session uses a pair
of PHBs (e.g. AF11 and AF12), the DSCP used should represent
the numerically smallest PHB (e.g. AF11). This follows the
same naming convention described in [BRIM].
Session types are defined for IPv4 and IPv6 addresses.
o IP4 SESSION object: Class = SESSION,
C-Type = RSVP-AGGREGATE-IP4
+-------------+-------------+-------------+-------------+
| IPv4 Session Address (4 bytes) |
+-------------+-------------+-------------+-------------+
| /////////// | Flags | ///////// | DSCP |
+-------------+-------------+-------------+-------------+
o IP6 SESSION object: Class = SESSION,
C-Type = RSVP-AGGREGATE-IP6
+-------------+-------------+-------------+-------------+
| |
+ +
| |
+ IPv6 Session Address (16 bytes) +
| |
+ +
| |
+-------------+-------------+-------------+-------------+
| /////////// | Flags | ///////// | DSCP |
+-------------+-------------+-------------+-------------+
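
The two SESSION layouts above can be serialized as sketched
below. Several points are assumptions rather than statements of
this specification: the shaded fields are sent as zero, the DSCP
occupies the low-order six bits of its byte, and the common RSVP
object header (length, Class-Num, C-Type) is omitted because the
C-Type values are not assigned here. The function name is
hypothetical.

```python
import ipaddress
import struct


def pack_aggregate_session_body(destination, dscp, flags=0):
    """Serialize the body of an aggregate SESSION object."""
    addr = ipaddress.ip_address(destination)  # IPv4 or IPv6
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    if not 0 <= flags <= 255:
        raise ValueError("Flags must fit in one byte")
    # Address in network byte order, then one byte each of:
    # reserved, Flags, reserved, DSCP.
    return addr.packed + struct.pack("!BBBB", 0, flags, 0, dscp)
```

For a session using AF11 and AF12 (DSCP values 10 and 12), the
DSCP argument would be 10, the numerically smaller of the two.
The IPv4 body is 8 bytes long and the IPv6 body 20 bytes.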
3.4. SENDER_TEMPLATE Object
The SENDER_TEMPLATE object identifies the aggregating router
for the aggregate reservation.
o IP4 SENDER_TEMPLATE object: Class = SENDER_TEMPLATE,
C-Type = RSVP-AGGREGATE-IP4
+-------------+-------------+-------------+-------------+
| IPv4 Aggregator Address (4 bytes) |
+-------------+-------------+-------------+-------------+
o IP6 SENDER_TEMPLATE object: Class = SENDER_TEMPLATE,
C-Type = RSVP-AGGREGATE-IP6
+-------------+-------------+-------------+-------------+
| |
+ +
| |
+ IPv6 Aggregator Address (16 bytes) +
| |
+ +
| |
+-------------+-------------+-------------+-------------+
3.5. FILTER_SPEC Object
The FILTER_SPEC object identifies the aggregating router for
the aggregate reservation, and is syntactically identical to
the SENDER_TEMPLATE object.
4. Policies and Algorithms For Predictive Management Of
Blocks Of Bandwidth
The exact policies used in determining how much bandwidth
should be allocated to an aggregate reservation at any given
time are beyond the scope of this document, and may be
proprietary to the service provider in question. However, here
we explore some of the issues and suggest approaches.
In short, the ideal condition is that the aggregate
reservation always has enough resources to allocate to any
flow reservation that requires its support, and never takes
too much. This is simply stated, but difficult to achieve.
Factors that come into account include significant times in
the diurnal cycle: one may find that a large number of people
start placing calls at 8:00 AM, even though the hour from 7:00
to 8:00 is dead calm. They also include recent history: if
more people have been placing calls recently than have been
finishing them, a prediction of the necessary bandwidth a few
moments hence may call for more bandwidth than is currently
allocated. Likewise, at the end of a busy period, we may find
that the trend calls for declining reservation amounts.
We recommend a policy along the following lines. At any given
time, one should expect that the amount of bandwidth required
for the aggregate reservation is the larger of the following:
(a) a requirement known a priori, such as from history of the
diurnal cycle at a particular week day and time of day,
and
(b) the trend line over recent history, with 90% or 99%
statistical confidence.
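
One possible reading of rules (a) and (b) is sketched below. It
is an illustration under stated assumptions, not a recommended
algorithm: the trend line is an ordinary least-squares fit to
equally spaced samples of recent demand, and the confidence
margin is approximated by adding z residual standard deviations
to the extrapolated value (z of about 1.2816 for one-sided 90%,
about 2.3263 for 99%, assuming normally distributed residuals).
All names are hypothetical.

```python
import math


def trend_upper_bound(samples, steps_ahead=1, z=1.2816):
    """Extrapolate a least-squares trend line and add a safety margin."""
    n = len(samples)
    if n < 3:
        raise ValueError("need at least three samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual standard deviation around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in enumerate(samples))
    sigma = math.sqrt(sse / (n - 2))
    forecast = intercept + slope * (n - 1 + steps_ahead)
    return forecast + z * sigma


def required_bandwidth(a_priori, samples, z=1.2816):
    """Larger of (a) the a priori requirement and (b) the trend bound."""
    return max(a_priori, trend_upper_bound(samples, z=z))
```

With a steadily rising history the trend term dominates a low
a priori figure; with a high a priori figure (for example, a
known morning peak) the a priori term wins.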
We further expect that changes to that aggregate reservation
would be made no more often than every few minutes, and
ideally perhaps on larger granularity such as fifteen minute
intervals or hourly. The finer the granularity, the greater
the level of signaling required, while the coarser the
granularity, the greater the chance for error, and the need to
recover from that error.
In general, we expect that the aggregate reservation will not
ever add up to exactly the sum of the reservations it
supports, but rather will be an integer multiple of some block
reservation size, which exceeds that value.
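
The block rounding can be sketched as follows. The function name
and the choice of integer bits per second are assumptions; the
sketch also treats a sum that is already an exact multiple of
the block size as sufficient, which is one possible reading of
the text above.

```python
def aggregate_size(sum_of_reservations_bps, block_bps):
    """Round the needed bandwidth up to a whole number of blocks."""
    if block_bps <= 0:
        raise ValueError("block size must be positive")
    if sum_of_reservations_bps < 0:
        raise ValueError("bandwidth cannot be negative")
    # Ceiling division using integer arithmetic only.
    blocks = -(-sum_of_reservations_bps // block_bps)
    return blocks * block_bps
```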
5. Security Considerations
Numerous security issues pertain to this document; for
example, the loss of an aggregate reservation to an aggressor
causes many calls to operate unreserved, and the reservation
of a great excess of bandwidth may result in a denial of
service. However, these issues are not confined to this
extension: RSVP itself has them. We believe that the security
mechanisms in RSVP address these issues as well.
6. IANA Considerations
Beyond allocating an IP Protocol, a PATH ERROR code, and an
RSVP Addressing object "type", there are no IANA issues in
this document. We do not define an object that will itself
require assignment by IANA.
7. Acknowledgments
The authors acknowledge that published documents and
discussion with several people, notably John Wroclawski, Steve
Berson, and Andreas Terzis materially contributed to this
draft. The design derives directly from an internet draft by
Roch Guerin [GUERIN] and from Steve Berson's drafts on the
subject. It is also influenced by the design in the diff-edge
draft by Bernet et al [BERNET] and by the RSVP tunnels draft
[TERZIS].
8. References
[CSZ]
Clark, D., S. Shenker, and L. Zhang, "Supporting Real-
Time Applications in an Integrated Services Packet
Network: Architecture and Mechanism," in Proc.
SIGCOMM'92, September 1992.
[IP] RFC 791, "Internet Protocol". J. Postel. Sep-01-1981.
[HOSTREQ]
RFC 1122, "Requirements for Internet hosts -
communication layers". R.T. Braden. Oct-01-1989.
[FRAMEWORK]
Nichols, "Differentiated Services Operational Model and
Definitions", 02/11/1998, draft-nichols-dsopdef-00.txt
[PRINCIPLES]
RFC 1958, "Architectural Principles of the Internet". B.
Carpenter. June 1996.
[ASSURED]
Clark and Wroclawski, "An Approach to Service Allocation
in the Internet", 08/04/1997, draft-clark-diff-svc-
alloc-00.txt
[BROKER]
Nichols and Zhang, "A Two-bit Differentiated Services
Architecture for the Internet", 12/23/1997, draft-
nichols-diff-svc-arch-01.txt
[BERSON]
Berson and Vincent. "Aggregation of Internet Integrated
Services State". draft-berson-rsvp-aggregation-00.txt,
August 1998
[BRIM]
Brim and Carpenter. "Per Hop Behavior Identification
Codes". draft-brim-diffserv-phbid-00.txt, April 1999.
[ISDS]
Bernet et al. "Integrated Services Operation Over
Diffserv Networks". draft-ietf-issll-diffserv-rsvp-
02.txt, June 1999.
[GUERIN]
Guerin, R., Blake, S. and Herzog, S., "Aggregating RSVP
based QoS Requests", Internet Draft, draft-guerin-
aggreg-rsvp-00.txt, November 1997.
[RSVP]
Braden, R., Zhang, L., Berson, S., Herzog, S. and Jamin,
S., "Resource Reservation Protocol (RSVP) Version 1
Functional Specification", RFC 2205, September 1997.
[BERNET]
Bernet, Y., Durham, D., and F. Reichmeyer, "Requirements
of Diff-serv Boundary Routers", Internet Draft, draft-
bernet-diffedge-01.txt, November, 1998.
[REFRESH]
Berger, L., Gan, D., and G. Swallow, "RSVP Refresh
Reduction Extensions", Internet Draft, draft-berger-
rsvp-refresh-reduct-02.txt, May 1999.
[TERZIS]
Terzis, A., Krawczyk, J., Wroclawski, J., and L. Zhang,
"RSVP Operation Over IP Tunnels", Internet Draft, draft-
ietf-rsvp-tunnel-04.txt, May 1999.
[DCLASS]
Bernet, Y., "Usage and Format of the DCLASS Object With
RSVP Signaling", Internet Draft, draft-bernet-dclass-
01.txt, June 1999.
9. Authors' Addresses
Fred Baker
Cisco Systems
519 Lado Drive
Santa Barbara, California 93111
Phone: (408) 526-4257
Email: fred@cisco.com
Carol Iturralde
Cisco Systems
250 Apollo Drive
Chelmsford, MA 01824 USA
Phone: 978-244-8532
Email: cei@cisco.com
Francois Le Faucheur
Cisco Systems
291, rue Albert Caquot
06560 Valbonne, France
Phone: +33.1.6918 6266
Email: flefauch@cisco.com
Bruce Davie
Cisco Systems
250 Apollo Drive
Chelmsford, MA 01824 USA
Phone: 978-244-8921
Email: bdavie@cisco.com