Mapping RTP streams to CLUE media captures
draft-ietf-clue-rtp-mapping-00

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft that was ultimately published as RFC 8849.
Expired & archived
Authors Roni Even , Jonathan Lennox
Last updated 2013-08-21 (Latest revision 2013-02-17)
RFC stream Internet Engineering Task Force (IETF)
Stream WG state WG Document
IESG IESG state Became RFC 8849 (Proposed Standard)
CLUE WG                                                          R. Even
Internet-Draft                                       Huawei Technologies
Intended status: Standards Track                               J. Lennox
Expires: August 21, 2013                                           Vidyo
                                                       February 17, 2013

               Mapping RTP streams to CLUE media captures
                   draft-ietf-clue-rtp-mapping-00.txt

Abstract

   This document describes mechanisms and recommended practice for
   mapping RTP media streams defined in SDP to CLUE media captures.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 21, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Even & Lennox            Expires August 21, 2013                [Page 1]
Internet-Draft             RTP mapping to CLUE             February 2013

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  RTP topologies for CLUE
   4.  Mapping CLUE Media Captures to RTP streams
     4.1.  Review of current directions in MMUSIC, AVText and AVTcore
     4.2.  Requirements of a solution
     4.3.  Static Mapping
     4.4.  Dynamic mapping
       4.4.1.  RTP header extension
       4.4.2.  Restricted approach
     4.5.  Recommendations
   5.  Application to CLUE Media Requirements
   6.  Examples
     6.1.  Static mapping
     6.2.  Dynamic Mapping
   7.  Acknowledgements
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1. Normative References
     10.2. Informative References
   Authors' Addresses

1.  Introduction

   Telepresence systems can send and receive multiple media streams.
   The CLUE framework [I-D.ietf-clue-framework] defines media captures
   as a source of Media, such as from one or more Capture Devices.  A
   Media Capture (MC) may be the source of one or more Media streams.  A
   Media Capture may also be constructed from other Media streams.  A
   middle box can express Media Captures that it constructs from Media
   streams it receives.

   SIP offer/answer [RFC3264] uses SDP [RFC4566] to describe the RTP
   [RFC3550] media streams.  Each RTP stream has a unique SSRC within
   its RTP session.  The content of the RTP stream is created by an
   encoder in the endpoint.  This may be original content from a camera
   or content created by an intermediary device such as an MCU.

   This document makes recommendations, for this telepresence
   architecture, about how RTP and RTCP streams should be encoded and
   transmitted, and how their relation to CLUE Media Captures should be
   communicated.  The proposed solution supports multiple RTP
   topologies.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119] and
   indicate requirement levels for compliant RTP implementations.

3.  RTP topologies for CLUE

   The typical RTP topologies used by Telepresence systems specify
   different behaviors for RTP and RTCP distribution.  A number of RTP
   topologies are described in
   [I-D.westerlund-avtcore-rtp-topologies-update].  For telepresence,
   the relevant topologies include point-to-point, as well as media
   mixers, media-switching mixers, and source-projection mixers.

   In the point-to-point topology, one peer communicates directly with a
   single peer over unicast.  There can be one or more RTP sessions, and
   each RTP session can carry multiple RTP streams identified by their
   SSRC.  All SSRCs will be recognized by the peers based on the
   information in the RTCP SDES report that will include the CNAME and
   SSRC of the sent RTP streams.  There are different point-to-point
   use cases, as specified in the CLUE use cases document
   [I-D.ietf-clue-telepresence-use-cases].  There may be a difference
   between the symmetric and asymmetric use cases.  In the symmetric
   use case, the typical mapping will be from a media capture device to
   a render device (e.g., camera to monitor).  In the asymmetric use
   case, a render device may receive different capture information (an
   RTP stream from a different camera) if the receiver has fewer
   rendering devices (monitors).  In some cases, a CLUE session which,
   at a high level, is
   point-to-point may nonetheless have RTP which is best described by
   one of the mixer topologies below.  For example, a CLUE endpoint can
   produce composited or switched captures for use by a receiving system
   with fewer displays than the sender has cameras.

   In the Media Mixer topology, the peers communicate only with the
   mixer.  The mixer provides mixed or composited media streams, using
   its own SSRC for the sent streams.  There are two cases here.  In the
   first case the mixer may have separate RTP sessions with each peer
   (similar to the point-to-point topology) terminating the RTCP
   sessions on the mixer; this is known as Topo-RTCP-Terminating MCU in
   [RFC5117].  In the second case, the mixer can use a conference-wide
   RTP session similar to RFC 5117's Topo-mixer or Topo-Video-switching.
   The major difference is that for the second case, the mixer uses
   conference-wide RTP sessions, and distributes the RTCP reports to all
   the RTP session participants, enabling them to learn all the CNAMEs
   and SSRCs of the participants and know the contributing source or
   sources (CSRCs) of the original streams from the RTP header.  In the
   first case, the Mixer terminates the RTCP and the participants cannot
   know all the available sources based on the RTCP information.  The
   conference roster information, including conference participants,
   endpoints, media, and media-id (SSRC), can be made available using
   the conference event package [RFC4575].

   In the Media-Switching Mixer topology, the peer to mixer
   communication is unicast with mixer RTCP feedback.  It is
   conceptually similar to a compositing mixer as described in the
   previous paragraph, except that rather than compositing or mixing
   multiple sources, the mixer provides one or more conceptual sources
   selecting one source at a time from the original sources.  The Mixer
   creates a conference-wide RTP session by sharing remote SSRC values
   as CSRCs to all conference participants.

   In the Source-Projection Mixer topology, the peer to mixer
   communication is unicast with RTCP mixer feedback.  Every potential
   sender in the conference has a source which is "projected" by the
   mixer into every other session in the conference; thus, every
   original source is maintained with an independent RTP identity to
   every receiver, maintaining separate decoding state and its original
   RTCP SDES information.  However, RTCP is terminated at the mixer,
   which might also perform reliability, repair, rate adaptation, or
   transcoding on the stream.  Senders' SSRCs may be renumbered by the
   mixer.  The sender may turn the projected sources on and off at any
   time, depending on which sources it thinks are most relevant for the
   receiver; this is the primary reason why this topology must act as an
   RTP mixer rather than as a translator, as otherwise these disabled
   sources would appear to have enormous packet loss.  Source switching
   is accomplished through this process of enabling and disabling
   projected sources; the higher-level semantic reason for each RTP
   stream is assigned externally.

   The above topologies demonstrate two major RTP/RTCP behaviors:

   1.  The mixer may either use the source SSRC when forwarding RTP
       packets, or use its own created SSRC.  In either case, the mixer
       distributes all RTCP information to all participants, creating
       one or more conference-wide RTP sessions.  This allows the
       participants to learn the available RTP sources in each RTP
       session.  The original source information will be in the SSRC or
       in the CSRC, depending on the topology.  The point-to-point case
       behaves like this.

   2.  The mixer terminates the RTCP from the source, creating separate
       RTP sessions with the peers.  In this case the participants will
       not receive the source SSRC in the CSRC.  Since this is usually a
       mixer topology, the source information is available from the SIP
       conference event package [RFC4575].  Subscribing to the
       conference event package allows each participant to know the
       SSRCs of all sources in the conference.
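
   As an illustration of the first behavior, the following Python
   sketch shows how a receiver might read the SSRC and any CSRCs from
   the RTP fixed header [RFC3550].  It is a minimal sketch, not part of
   any proposed mechanism; the function name and the sample values are
   assumptions made for this example.

```python
import struct

def parse_rtp_sources(packet: bytes):
    """Return (ssrc, [csrc, ...]) read from an RTP packet's fixed header."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    first_byte = packet[0]
    if first_byte >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    csrc_count = first_byte & 0x0F            # CC field: number of CSRCs
    ssrc = struct.unpack_from("!I", packet, 8)[0]
    if len(packet) < 12 + 4 * csrc_count:
        raise ValueError("truncated CSRC list")
    csrcs = list(struct.unpack_from("!%dI" % csrc_count, packet, 12))
    return ssrc, csrcs

# Hypothetical packet: V=2, CC=2, PT=96, mixer SSRC plus two contributors.
example_packet = struct.pack(
    "!BBHIIII", 0x82, 96, 1, 0, 0x11111111, 0xAAAA0001, 0xAAAA0002)
mixer_ssrc, contributing = parse_rtp_sources(example_packet)
```

   With an RTCP-terminating mixer, the CSRC list would typically be
   empty, and the receiver would have to rely on the conference event
   package instead.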

4.  Mapping CLUE Media Captures to RTP streams

   The different topologies described in Section 3 support different
   SSRC distribution models and RTP stream multiplexing points.

   Most video conferencing systems today can separate multiple RTP
   sources by placing them into separate RTP sessions using the SDP
   description.  For example, main and slides video sources are
   separated into separate RTP sessions based on the content attribute
   [RFC4796].  This solution works straightforwardly if the multiplexing
   point is at the UDP transport level, where each RTP stream uses a
   separate RTP session.  This will also be true for mapping the RTP
   streams to Media Captures if each media capture uses a separate RTP
   session, and the consumer can identify it based on the receiving RTP
   port.  In this case, SDP only needs to label the RTP session with an
   identifier that identifies the media capture in the CLUE description.
   The mapping does not change even if the RTP session is switched
   using the same or a different SSRC, because the multiplexing is not
   at the SSRC level.

   Even though Session multiplexing is supported by CLUE, for scaling
   reasons, CLUE recommends using SSRC multiplexing in a single or
   multiple sessions.  So we need to look at how to map RTP streams to
   Media Captures when SSRC multiplexing is used.

   When looking at SSRC multiplexing we can see that in various
   topologies, the SSRC behavior may be different:

   1.  The SSRCs are static (assigned by the MCU/Mixer), and there is an
       SSRC for each media capture encoding defined in the CLUE
       protocol.  Source information may be conveyed using CSRC, or, in
       the case of topo-RTCP-Terminating MCU, is not conveyed.

   2.  The SSRCs are dynamic, representing the original source and are
       relayed by the Mixer/MCU to the participants.

   In the above two cases the MCU/Mixer creates its own advertisement,
   with a virtual room capture scene.

   Another case we can envision is that the MCU / Mixer relays all the
   capture scenes from all advertisements to all consumers.  This means
   that the advertisement will include multiple capture scenes, each
   representing a separate TP room with its own coordinate system.  A
   general tool for distributing roster information is an event
   package, for example an extension of the conference event package.

4.1.  Review of current directions in MMUSIC, AVText and AVTcore

   Editor's note: This section provides an overview of the RFCs and
   drafts that can be used as a base for a mapping solution.  This
   section is for information only, and if the WG thinks that it is the
   right direction, the authors will bring the required work to the
   relevant WGs.

   The solution needs to also support the simulcast case where more than
   one RTP session may be advertised for a Media Capture.  Support of
   such simulcast is out of scope for CLUE.

   When looking at the available tools based on current work in MMUSIC,
   AVTcore and AVText for supporting SSRC multiplexing the following
   documents are considered to be relevant.

   The SDP source attribute [RFC5576] provides mechanisms to describe
   specific attributes of RTP sources based on their SSRC.

   Negotiation of generic image attributes in SDP [RFC6236] provides the
   means to negotiate the image size.  The image attribute can be used
   to offer different image parameters, such as size, but offering
   multiple RTP streams with different resolutions requires a separate
   RTP session for each image option.

   [I-D.westerlund-avtcore-max-ssrc] proposes a signaling solution for
   how to use multiple SSRCs within one RTP session.

   [I-D.westerlund-avtext-rtcp-sdes-srcname] provides an extension that
   may be sent in SDP, as RTCP SDES information, or as an RTP header
   extension, and that uniquely identifies a single media source.  It
   defines a hierarchical order of the SRCNAME parameter that can be
   used, for example, to describe multiple resolutions from the same
   source (see Section 5.1 of [I-D.westerlund-avtcore-rtp-simulcast]).
   Still, all the examples use RTP session multiplexing.

   Other documents that the authors reviewed but that are not currently
   used in a proposed solution include:

   [I-D.lennox-mmusic-sdp-source-selection] specifies how participants
   in a multimedia session can request a specific source from a remote
   party.

   [I-D.westerlund-avtext-codec-operation-point](expired) extends the
   codec control messages by specifying messages that let participants
   communicate a set of codec configuration parameters.

   Using the above documents, it is possible to negotiate the maximum
   number of received and sent RTP streams inside an RTP session (m-line
   or bundled m-line).  This also allows offering allowed combinations
   of codec configurations using different payload type numbers.

   Examples: max-recv-ssrc:{96:2 & 97:3}, where 96 and 97 are different
   payload type numbers, or max-send-ssrc:{*:4}.
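
   A minimal sketch of how an implementation might read such a value
   follows.  It assumes only the brace-delimited "payload-type:count"
   form used in the examples above; the actual grammar is defined by
   [I-D.westerlund-avtcore-max-ssrc] and may differ, and the function
   name is hypothetical.

```python
def parse_max_ssrc(value: str) -> dict:
    """Parse a value such as "{96:2 & 97:3}" into {"96": 2, "97": 3}."""
    inner = value.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("expected a brace-delimited list")
    limits = {}
    for item in inner[1:-1].split("&"):
        # Each entry is "<payload type or *>:<maximum number of SSRCs>".
        payload_type, _, count = item.strip().partition(":")
        if not payload_type or not count.isdigit():
            raise ValueError("malformed entry: %r" % item)
        limits[payload_type] = int(count)
    return limits

recv_limits = parse_max_ssrc("{96:2 & 97:3}")
send_limits = parse_max_ssrc("{*:4}")
```

   Here recv_limits allows two streams with payload type 96 and three
   with payload type 97, while send_limits allows four streams of any
   payload type.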

   In the next sections, the document proposes mechanisms to map RTP
   streams to media captures.

4.2.  Requirements of a solution

   This section lists, more briefly, the requirements a media
   architecture for Clue telepresence needs to achieve, summarizing the
   discussion of previous sections.  In this section, RFC 2119 [RFC2119]
   language refers to requirements on a solution, not an implementation;
   thus, requirements keywords are not written in capital letters.

   Media-1: It must not be necessary for a Clue session to use more than
   a single transport flow for transport of a given media type (video or
   audio).

   Media-2: It must, however, be possible for a Clue session to use
   multiple transport flows for a given media type where it is
   considered valuable (for example, for distributed media, or
   differential quality-of-service).

   Media-3: It must be possible for a Clue endpoint or MCU to
   simultaneously send sources corresponding to static, to composited,
   and to switched captures, in the same transport flow.  (Any given
   device might not necessarily be able to send all of these source
   types; but for those that can, it must be possible for them to be
   sent simultaneously.)

   Media-4: It must be possible for an original source to move among
   switched captures (i.e. at one time be sent for one switched capture,
   and at a later time be sent for another one).

   Media-5: It must be possible for a source to be placed into a
   switched capture even if the source is a "late joiner", i.e. was
   added to the conference after the receiver requested the switched
   source.

   Media-6: Whenever a given source is assigned to a switched capture,
   it must be immediately possible for a receiver to determine the
   switched capture it corresponds to, and thus that any previous source
   is no longer being mapped to that switched capture.

   Media-7: It must be possible for a receiver to identify the actual
   source that is currently being mapped to a switched capture, and
   correlate it with out-of-band (non-Clue) information such as rosters.

   Media-8: It must be possible for a source to move among switched
   captures without requiring a refresh of decoder state (e.g., for
   video, a fresh I-frame), when this is unnecessary.  However, it must
   also be possible for a receiver to indicate when a refresh of decoder
   state is in fact necessary.

   Media-9: If a given source is being sent on the same transport flow
   for more than one reason (e.g. if it corresponds to more than one
   switched capture at once, or to a static capture), it should be
   possible for a sender to send only one copy of the source.

   Media-10: On the network, media flows should, as much as possible,
   look and behave like currently-defined usages of existing protocols;
   established semantics of existing protocols must not be redefined.

   Media-11: The solution should seek to minimize the processing burden
   for boxes that distribute media to decoding hardware.

   Media-12: If multiple sources from a single synchronization context
   are being sent simultaneously, it must be possible for a receiver to
   associate and synchronize them properly, even for sources that are
   mapped to switched captures.

4.3.  Static Mapping

   Static mapping is widely used in current MCU implementations.  It is
   also common for a point to point symmetric use case when both
   endpoints have the same capabilities.  For capture encodings with
   static SSRCs, it is most straightforward to indicate this mapping
   outside the media stream, in the CLUE or SDP signaling.  An SDP
   source attribute [RFC5576] can be used to associate CLUE capture IDs
   with SSRCs in SDP.  Each SSRC will have a captureID value that will
   be specified also in the CLUE media capture as an attribute.  The
   provider advertisement could, if it wished, use the same SSRC for
   media capture encodings that are mutually exclusive.  (This would be
   natural, for example, if two advertised captures are implemented as
   different configurations of the same physical camera, zoomed in or
   out.)  Section 6 provides an example of an SDP offer and CLUE
   advertisement.
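
   The following Python sketch illustrates how a consumer might build
   the static SSRC-to-capture table from SDP source attributes.  The
   "a=ssrc:<ssrc-id> <attribute>:<value>" form comes from [RFC5576];
   the attribute name "captureID", the SDP fragment, and the function
   name are assumptions made only for this example.

```python
import re

# "captureID" is a hypothetical source-attribute name used for illustration.
SSRC_ATTR = re.compile(r"^a=ssrc:(\d+) captureID:(\S+)$")

def static_capture_map(sdp: str) -> dict:
    """Map each SSRC to the list of capture IDs declared for it in the SDP."""
    mapping = {}
    for line in sdp.splitlines():
        match = SSRC_ATTR.match(line.strip())
        if match:
            ssrc, capture_id = int(match.group(1)), match.group(2)
            # A list is kept because mutually exclusive capture encodings
            # may share one SSRC.
            mapping.setdefault(ssrc, []).append(capture_id)
    return mapping

EXAMPLE_SDP = """\
m=video 49200 RTP/AVP 96
a=rtpmap:96 H264/90000
a=ssrc:11111 captureID:VC0
a=ssrc:22222 captureID:VC1
a=ssrc:22222 captureID:VC5
"""
capture_table = static_capture_map(EXAMPLE_SDP)
```

   In this fragment, SSRC 22222 is declared for two captures, which
   corresponds to the mutually exclusive case described above.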

4.4.  Dynamic mapping

   Dynamic mapping is done by tagging each media packet with the capture
   ID.  This means that a receiver immediately knows how to interpret
   received media, even when an unknown SSRC is seen.  As long as the
   media carries a known capture ID, it can be assumed that this media
   stream will replace the stream currently being received with that
   capture ID.

   This gives significant advantages to switching latency, as a switch
   between sources can be achieved without any form of negotiation with
   the receiver.  Note that [RFC5285] recommends that header extensions
   be used with caution.

   However, the disadvantage of using a capture ID in the stream is
   that it introduces additional processing costs for every media
   packet: capture IDs are scoped only within one hop (i.e., within a
   cascaded conference a capture ID that is used from the source to the
   first MCU is not meaningful between two MCUs, or between an MCU and
   a receiver), and so they may need to be added or modified at every
   stage.

   Because capture IDs are chosen by the media sender, a sender that
   offers a particular capture to multiple recipients with the same ID
   needs to produce only one version of the stream (assuming outgoing
   payload type numbers match).  This reduces the cost in the multicast
   case, although it does not necessarily help in the switching case.

   An additional issue with putting capture IDs in the RTP packets comes
   from cases where a non-CLUE aware endpoint is being switched by an
   MCU to a CLUE endpoint.  In this case, we may require up to an
   additional 12 bytes in the RTP header, which may push a media packet
   over the MTU.  However, as the MTU on either side of the switch may
   not match, it is possible that this could happen even without adding
   extra data into the RTP packet.  The 12 additional bytes per packet
   could also be a significant bandwidth increase in the case of very
   low bandwidth audio codecs.
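
   To make the bandwidth concern concrete, the short Python calculation
   below estimates the added bit rate.  The 20 ms packet interval and
   the 8 kbit/s codec rate are assumed values chosen for illustration;
   they do not come from this document.

```python
def extension_overhead_bps(extra_bytes: int, packet_interval_ms: float) -> float:
    """Extra bit rate caused by adding extra_bytes to every packet."""
    packets_per_second = 1000.0 / packet_interval_ms
    return extra_bytes * 8 * packets_per_second

# 12 extra bytes on every packet, one packet every 20 ms (assumed).
overhead_bps = extension_overhead_bps(12, 20)
# Relative cost against an assumed 8 kbit/s audio codec.
relative_overhead = overhead_bps / 8000
```

   Under these assumptions the extension adds 4800 bit/s, which is 60%
   of the codec bit rate.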

4.4.1.  RTP header extension

   The capture ID could be carried within the RTP header extension
   field, using [RFC5285].  This is negotiated within the SDP, for
   example:

   a=extmap:1 urn:ietf:params:rtp-hdrex:clue-capture-id

   Packets tagged by the sender with the capture ID will then contain a
   header extension as shown below

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  ID=1 |  L=3  |                 capture id                    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  capture id   |
       +-+-+-+-+-+-+-+-+

       Figure: RTP header extension for encoding of the capture ID
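
   The layout in the figure can be sketched in Python as follows, using
   the one-byte header form of [RFC5285].  This is an illustrative
   sketch only: it assumes the capture ID is handled as a 32-bit
   unsigned integer, and the function names are hypothetical.

```python
import struct

ONE_BYTE_PROFILE = 0xBEDE  # marker for the RFC 5285 one-byte header form

def pack_capture_id(capture_id: int, ext_id: int = 1) -> bytes:
    """Build an RTP header extension block carrying a 32-bit capture ID."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header IDs are limited to 1..14")
    data = struct.pack("!I", capture_id)
    # The L field holds the data length minus one, so 4 bytes gives L=3.
    element = bytes([(ext_id << 4) | (len(data) - 1)]) + data
    body = element + b"\x00" * (-len(element) % 4)   # pad to 32-bit words
    return struct.pack("!HH", ONE_BYTE_PROFILE, len(body) // 4) + body

def unpack_capture_id(block: bytes, ext_id: int = 1):
    """Return the capture ID carried in the block, or None if absent."""
    profile, words = struct.unpack_from("!HH", block, 0)
    if profile != ONE_BYTE_PROFILE:
        raise ValueError("not a one-byte-header extension block")
    body = block[4:4 + 4 * words]
    pos = 0
    while pos < len(body):
        if body[pos] == 0:                 # padding byte between elements
            pos += 1
            continue
        found_id, length = body[pos] >> 4, (body[pos] & 0x0F) + 1
        if found_id == 15:                 # reserved ID: stop processing
            break
        if found_id == ext_id and length == 4:
            return struct.unpack_from("!I", body, pos + 1)[0]
        pos += 1 + length
    return None

block = pack_capture_id(0x0A0B0C0D)
```

   The resulting block is 12 bytes long: 4 bytes of extension header, 5
   bytes for the element, and 3 bytes of padding.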

   Adding or modifying the capture ID can be an expensive operation,
   particularly if SRTP is used to authenticate the packet.
   Modification of the contents of the RTP header requires
   reauthentication of the complete packet, and this could prove to be a
   limiting factor in the throughput of a multipoint device.  However,
   it may be that reauthentication is required in any case due to the
   nature of SDP.  SDP permits the receiver to choose payload types,
   meaning that a similar operation to modify the payload type in the
   packet header will also require reauthentication.
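
   The effect can be sketched with Python's standard hmac module.  This
   is a deliberately simplified model, not SRTP itself: it omits key
   derivation and the rollover counter, and the key and packet contents
   are placeholders.

```python
import hashlib
import hmac

def auth_tag(auth_key: bytes, packet: bytes, tag_len: int = 10) -> bytes:
    """Truncated HMAC-SHA1 over the whole packet (simplified model)."""
    return hmac.new(auth_key, packet, hashlib.sha1).digest()[:tag_len]

key = b"\x01" * 20                 # placeholder key, not derived as in SRTP
header = bytearray(12)             # minimal 12-byte RTP fixed header
header[0] = 0x80                   # version 2, no extension, no CSRCs
header[1] = 96                     # payload type chosen by one receiver
payload = b"encoded media frame"

tag_a = auth_tag(key, bytes(header) + payload)
header[1] = 97                     # same media, another receiver's payload type
tag_b = auth_tag(key, bytes(header) + payload)
```

   Because the tag covers the header, tag_a and tag_b differ even
   though the media payload is unchanged, so each rewritten packet has
   to be authenticated again.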

4.4.2.  Restricted approach

   The flaws of the Capture ID method (high latency switching of SSRC
   multiplexing, high computational cost on switching nodes) can be
   mitigated by sending the Capture ID only on some packets of a stream.
   In this approach, the capture ID is included in packets belonging to
   the first frame of media (typically an IDR/GDR) following a change in
   the dynamic mapping.  After that, the SSRC is used to map sources to
   capture IDs.
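
   A receiver following this approach could keep a small table that is
   updated by tagged packets and consulted for untagged ones.  The
   Python sketch below is one possible shape for that logic; the class
   and method names are hypothetical, and it assumes for simplicity
   that each SSRC feeds at most one capture at a time.

```python
class CaptureTracker:
    """Tracks which capture ID each SSRC is currently mapped to."""

    def __init__(self):
        self.ssrc_to_capture = {}

    def on_packet(self, ssrc, capture_id=None):
        """Return the capture ID for a packet, or None if not yet known."""
        if capture_id is not None:
            # A tagged packet (re)binds the capture to this SSRC and
            # releases any SSRC that previously fed the same capture.
            stale = [s for s, c in self.ssrc_to_capture.items()
                     if c == capture_id and s != ssrc]
            for old_ssrc in stale:
                del self.ssrc_to_capture[old_ssrc]
            self.ssrc_to_capture[ssrc] = capture_id
        return self.ssrc_to_capture.get(ssrc)

tracker = CaptureTracker()
first = tracker.on_packet(100, capture_id=3)     # tagged first frame
later = tracker.on_packet(100)                   # untagged, resolved by SSRC
switched = tracker.on_packet(200, capture_id=3)  # new source takes capture 3
released = tracker.on_packet(100)                # old source no longer mapped
```

   If the tagged packets of the first frame are lost, untagged packets
   from a new SSRC cannot be resolved, which is one reason a receiver
   may still need a way to request a refresh.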

   Note: in the dynamic case there is a need to verify how it will work
   if not all RTP streams of the same media type are multiplexed in a
   single RTP session.

4.5.  Recommendations

   The recommendation is that endpoints MUST support both the static
   declaration of capture encoding SSRCs, and the RTP header extension
   method of sharing capture IDs, with the extension in every media
   packet.  For low bandwidth situations, this may be considered
   excessive overhead; in which case endpoints MAY support the approach
   where capture IDs are sent selectively.  The SDP offer MAY specify
   the SSRC mapping to media capture.  In the case of static mapping
   topologies there will be no need to use the header extensions in the
   media, since the SSRC for the RTP stream will remain the same during
   the call unless a collision is detected and handled according to
   [RFC5576].  If the topology in use relies on dynamic mapping, then
   the RTP header extension will be used to indicate the RTP stream
   switch for the media capture.  In this case the SDP description may
   be used to negotiate the initial SSRC, but this is left to the
   implementation.  Note that if the SSRC is defined explicitly in the
   SDP, an SSRC collision should be handled as in [RFC5576].

5.  Application to CLUE Media Requirements

   Section 4.2 offers a number of requirements that are believed to be
   necessary for a CLUE RTP mapping.  The solutions described in this
   document are believed to meet these requirements, though some of
   them are only possible for some of the topologies.  (Since the
   requirements are generally of the form "it must be possible for a
   sender to do something", this is adequate; a sender which wishes to
   perform that action needs to choose a topology which allows the
   behavior it wants.)

   In this section we address only those requirements where the
   topologies or the association mechanisms treat the requirements
   differently.

   Media-4: It must be possible for an original source to move among
   switched captures (i.e. at one time be sent for one switched capture,
   and at a later time be sent for another one).

   This applies naturally for static sources with a Switched Mixer.  For
   dynamic sources with a Source-Projecting Mixer, this just requires
   the capture tag in the header extension element to be updated
   appropriately.

   Media-6: Whenever a given source is transmitted for a switched
   capture, it must be immediately possible for a receiver to determine
   the switched capture it corresponds to, and thus that any previous
   source is no longer being mapped to that switched capture.

   For a Switched Mixer, this applies naturally.  For a Source-
   Projecting mixer, this is done based on the header extension.

   Media-7: It must be possible for a receiver to identify the original
   source that is currently being mapped to a switched capture, and
   correlate it with out-of-band (non-Clue) information such as rosters.

   For a Switched Mixer, this is done based on the CSRC, if the mixer is
   providing CSRCs; for a Source-Projecting Mixer, this is done based on
   the SSRC.

   Media-8: It must be possible for a source to move among switched
   captures without requiring a refresh of decoder state (e.g., for
   video, a fresh I-frame), when this is unnecessary.  However, it must
   also be possible for a receiver to indicate when a refresh of decoder
   state is in fact necessary.

   This can be done by a Source-Projecting Mixer, but not by a Switching
   Mixer.  The last requirement can be accomplished through an FIR
   message [RFC5104], though potentially a faster mechanism (not
   requiring a round-trip time from the receiver) would be preferable.

   Media-9: If a given source is being sent on the same transport flow
   to satisfy more than one capture (e.g. if it corresponds to more than
   one switched capture at once, or to a static capture as well as a
   switched capture), it should be possible for a sender to send only
   one copy of the source.

   For a Source-Projecting Mixer, this can be accomplished by sending
   multiple dynamic capture IDs for the same source; this can also be
   done for an environment with a hybrid of mixer topologies and static
   and dynamic captures, described below in Section 6.  It is not
   possible for static captures from a Switched Mixer.

   Media-12: If multiple sources from a single synchronization context
   are being sent simultaneously, it must be possible for a receiver to
   associate and synchronize them properly, even for sources that are
   mapped to switched captures.

   For a Mixed or Switched Mixer topology, receivers will see only a
   single synchronization context (CNAME), corresponding to the mixer.
   For a Source-Projecting Mixer, separate projecting sources keep
   separate synchronization contexts based on their original CNAMEs,
   thus allowing independent synchronization of sources from independent
   rooms without needing global synchronization.  In hybrid cases,
   however (e.g. if audio is mixed), all sources which need to be
   synchronized with the mixed audio must get the same CNAME (and thus a
   mixer-provided timebase) as the mixed audio.
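
   As a minimal sketch of the receiver side, sources can be grouped by
   the CNAME learned from RTCP SDES, with each group forming one
   synchronization context.  The function name and the sample SSRC and
   CNAME values are assumptions made for this example.

```python
from collections import defaultdict

def sync_groups(sources):
    """Group (ssrc, cname) pairs into synchronization contexts by CNAME."""
    groups = defaultdict(list)
    for ssrc, cname in sources:
        groups[cname].append(ssrc)
    return dict(groups)

# Two projected sources from one room, one from another (hypothetical).
contexts = sync_groups([
    (0x1001, "room-a@example.invalid"),
    (0x1002, "room-a@example.invalid"),
    (0x2001, "room-b@example.invalid"),
])
```

   Sources in the same group can be synchronized with each other;
   sources in different groups are played out independently.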

6.  Examples

   It is possible for a CLUE device to send multiple instances of the
   topologies in Section 3 simultaneously.  For example, an MCU which
   uses a traditional audio bridge with switched video would be a Mixer
   topology for audio, but a Switched Mixer or a Source-Projecting Mixer
   for video.  In the latter case, the audio could be sent as a static
   source, whereas the video could be dynamic.

   More notably, it is possible for an endpoint to send the same sources
   both for static and dynamic captures.  Consider the example in
   Section 11.1 of [I-D.ietf-clue-framework], where an endpoint can
   provide both three cameras (VC0, VC1, and VC2) for left, center, and
   right views, as well as a switched view (VC3) of the loudest panel.

   It is possible for a consumer to request both the (VC0 - VC2) set and
   VC3.  It is worth noting that the content of VC3 is, at all times,
   exactly the content of one of VC0, VC1, or VC2.  Thus, if the sender
   uses the Source-Projecting Mixer topology for VC3, a consumer that
   already receives (VC0 - VC2) requires no additional media traffic
   for VC3 beyond what is sent for (VC0 - VC2).

   In this case, the advertiser could describe VC0, VC1, and VC2 in its
   initial advertisement or SDP with static SSRCs, whereas VC3 would
   need to be dynamic.  The role of VC3 would move among VC0, VC1, or
   VC2, indicated by the RTP header extension on those streams' RTP
   packets.
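
   One way a receiver might follow the moving role of VC3 is sketched
   below.  This is only an illustration under stated assumptions: the
   class and method names are hypothetical, capture IDs are represented
   as plain strings, and the most recent packet that signals a capture
   ID is assumed to take over that role.

```python
class SwitchedCaptureTracker:
    """Tracks which SSRC currently fills each dynamic capture role."""

    def __init__(self):
        self.current = {}  # capture ID -> SSRC currently carrying it

    def on_packet(self, ssrc, capture_ids):
        """Record the capture IDs signalled on a packet from `ssrc`.

        Returns the capture IDs whose source changed with this packet.
        """
        changed = []
        for cid in capture_ids:
            if self.current.get(cid) != ssrc:
                self.current[cid] = ssrc
                changed.append(cid)
        return changed


# Hypothetical SSRCs: 11111 carries VC0 and 22222 carries VC1.
tracker = SwitchedCaptureTracker()
first = tracker.on_packet(22222, ["VC3"])   # VC1 becomes the loudest panel
repeat = tracker.on_packet(22222, ["VC3"])  # no change
moved = tracker.on_packet(11111, ["VC3"])   # role moves to VC0
```

   No extra media is needed for VC3 in this model; the receiver renders
   whichever static stream currently holds the role.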

6.1.  Static mapping

   Consider the video capture example from the framework document
   [I-D.ietf-clue-framework] for a three-camera system with four
   monitors, one of which is used for the presentation stream:

   o  VC0- (the camera-left camera stream), purpose=main, switched:no

   o  VC1- (the center camera stream), purpose=main, switched:no

   o  VC2- (the camera-right camera stream), purpose=main, switched:no

   o  VC3- (the loudest panel stream), purpose=main, switched:yes

   o  VC4- (the loudest panel stream with PiPs), purpose=main,
      composed=true; switched:yes

   o  VC5- (the zoomed out view of all people in the room),
      purpose=main, composed=no; switched:no

   o  VC6- (presentation stream), purpose=presentation, switched:no

   Where the physical simultaneity information is:

      {VC0, VC1, VC2, VC3, VC4, VC6}

      {VC0, VC2, VC5, VC6}

   In this case the provider can send up to six simultaneous streams
   and receive four, one for each monitor.  This is the maximum; the
   capture scene entries may limit it further, for example by proposing
   only three camera streams and one presentation stream.  Still,
   because the consumer can select any media captures that can be sent
   simultaneously, the offer will specify six streams.  VC1 and VC5 use
   the same resource and are mutually exclusive.
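
   The simultaneity constraint can be checked mechanically: a requested
   set of captures is sendable only if it fits inside at least one
   simultaneity set.  The following is a minimal sketch using the two
   sets listed above; the function name is hypothetical.

```python
def can_send_together(requested, simultaneity_sets):
    """True if every requested capture lies in one simultaneity set."""
    wanted = set(requested)
    return any(wanted <= allowed for allowed in simultaneity_sets)


# The two simultaneity sets from the example.
SIMULTANEITY_SETS = [
    {"VC0", "VC1", "VC2", "VC3", "VC4", "VC6"},
    {"VC0", "VC2", "VC5", "VC6"},
]

three_cameras_and_slides = can_send_together(
    ["VC0", "VC1", "VC2", "VC6"], SIMULTANEITY_SETS)
center_and_zoomed_out = can_send_together(
    ["VC1", "VC5"], SIMULTANEITY_SETS)
max_streams = max(len(s) for s in SIMULTANEITY_SETS)
```

   The request for VC1 together with VC5 fails because no single set
   contains both, which reflects their sharing of one resource.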

   In the Advertisement there may be two capture scenes:

   The first capture scene may have four entries:

      {VC0, VC1, VC2}

      {VC3}

      {VC4}

      {VC5}

   The second capture scene will have the following single entry:

      {VC6}

   We assume that an intermediary that wants to make better decisions
   about handling specific RTP streams, for example based on their
   being part of the same capture scene, will need to look at the CLUE
   messages.  The SDP therefore does not group streams by capture
   scene.

   The SIP offer may be as follows.  The extmap line is included to
   support dynamic mapping:

      m=video 49200 RTP/AVP 99
      a=extmap:1 urn:ietf:params:rtp-hdrex:clue-capture-id
      a=rtpmap:99 H264/90000
      a=max-send-ssrc:{*:6}
      a=max-recv-ssrc:{*:4}
      a=ssrc:11111 CaptureID:1
      a=ssrc:22222 CaptureID:2
      a=ssrc:33333 CaptureID:3
      a=ssrc:44444 CaptureID:4
      a=ssrc:55555 CaptureID:5
      a=ssrc:66666 CaptureID:6

   In the above example the provider can send up to five main streams
   and one presentation stream.

   We define a new Media Capture ID attribute, CaptureID, which maps
   the related RTP stream (SSRC) to a media capture.

   Note that VC1 and VC5 have the same CaptureID, and therefore the
   same SSRC, since they use the same resource.  The captures are then
   described as:

   o  VC0- (the camera-left camera stream), purpose=main, switched:no,
      CaptureID =1

   o  VC1- (the center camera stream), purpose=main, switched:no,
      CaptureID =2

   o  VC2- (the camera-right camera stream), purpose=main, switched:no,
      CaptureID =3

   o  VC3- (the loudest panel stream), purpose=main, switched:yes,
      CaptureID =4

   o  VC4- (the loudest panel stream with PiPs), purpose=main,
      composed=true; switched:yes, CaptureID =5

   o  VC5- (the zoomed out view of all people in the room),
      purpose=main, composed=no; switched:no, CaptureID =2

   o  VC6- (presentation stream), purpose=presentation, switched:no,
      CaptureID =6
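
   To make the indirection concrete, the sketch below reads the a=ssrc
   lines of the example offer and resolves each media capture to an
   SSRC through its CaptureID.  It is an illustration only: the regular
   expression matches just the attribute form shown in this example,
   and the function and variable names are hypothetical.

```python
import re

# Matches only the "a=ssrc:<ssrc> CaptureID:<id>" form used in the example.
SSRC_CAPTURE_RE = re.compile(r"^a=ssrc:(\d+) CaptureID:(\d+)$")


def parse_capture_ids(sdp_text):
    """Return a dict mapping CaptureID -> SSRC from the a=ssrc lines."""
    mapping = {}
    for line in sdp_text.splitlines():
        match = SSRC_CAPTURE_RE.match(line.strip())
        if match:
            mapping[int(match.group(2))] = int(match.group(1))
    return mapping


OFFER = """\
m=video 49200 RTP/AVP 99
a=extmap:1 urn:ietf:params:rtp-hdrex:clue-capture-id
a=rtpmap:99 H264/90000
a=max-send-ssrc:{*:6}
a=max-recv-ssrc:{*:4}
a=ssrc:11111 CaptureID:1
a=ssrc:22222 CaptureID:2
a=ssrc:33333 CaptureID:3
a=ssrc:44444 CaptureID:4
a=ssrc:55555 CaptureID:5
a=ssrc:66666 CaptureID:6
"""

# CaptureID assignments taken from the capture list above.
CAPTURE_IDS = {"VC0": 1, "VC1": 2, "VC2": 3, "VC3": 4,
               "VC4": 5, "VC5": 2, "VC6": 6}

ssrc_by_capture_id = parse_capture_ids(OFFER)
ssrc_by_capture = {vc: ssrc_by_capture_id[cid]
                   for vc, cid in CAPTURE_IDS.items()}
```

   Seven captures resolve to six SSRCs, because VC1 and VC5 share
   CaptureID 2.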

   Note: Alternatively, an SSRC can be allocated for each media capture
   (MC), which avoids the indirection of using a CaptureID.  In that
   case, a switch to dynamic mapping requires providing information
   about which SSRC is being replaced by the new one.

6.2.  Dynamic Mapping

   For topologies that use dynamic mapping, there is no need to provide
   the SSRCs in the offer.  They may not be available if the offers
   from the sources do not include them when connecting to the mixer or
   remote endpoint.  In this case the CaptureID (srcname) will be
   specified first in the advertisement.

   The SIP offer may be:

      m=video 49200 RTP/AVP 99
      a=extmap:1 urn:ietf:params:rtp-hdrex:clue-capture-id
      a=rtpmap:99 H264/90000
      a=max-send-ssrc:{*:4}
      a=max-recv-ssrc:{*:4}
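
   With dynamic mapping, the capture ID travels in the RTP header
   extension negotiated by the extmap line.  The sketch below frames
   such an element using the one-byte header form of [RFC5285] with the
   local identifier 1 from the offer.  The payload encoding is an
   assumption made purely for illustration (a short ASCII string); this
   section does not define it, and the function names are hypothetical.

```python
import struct

ONE_BYTE_PROFILE = 0xBEDE  # "defined by profile" value for one-byte headers


def build_one_byte_extension(elements):
    """Build an RFC 5285 one-byte-header extension block.

    `elements` is a list of (local_id, data) pairs; ids are 1..14 and
    each data field is 1..16 bytes long.
    """
    body = bytearray()
    for ext_id, data in elements:
        if not 1 <= ext_id <= 14:
            raise ValueError("local identifier must be in 1..14")
        if not 1 <= len(data) <= 16:
            raise ValueError("element data must be 1..16 bytes")
        body.append((ext_id << 4) | (len(data) - 1))
        body += data
    body += b"\x00" * (-len(body) % 4)  # pad to a 32-bit boundary
    return struct.pack("!HH", ONE_BYTE_PROFILE, len(body) // 4) + bytes(body)


def parse_one_byte_extension(block):
    """Return the (local_id, data) pairs found in an extension block."""
    profile, words = struct.unpack("!HH", block[:4])
    if profile != ONE_BYTE_PROFILE:
        raise ValueError("not a one-byte-header extension")
    body = block[4:4 + 4 * words]
    elements, i = [], 0
    while i < len(body):
        if body[i] == 0:  # padding byte
            i += 1
            continue
        ext_id = body[i] >> 4
        if ext_id == 15:  # reserved identifier: stop processing
            break
        length = (body[i] & 0x0F) + 1
        elements.append((ext_id, bytes(body[i + 1:i + 1 + length])))
        i += 1 + length
    return elements


# Assumed payload: the capture ID "VC3" as ASCII, under extmap id 1.
block = build_one_byte_extension([(1, b"VC3")])
decoded = parse_one_byte_extension(block)
```

   A receiver would look for the element with local identifier 1 and
   associate the packet's SSRC with the capture ID it carries.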

   This will work for SSRC multiplexing.  It is not clear how it will
   work when RTP streams of the same media are not multiplexed in a
   single RTP session; in particular, how a receiver would know which
   encoding will be in which of the different RTP sessions.

7.  Acknowledgements

   The authors would like to thank Allyn Romanow and Paul Witty for
   contributing text to this work.

8.  IANA Considerations

   TBD

9.  Security Considerations

   TBD.

10.  References

10.1.  Normative References

   [I-D.ietf-clue-framework]
              Romanow, A., Duckworth, M., Pepperell, A., and B. Baldino,
              "Framework for Telepresence Multi-Streams",
              draft-ietf-clue-framework-06 (work in progress),
              July 2012.

   [I-D.lennox-clue-rtp-usage]
              Lennox, J., Witty, P., and A. Romanow, "Real-Time
              Transport Protocol (RTP) Usage for Telepresence Sessions",
              draft-lennox-clue-rtp-usage-04 (work in progress),
              June 2012.

   [I-D.westerlund-avtcore-max-ssrc]
              Westerlund, M., Burman, B., and F. Jansson, "Multiple
              Synchronization sources (SSRC) in RTP Session Signaling",
              draft-westerlund-avtcore-max-ssrc-02 (work in progress),
              July 2012.

   [I-D.westerlund-avtext-rtcp-sdes-srcname]
              Westerlund, M., Burman, B., and P. Sandgren, "RTCP SDES
              Item SRCNAME to Label Individual Sources",
              draft-westerlund-avtext-rtcp-sdes-srcname-01 (work in
              progress), July 2012.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

10.2.  Informative References

   [I-D.ietf-clue-telepresence-use-cases]
              Romanow, A., Botzko, S., Duckworth, M., Even, R., and I.
              Communications, "Use Cases for Telepresence Multi-
              streams", draft-ietf-clue-telepresence-use-cases-04 (work
              in progress), August 2012.

   [I-D.lennox-mmusic-sdp-source-selection]
              Lennox, J. and H. Schulzrinne, "Mechanisms for Media
              Source Selection in the Session Description Protocol
              (SDP)", draft-lennox-mmusic-sdp-source-selection-04 (work
              in progress), March 2012.

   [I-D.westerlund-avtcore-rtp-simulcast]
              Westerlund, M., Burman, B., Lindqvist, M., and F. Jansson,
              "Using Simulcast in RTP sessions",
              draft-westerlund-avtcore-rtp-simulcast-01 (work in
              progress), July 2012.

   [I-D.westerlund-avtcore-rtp-topologies-update]
              Westerlund, M. and S. Wenger, "RTP Topologies",
              draft-westerlund-avtcore-rtp-topologies-update-01 (work in
              progress), October 2012.

   [I-D.westerlund-avtext-codec-operation-point]
              Westerlund, M., Burman, B., and L. Hamm, "Codec Operation
              Point RTCP Extension",
              draft-westerlund-avtext-codec-operation-point-00 (work in
              progress), March 2012.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC4575]  Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session
              Initiation Protocol (SIP) Event Package for Conference
              State", RFC 4575, August 2006.

   [RFC4796]  Hautakorpi, J. and G. Camarillo, "The Session Description
              Protocol (SDP) Content Attribute", RFC 4796,
              February 2007.

   [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
              "Codec Control Messages in the RTP Audio-Visual Profile
              with Feedback (AVPF)", RFC 5104, February 2008.

   [RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
              January 2008.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, July 2008.

   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
              Media Attributes in the Session Description Protocol
              (SDP)", RFC 5576, June 2009.

   [RFC6236]  Johansson, I. and K. Jung, "Negotiation of Generic Image
              Attributes in the Session Description Protocol (SDP)",
              RFC 6236, May 2011.

Authors' Addresses

   Roni Even
   Huawei Technologies
   Tel Aviv,
   Israel

   Email: roni.even@mail01.huawei.com

   Jonathan Lennox
   Vidyo, Inc.
   433 Hackensack Avenue
   Seventh Floor
   Hackensack, NJ  07601
   US

   Email: jonathan@vidyo.com
