A Taxonomy of Grouping Semantics and Mechanisms for Real-Time Transport Protocol (RTP) Sources
draft-lennox-raiarea-rtp-grouping-taxonomy-01

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft whose latest revision state is "Replaced".
Authors Jonathan Lennox , Kevin Gross , Suhas Nandakumar , Gonzalo Salgueiro , Bo Burman
Last updated 2013-07-15
Replaced by draft-ietf-avtext-rtp-grouping-taxonomy, RFC 7656
Network Working Group                                          J. Lennox
Internet-Draft                                                     Vidyo
Intended status: Informational                                  K. Gross
Expires: January 16, 2014                                            AVA
                                                           S. Nandakumar
                                                            G. Salgueiro
                                                           Cisco Systems
                                                               B. Burman
                                                                Ericsson
                                                           July 15, 2013

A Taxonomy of Grouping Semantics and Mechanisms for Real-Time Transport
                         Protocol (RTP) Sources
             draft-lennox-raiarea-rtp-grouping-taxonomy-01

Abstract

   The terminology about, and associations among, Real-Time Transport
   Protocol (RTP) sources can be complex and somewhat opaque.  This
   document describes a number of existing and proposed relationships
   among RTP sources, and attempts to define common terminology for
   discussing protocol entities and their relationships.

   This document is still very rough, but is submitted in the hopes of
   making future discussion productive.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 16, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

Lennox, et al.          Expires January 16, 2014                [Page 1]
Internet-Draft            RTP Grouping Taxonomy                July 2013

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Concepts
     2.1.  End Point
       2.1.1.  Alternate Usages
       2.1.2.  Characteristics
     2.2.  Capture Device
       2.2.1.  Alternate Usages
       2.2.2.  Characteristics
     2.3.  Media Source
       2.3.1.  Alternate Usages
       2.3.2.  Characteristics
     2.4.  Media Stream
       2.4.1.  Alternate Usages
       2.4.2.  Characteristics
     2.5.  Media Provider
       2.5.1.  Alternate Usages
       2.5.2.  Characteristics
     2.6.  RTP Session
       2.6.1.  Alternate Usages
       2.6.2.  Characteristics
     2.7.  Media Transport
       2.7.1.  Characteristics
     2.8.  Rendering Device
       2.8.1.  Characteristics
     2.9.  Media Renderer
       2.9.1.  Alternate Usages
       2.9.2.  Characteristics
     2.10. Participant
       2.10.1.  Characteristics
     2.11. Multimedia Session
       2.11.1.  Alternate Usages
       2.11.2.  Characteristics
     2.12. Communication Session
       2.12.1.  Alternate Usages
       2.12.2.  Characteristics
   3.  Relationships
     3.1.  Synchronization Context
       3.1.1.  RTCP CNAME
       3.1.2.  Clock Source Signaling
       3.1.3.  CLUE Scenes
       3.1.4.  Implicitly via RtcMediaStream
       3.1.5.  Explicitly via SDP Mechanisms
     3.2.  Containment Context
       3.2.1.  Media Stream Multiplexing
       3.2.2.  RTP Session Multiplexing
       3.2.3.  Multiple Media Sources in a WebRTC PeerConnection
     3.3.  Equivalence Context
       3.3.1.  Simulcast
       3.3.2.  Layered MultiStream Transmission
       3.3.3.  Robustness and Repair
       3.3.4.  SDP FID Semantics
     3.4.  Session Context
       3.4.1.  Point-to-Point Session
       3.4.2.  Full Mesh Session
       3.4.3.  Centralized Conference Session
   4.  Security Considerations
   5.  Acknowledgement
   6.  Open Issues
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Changes From Earlier Versions
     A.1.  Changes From Draft -00
   Authors' Addresses

1.  Introduction

   The existing taxonomy of sources in RTP is often regarded as
   confusing and inconsistent.  Consequently, a deep understanding of
   how the different terms relate to each other becomes a real
   challenge.  Frequently cited examples of this confusion are (1) how
   different protocols that make use of RTP use the same terms to
   signify different things and (2) how the complexities addressed at
   one layer are often glossed over or ignored at another.

   This document attempts to provide some clarity by reviewing the
   semantics of various aspects of sources in RTP.  As an organizing
   mechanism, it approaches this by describing various ways that RTP
   sources can be grouped and associated together.

2.  Concepts

   This section defines concepts that serve to identify various
   components in a given RTP usage.  For each concept, an attempt is
   made to list any alternate definitions and usages that co-exist today
   along with various characteristics that further describe the
   concept.

   All references to ControLling mUltiple streams for tElepresence
   (CLUE) in this document map to [I-D.ietf-clue-framework] and all
   references to Web Real-Time Communications (WebRTC) map to
   [I-D.ietf-rtcweb-overview].

2.1.  End Point

   A single entity sending or receiving RTP packets.  It may be
   decomposed into several functional blocks, but as long as it behaves
   as a single RTP stack entity it is classified as a single "End
   Point".

2.1.1.  Alternate Usages

   The CLUE Working Group (WG) uses the terms "Media Provider" and
   "Media Consumer" to describe the aspects of an End Point that pertain
   to its sending and receiving functionality, respectively.

2.1.2.  Characteristics

   End Points can be identified in several different ways.  While RTCP
   Canonical Names (CNAMEs) [RFC3550] provide a globally unique and
   stable identification mechanism for the duration of the Communication
   Session (see Section 2.12), their validity applies exclusively within
   a synchronization context.  Therefore, a mechanism outside the scope
   of RTP, such as an application-defined mechanism, must be relied upon
   to identify an End Point outside this synchronization context.

2.2.  Capture Device

   The physical source of a stream of media data of one type, such as a
   camera or microphone.

2.2.1.  Alternate Usages

   The CLUE WG uses the term "Capture Device" to identify a physical
   capture device.

   The WebRTC WG uses the term "Recording Device" to refer to the
   locally available capture devices in an end-system.

2.2.2.  Characteristics

   o  A Capture Device is identified either by hardware/manufacturer ID
      or via a session-scoped device identifier as mandated by the
      application usage.

   o  A Capture Device always corresponds to a Media Source (see
      Section 2.3 for a definition of this term), but the converse is
      not always true.  For example, the output of a media production
      function (e.g., an audio mixer) or of a video editing function is
      a Media Source that can represent data from several Media Sources
      rather than from a single Capture Device.

2.3.  Media Source

   A Media Source logically defines the source of a raw stream of media
   data as generated either by a single capture device or by a
   conceptual source.  A Media Source represents an Audio Source or a
   Video Source.

2.3.1.  Alternate Usages

   The CLUE WG uses the term "Media Capture" for this purpose.  A CLUE
   Media Capture is identified via indexed notation.  The terms Audio
   Capture and Video Capture are used to identify Audio Sources and
   Video Sources respectively.  Concepts such as "Capture Scene",
   "Capture Scene Entry" and "Capture" provide a flexible framework to
   represent media captured spanning spatial regions.

   The WebRTC WG defines the term "RtcMediaStreamTrack" to refer to a
   Media Source.  An "RtcMediaStreamTrack" is identified by the ID
   attribute on it.

   Typically a Media Source is mapped to a single m=line via the Session
   Description Protocol (SDP) [RFC4566] unless mechanisms such as
   Source-Specific attributes are in place [RFC5576].  In the latter
   cases, an m=line can represent either multiple Media Sources or
   multiple Media Streams (See Section 2.4 for a definition of this
   term).

2.3.2.  Characteristics

   o  A Media Source represents a real-time source of a raw stream of
      audio or video media data.

   o  At any point, it can represent a physical capture source or a
      conceptual source.

   o  Typically raw media from a Media Source is compressed via the
      application of an appropriate encoding mechanism, thus creating an
      RTP payload for Media Streams (See Section 2.4 for a definition of
      this term).

   o  Multiple transformations can be applied to the data from a Media
      Source, thus creating several Media Streams.

   o  Some notable transformations are described in Section 3.3.

2.4.  Media Stream

   Media from a Media Source is encoded and packetized to produce one or
   more Media Streams representing a sequence of RTP packets.

2.4.1.  Alternate Usages

   The term "Stream" is used by the CLUE WG to define an encoded Media
   Source sent via RTP.  "Capture Encoding" and "Encoding Groups" are
   defined to capture specific details of the encoding scheme.

   RFC3550 [RFC3550] uses the term Source for this purpose.

   The equivalent mapping of Media Stream in SDP [RFC4566] is defined
   per usage.  For example, each m=line can describe one Media Stream
   and hence one Media Source, or a single m=line can describe
   properties for multiple Media Streams (via [RFC5576] mechanisms, for
   example).

2.4.2.  Characteristics

   o  Each Media Stream is identified by a unique Synchronization source
      (SSRC) [RFC3550] that is carried in every RTP and Real-time
      Transport Control Protocol (RTCP) packet header.

   o  At any given point, a Media Stream can have one and only one SSRC.

   o  Each Media Stream defines a unique RTP sequence numbering and
      timing space.

   o  Several Media Streams could potentially map to a single Media
      Source via the source transformations (See Section 3.3).

   o  Several Media Streams can be carried over a single RTP Session.
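
   The characteristics above can be sketched as a small data model.
   This is an illustrative Python sketch only: the names MediaSource,
   MediaStream, RtpSession and add_stream are assumptions made for this
   example and are not defined by this document or by RTP.  Sequence
   numbers start at zero here for readability, which a real RTP stack
   would not do.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MediaSource:
    """Hypothetical handle for a raw audio or video source."""
    name: str


@dataclass
class MediaStream:
    """One sequence of RTP packets, identified by exactly one SSRC."""
    ssrc: int              # 32-bit SSRC carried in every packet header
    source: MediaSource    # the Media Source this stream derives from
    next_seq: int = 0      # each stream has its own sequence space

    def next_sequence_number(self) -> int:
        seq = self.next_seq
        self.next_seq = (self.next_seq + 1) % 65536  # 16-bit wraparound
        return seq


@dataclass
class RtpSession:
    """Carries several Media Streams, keyed by SSRC."""
    streams: dict = field(default_factory=dict)

    def add_stream(self, stream: MediaStream) -> None:
        if stream.ssrc in self.streams:
            raise ValueError("SSRC already in use in this RTP Session")
        self.streams[stream.ssrc] = stream


camera = MediaSource("camera")
session = RtpSession()
# Two Media Streams derived from the same Media Source.
session.add_stream(MediaStream(ssrc=0x1111, source=camera))
session.add_stream(MediaStream(ssrc=0x2222, source=camera))
```

   Keying the streams by SSRC makes the "one SSRC per Media Stream"
   property explicit: adding a second stream with an SSRC already in
   use is rejected.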

2.5.  Media Provider

   A Media Provider is a logical component within the RTP Stack that is
   responsible for encoding the media data from one or more Media
   Sources to generate RTP Payload for the outbound Media Streams.

2.5.1.  Alternate Usages

   Within the SDP usage, an m=line describes the necessary configuration
   required for encoding purposes.

   CLUE's "Capture Encoding" provides specific encoding configuration
   for this purpose.

   The WebRTC WG uses the term "RtcMediaStreamTrack" for the source of
   the media data that is encoded by the Media Provider.

2.5.2.  Characteristics

   o  A given Media Provider can encode a Media Source multiple times
      on the fly, producing several encoded representations.

2.6.  RTP Session

   An RTP session is an association among a group of participants
   communicating with RTP.  It is a group communications channel which
   can potentially carry a number of Media Streams.  Within an RTP
   session, every participant finds out meta-data and control
   information (over RTCP) about all the Media Streams in the RTP
   session.  The bandwidth of the RTCP control channel is shared within
   an RTP Session.

2.6.1.  Alternate Usages

   Within the context of SDP, a single m=line can map to a single RTP
   Session or multiple m=lines can map to a single RTP Session.  The
   latter is enabled via multiplexing schemes such as BUNDLE
   [I-D.ietf-mmusic-sdp-bundle-negotiation], which allows mapping of
   multiple m=lines to a single RTP Session.

2.6.2.  Characteristics

   o  Typically an RTP Session can carry one or more Media Streams;
      carrying more than one is also termed "SSRC Multiplexing".

   o  Each RTP Session is carried by a single underlying Media Transport
      unless multiple RTP sessions are multiplexed over a single
      Transport Flow.  Such a scheme is alternatively called "Session
      Multiplexing" in the RTP context
      [I-D.westerlund-avtcore-transport-multiplexing].

   o  An RTP Session shares a single SSRC space as defined in RFC3550
      [RFC3550].  That is, the End Points in an RTP Session can see an
      SSRC identifier transmitted by any of the other End Points.  An
      End Point can receive an SSRC either as an SSRC or as a
      Contributing source (CSRC) in RTP and RTCP packets, as determined
      by the End Points' network interconnection topology.

   o  Multiple RTP Sessions can be related to one another via mechanisms
      defined in Section 3.
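
   To make the SSRC and CSRC distinction above concrete, the following
   Python sketch reads both from the RTP fixed header laid out in
   RFC3550 [RFC3550].  The function name parse_rtp_sources and the
   sample identifier values are invented for this example; header
   extensions, padding and payload are ignored.

```python
import struct


def parse_rtp_sources(packet: bytes):
    """Return (ssrc, [csrc, ...]) from an RTP packet header."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    first, _m_pt, _seq, _ts, ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    csrc_count = first & 0x0F          # CC field: number of CSRC entries
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("truncated CSRC list")
    csrcs = list(struct.unpack("!%dI" % csrc_count, packet[12:end]))
    return ssrc, csrcs


# A packet as a mixer might send it: its own SSRC plus two CSRCs that
# name the sources it combined (all values are made up).
mixed = struct.pack("!BBHII", 0x80 | 2, 96, 1, 0, 0xAAAA0001)
mixed += struct.pack("!2I", 0x11111111, 0x22222222)
ssrc, csrcs = parse_rtp_sources(mixed)
```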

2.7.  Media Transport

   A Media Transport defines an end-to-end transport association for
   carrying one or more RTP Sessions.  The combination of a network
   address and port uniquely identifies such a transport association,
   for example an IP address and a UDP port.

2.7.1.  Characteristics

   o  Media Transport transmits RTP Packets from a source transport
      address to a destination transport address.

   o  RTP may depend upon the lower-layer protocol to provide mechanisms
      such as ports to multiplex the RTP and RTCP packets of an RTP
      Session.
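
   As an illustration of the addressing described above, the following
   Python sketch models a Media Transport as a pair of transport
   addresses.  The names TransportAddress, MediaTransport and
   rtcp_address are hypothetical and exist only for this example.  The
   helper assumes the RFC3550 [RFC3550] convention of an even RTP port
   with RTCP on the next higher port; other arrangements are possible.

```python
from typing import NamedTuple


class TransportAddress(NamedTuple):
    """Network address plus port, e.g. an IP address and a UDP port."""
    host: str
    port: int


class MediaTransport(NamedTuple):
    """End-to-end association between two transport addresses."""
    source: TransportAddress
    destination: TransportAddress


def rtcp_address(rtp: TransportAddress) -> TransportAddress:
    """Assumed convention: RTCP uses the next higher (odd) port."""
    if rtp.port % 2 != 0:
        raise ValueError("this sketch expects an even RTP port")
    return TransportAddress(rtp.host, rtp.port + 1)


transport = MediaTransport(
    source=TransportAddress("192.0.2.1", 30000),
    destination=TransportAddress("198.51.100.7", 40000),
)
rtcp_destination = rtcp_address(transport.destination)
```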

2.8.  Rendering Device

   Represents a physical rendering device such as a display or speaker.

2.8.1.  Characteristics

   o  An End Point can potentially have multiple rendering devices of
      each type.

   o  Incoming Media Streams are decoded by one or more Media Renderers
      to provide a representation suitable for rendering the media data
      over one or more Rendering Devices, as defined by the application
      usage or system-wide configuration.

2.9.  Media Renderer

   A Media Renderer is a logical component within the RTP Stack that is
   responsible for decoding the RTP Payload within the incoming Media
   Streams to generate media data suitable for eventual rendering.

2.9.1.  Alternate Usages

   Within the context of SDP, an m=line describes the necessary
   configuration required to decode one or more incoming Media Streams.

   The WebRTC WG uses the term "RtcMediaStreamTrack" to qualify the
   media data decoded via the Media Renderer corresponding to the
   incoming Media Stream.

2.9.2.  Characteristics

   o  The output from the Media Renderer is usually rendered to a
      Rendering Device via appropriate mechanisms as explained in
      Section 2.8.

   o  Incoming Media Streams decoded by the Media Renderer are typically
      identified via the SSRC.

2.10.  Participant

   A participant is an entity reachable by a single signaling address,
   and is thus related more to the signaling context than to the media
   context.

2.10.1.  Characteristics

   o  A single signaling-addressable entity, using an application-
      specific signaling address space, for example a SIP URI.

   o  A participant can have several associated transport flows,
      including several separate local transport addresses for those
      transport flows.

   o  A participant can have several multimedia sessions.

2.11.  Multimedia Session

   A multimedia session is an association among a group of participants
   engaged in the conversation via one or more RTP Sessions.  It defines
   logical relationships among Media Sources that appear in multiple RTP
   Sessions.

2.11.1.  Alternate Usages

   RFC4566 [RFC4566] defines a multimedia session as a set of multimedia
   senders and receivers and the data streams flowing from senders to
   receivers.

   RFC3550 [RFC3550] defines it as a set of concurrent RTP sessions
   among a common group of participants.  For example, a videoconference
   (which is a multimedia session) may contain an audio RTP session and
   a video RTP session.

2.11.2.  Characteristics

   o  Participants in RTP multimedia sessions are identified via
      mechanisms such as RTCP CNAME or other application level
      identifiers as appropriate.

   o  A multimedia session can be composed of several parallel RTP
      Sessions with potentially multiple Media Streams per RTP Session.

   o  Each participant in a multimedia session can have a multitude of
      Media Captures and Media Rendering devices.

2.12.  Communication Session

   A communication session is an association among a group of
   participants communicating with each other via a set of multimedia
   sessions.

2.12.1.  Alternate Usages

   The Session Description Protocol RFC4566 [RFC4566] defines a
   multimedia session as a set of multimedia senders and receivers and
   the data streams flowing from senders to receivers.  In that
   definition it is however not clear if a multimedia session includes
   both the sender's and the receiver's view of the same RTP Stream.

2.12.2.  Characteristics

   o  Each participant in a Communication Session is identified via an
      application-specific signaling address.

   o  A Communication Session is composed of at least one multimedia
      session per participant, involving one or more parallel RTP
      Sessions with potentially multiple Media Streams per RTP Session.

   For example, in a full mesh communication, the Communication Session
   consists of a set of separate Multimedia Sessions between each pair
   of Participants.  Another example is a centralized conference, where
   the Communication Session consists of a set of Multimedia Sessions
   between each Participant and the conference handler.
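
   The two examples above differ in how many Multimedia Sessions make up
   the Communication Session.  The Python sketch below enumerates them
   under the reading given in the text; the function names and the
   participant labels are hypothetical.

```python
from itertools import combinations


def full_mesh_sessions(participants):
    """One Multimedia Session per unordered pair of Participants."""
    return list(combinations(participants, 2))


def centralized_sessions(participants, handler="conference"):
    """One Multimedia Session between each Participant and the handler."""
    return [(p, handler) for p in participants]


participants = ["alice", "bob", "carol", "dave"]
mesh = full_mesh_sessions(participants)
central = centralized_sessions(participants)
print(len(mesh), len(central))  # prints "6 4"
```

   With n Participants, the full mesh yields n(n-1)/2 Multimedia
   Sessions, while the centralized conference yields n.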

3.  Relationships

   This section provides various relationships that can co-exist between
   the aforementioned concepts in a given RTP usage.  Using Unified
   Modeling Language (UML) class diagrams [UML], Figure 1 below depicts
   general relations between a Media Source, its Media Provider(s) and
   the resulting Media Stream(s).

      Note: The RTCP Stream related to the RTP Stream is not shown in
      the figure.

          +--------------+  <<uses>>  +-------------------------+
          | Media Source |- - - - - ->| Synchronization Context |
          +--------------+            +-------------------------+
                < > 1..*
                 |
                 | 0..*
          +--------------+
          |              |<>-+ 0..*
          |    Media     |   |
          |   Provider   |   |
          |              |---+ 0..*
          +--------------+
                < > 1
                 |
                 | 0..*
          +----------------+ 0..*     1 +-------------+
          |  Media Stream  |----------<>| RTP Session |
          +----------------+            +-------------+

                     Figure 1: Media Source Relations

   Media Sources can have a large variety of relationships among them.
   These relationships can apply both between Media Sources within a
   single RTP Session, and between Media Sources that occur in multiple
   RTP Sessions.  Ways of relating them typically involve groups: a set
   of Media Sources has some relationship that applies to all those in
   the group, and no others.  (Relationships that involve arbitrary
   non-grouping associations among Media Sources, such that, e.g., A
   relates to B and B to C, but A and C are unrelated, are uncommon if
   not nonexistent.)  In many cases, the semantics of groups are not
   simply that the members form an undifferentiated group, but rather
   that members of the group have certain roles.

3.1.  Synchronization Context

   A synchronization context defines a requirement for a strong timing
   relationship between the related entities, typically requiring
   alignment of clock sources.  Such a relationship can be identified in
   multiple ways, as listed below.  A single Media Source can only
   belong to a single Synchronization Context, since it is assumed that
   a single Media Source can only have a single media clock, and
   requiring alignment to several Synchronization Contexts will
   effectively merge those into a single Synchronization Context.
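
   The merging behaviour described above resembles a disjoint-set
   structure: requiring two Media Sources to be aligned joins whatever
   contexts they were in.  The Python sketch below is one possible way
   to model that; the class SynchronizationContexts and its methods are
   hypothetical and not part of any RTP specification.

```python
class SynchronizationContexts:
    """Tracks which Media Sources share a Synchronization Context."""

    def __init__(self):
        self._parent = {}

    def _find(self, source):
        # A source seen for the first time starts in its own context.
        self._parent.setdefault(source, source)
        while self._parent[source] != source:
            # Path halving keeps later lookups short.
            self._parent[source] = self._parent[self._parent[source]]
            source = self._parent[source]
        return source

    def align(self, a, b):
        """Require a and b to be synchronized, merging their contexts."""
        self._parent[self._find(a)] = self._find(b)

    def same_context(self, a, b):
        return self._find(a) == self._find(b)


contexts = SynchronizationContexts()
contexts.align("microphone", "camera")        # communication media
contexts.align("file-audio", "file-video")    # presentation media
separate_before = contexts.same_context("microphone", "file-video")
# Aligning one source with the other group merges both contexts.
contexts.align("camera", "file-audio")
merged_after = contexts.same_context("microphone", "file-video")
```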

   A single Multimedia Session can contain media from one or more
   Synchronization Contexts.  An example of that is a Multimedia Session
   containing one set of audio and video for communication purposes
   belonging to one Synchronization Context, and another set of audio
   and video for presentation purposes (like playing a video file) that
   has no strong timing relationship and need not be strictly
   synchronized with the audio and video used for communication.

3.1.1.  RTCP CNAME

   RFC3550 [RFC3550] describes Inter-media synchronization between RTP
   Sessions based on RTCP CNAME, RTP and Network Time Protocol (NTP)
   [RFC5905] timestamps.
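
   As a rough illustration of the timestamp arithmetic involved: an RTCP
   Sender Report carries an NTP timestamp together with the RTP
   timestamp for the same instant, which lets a receiver place RTP
   timestamps from different sources with the same CNAME on one
   timeline.  The Python sketch below assumes the NTP timestamp has
   already been converted to seconds as a float; the function and
   parameter names are hypothetical.

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_ntp_seconds, clock_rate):
    """Map an RTP timestamp to sender wallclock time in seconds.

    sr_rtp_ts and sr_ntp_seconds come from the same RTCP Sender Report
    and denote the same instant on the sender's clocks.
    """
    # Signed difference modulo 2**32, so that 32-bit RTP timestamp
    # wraparound between the report and the packet is handled.
    delta = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return sr_ntp_seconds + delta / clock_rate


# Audio at 8000 Hz and video at 90000 Hz; all numbers are made up.
audio_time = rtp_to_wallclock(168000, 160000, 1000.0, 8000)
video_time = rtp_to_wallclock(945000, 900000, 1000.5, 90000)
```

   In this example both packets map to the same wallclock instant, so a
   receiver would render them together.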

3.1.2.  Clock Source Signaling

   [I-D.ietf-avtcore-clksrc] provides a mechanism to signal the clock
   source in SDP, thus allowing a synchronized context to be defined.

3.1.3.  CLUE Scenes

   In CLUE "Capture Scene", "Capture Scene Entry" and "Captures" define
   an implied synchronization context.

3.1.4.  Implicitly via RtcMediaStream

   The WebRTC WG defines "RtcMediaStream" with one or more
   "RtcMediaStreamTracks".  All tracks in an "RtcMediaStream" are
   intended to be synchronized when rendered.

3.1.5.  Explicitly via SDP Mechanisms

   RFC5888 [RFC5888] defines an m=line grouping mechanism called "Lip
   Synchronization (LS)" for establishing the synchronization
   requirement across m=lines when they map to individual sources.

   RFC5576 [RFC5576] extends the above mechanism when multiple media
   sources are described by a single m=line.
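
   For the m=line case, the grouping is expressed with "a=group:LS" at
   the session level and "a=mid" on each m=line [RFC5888].  The Python
   sketch below pulls LS groups out of a session description; the SDP
   text is an invented example, lip_sync_groups is a hypothetical
   helper, and the parsing is deliberately minimal.

```python
SDP = """\
v=0
o=- 289083124 289083124 IN IP4 host.example.com
s=-
c=IN IP4 192.0.2.1
t=0 0
a=group:LS 1 2
m=audio 30000 RTP/AVP 0
a=mid:1
m=video 30002 RTP/AVP 31
a=mid:2
m=audio 30004 RTP/AVP 0
a=mid:3
"""


def lip_sync_groups(sdp):
    """Return, per a=group:LS line, a list of (mid, media type)."""
    mid_to_media = {}
    groups = []
    current_media = None
    for line in sdp.splitlines():
        if line.startswith("m="):
            current_media = line[2:].split()[0]
        elif line.startswith("a=mid:") and current_media is not None:
            mid_to_media[line[len("a=mid:"):]] = current_media
        elif line.startswith("a=group:LS"):
            groups.append(line.split()[1:])   # the listed mid values
    return [[(mid, mid_to_media[mid]) for mid in g] for g in groups]


groups = lip_sync_groups(SDP)
```

   Here the first audio and the video m=line form one lip
   synchronization group, while the second audio m=line (mid 3) is left
   outside it.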

3.2.  Containment Context

   A containment relationship allows composing of multiple concepts into
   a larger concept.

3.2.1.  Media Stream Multiplexing

   Multiple Media Streams can be contained within a single RTP Session
   via a unique SSRC per Media Stream.
   [I-D.ietf-mmusic-sdp-bundle-negotiation] provides an SDP-based
   signaling mechanism to enable this across several m=lines.

   RFC5576 [RFC5576] enables the same for multiple Media Sources
   described in a single m=line.
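
   On the receiving side, containment by SSRC means that packets
   arriving on one RTP Session can be sorted back into Media Streams by
   that field alone.  The Python sketch below shows the idea; make_packet
   and demultiplex are hypothetical helpers, and the packets carry only
   a minimal RFC3550 [RFC3550] fixed header.

```python
import struct
from collections import defaultdict


def make_packet(ssrc, seq, payload=b""):
    """Build a minimal RTP packet (version 2, no CSRCs, PT 96)."""
    return struct.pack("!BBHII", 0x80, 96, seq, 0, ssrc) + payload


def demultiplex(packets):
    """Group packets received on one RTP Session by their SSRC."""
    streams = defaultdict(list)
    for pkt in packets:
        (ssrc,) = struct.unpack_from("!I", pkt, 8)  # SSRC: bytes 8..11
        streams[ssrc].append(pkt)
    return dict(streams)


received = [
    make_packet(0x1111, 1, b"audio"),
    make_packet(0x2222, 1, b"video"),
    make_packet(0x1111, 2, b"audio"),
]
streams = demultiplex(received)
```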

3.2.2.  RTP Session Multiplexing

   [I-D.westerlund-avtcore-transport-multiplexing], for example,
   describes a mechanism that allows several RTP Sessions to be carried
   over a single underlying Media Transport.

3.2.3.  Multiple Media Sources in a WebRTC PeerConnection

   The WebRTC WG defines a containment object named "RTCPeerConnection"
   that can potentially contain several Media Sources mapped to a single
   RTP Session or spread across several RTP Sessions.

3.3.  Equivalence Context

   In this relationship, different instances of a concept are treated as
   equivalent for the purposes of relating them to the Media Source.

   Figure 2 below depicts in UML notation the general relation between a
   Media Provider and its Media Stream(s), including the Media Stream
   specializations Primary Stream and Repair Stream.

                      +--------------+
                      |              |<>-+ 0..*
                      |    Media     |   |
                      |   Provider   |   |
                      |              |---+ 0..*
                      +--------------+
                            < > 1
                             |
                             | 0..*
                      +--------------+ 0..*  1 +-----------------+
                      | Media Stream |<>-------| Media Transport |
                      +--------------+         +-----------------+
                        /\        /\
                       +--+      +--+
                        |          |
                +-------+          +-------+
                |                          |
        +--------------+            +--------------+ 1
        |    Primary   |<>----------|    Repair    |<>-+
        |    Stream    | 1..*  0..* |    Stream    |---+
        +--------------+            +--------------+ 0..*

                     Figure 2: Media Stream Relations

   In combination with Figure 1, this relation can be used to achieve a
   set of functionalities, described below.

3.3.1.  Simulcast

   A Media Source represented as multiple independent Encodings
   constitutes a simulcast of that Media Source.  The figure below
   represents an example of a Media Source that is encoded into three
   separate simulcast streams that are in turn sent on the same
   transport flow.

                            +----------------+
                            |  Media Source  |
                            +----------------+
                            < >    < >    < >
                             |      |      |
                +------------+      |      +--------------+
                |                   |                     |
       +----------------+   +----------------+   +----------------+
       | Media Provider |   | Media Provider |   | Media Provider |
       +----------------+   +----------------+   +----------------+
              < >                  < >                   < >
               |                    |                     |
               |                    |                     |
       +----------------+   +----------------+   +----------------+
       |  Media Stream  |   |  Media Stream  |   |  Media Stream  |
       +----------------+   +----------------+   +----------------+
              < >                  < >                   < >
               |                    |                     |
               +---------------+    |    +----------------+
                               |    |    |
                          +-------------------+
                          |  Media Transport  |
                          +-------------------+

                Figure 3: Example of Media Source Simulcast

3.3.2.  Layered MultiStream Transmission

   Multi-stream transmission (MST) is a mechanism by which different
   portions of a layered encoding of a media stream are sent using
   separate Media Streams (sometimes in separate RTP sessions).  MSTs
   are useful for receiver control of layered media.

   A Media Source represented as multiple dependent Encodings
   constitutes a Media Source that has layered dependency.  The figure
   below represents an example of a Media Source that is encoded into
   three dependent layers, where two layers are sent on the same
   transport flow and the third layer is sent on a separate transport
   flow.

                            +----------------+
                            |  Media Source  |
                            +----------------+
                             < >   < >   < >
                              |     |     |
               +--------------+     |     +--------------+
               |                    |                    |
       +----------------+   +----------------+   +---------------+
       | Media Provider |<>-| Media Provider |<>-| Media Provider|
       +----------------+   +----------------+   +---------------+
              < >                  < >                  < >
               |                    |                    |
               |                    |                    |
       +----------------+   +----------------+   +----------------+
       |  Media Stream  |   |  Media Stream  |   |  Media Stream  |
       +----------------+   +----------------+   +----------------+
              < >                  < >                   < >
               |                    |                     |
               +------+      +------+                     |
                      |      |                            |
                +-----------------+              +-----------------+
                | Media Transport |              | Media Transport |
                +-----------------+              +-----------------+

           Figure 4: Example of Media Source Layered Dependency
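
   One consequence of layered dependency is that a receiver wanting a
   given layer also needs every layer it depends on, and therefore
   every Media Transport carrying those layers.  The sketch below
   illustrates this under stated assumptions: the layer names are
   hypothetical, the dependency is assumed to be a simple linear chain,
   and the two lower layers are assumed to be the ones sharing a
   transport.

```python
from typing import Dict, List, Optional

# Hypothetical layer names; each maps to the layer it depends on.
DEPENDS_ON: Dict[str, Optional[str]] = {
    "base": None,
    "enh1": "base",
    "enh2": "enh1",
}

# Assumed mapping of each layer's Media Stream to a Media Transport:
# two layers on one transport, the third on another.
TRANSPORT: Dict[str, str] = {"base": "T1", "enh1": "T1", "enh2": "T2"}


def required_layers(target: str) -> List[str]:
    """Return the target layer and everything it depends on, base first."""
    chain: List[str] = []
    layer: Optional[str] = target
    while layer is not None:
        chain.append(layer)
        layer = DEPENDS_ON[layer]
    return list(reversed(chain))


def required_transports(target: str) -> List[str]:
    """Return the transports a receiver needs in order to decode target."""
    return sorted({TRANSPORT[name] for name in required_layers(target)})


print(required_layers("enh2"), required_transports("enh2"))
```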

3.3.3.  Robustness and Repair

   A Media Source may be protected by repair streams during transport.
   Several approaches, listed below, can achieve this protection:

   o  duplication of the original Media Stream,

   o  duplication of the original Media Stream with a time offset,

   o  forward error correction (FEC) techniques, and

   o  retransmission of lost packets (either globally or selectively).

   The figure below represents an example where a Media Source is
   protected by a retransmission (RTX) flow.  In this example the
   primary Media Stream and the RTP RTX Stream share the same Media
   Transport.

                     +----------------+
                     |  Media Source  |
                     +----------------+
                            < >
                             |
                     +----------------+
                     | Media Provider |
                     +----------------+
                            < >
                             |
                     +---------------+   +-----------+
                     | Primary Media |<>-| RTX Media |
                     |    Stream     |   |  Stream   |
                     +---------------+   +-----------+
                            < >               < >
                             |                 |
                             +------+   +------+
                                    |   |
                             +-----------------+
                             | Media Transport |
                             +-----------------+

          Figure 5: Example of Media Source Retransmission Flows

   The figure below represents an example where two Media Sources are
   protected by individual FEC flows as well as one additional FEC flow
   that protects the set of both Media Sources (a FEC group).  There are
   several possible ways to map those Media Streams to one or more Media
   Transports, but the mapping is omitted from the figure for clarity.

     +----------+                                         +----------+
     |   Media  |                                         |  Media   |
     |  Source  |                                         |  Source  |
     +----------+                                         +----------+
         < >                                                  < >
          |                                                    |
     +----------+                                         +----------+
     |  Media   |                                         |  Media   |
     | Provider |                                         | Provider |
     +----------+                                         +----------+
         < >  +-------------------+    +-------------------+  < >
          |   |                   |    |                   |   |
          |   |                  < >  < >                  |   |
     +---------+   +--------+   +--------+   +--------+   +---------+
     | Primary |   |  RTP   |   |  RTP   |   |  RTP   |   | Primary |
     |  Media  |<>-|  FEC   |-<>|  FEC   |<>-|  FEC   |-<>|  Media  |
     | Stream  |   | Stream |   | Stream |   | Stream |   |  Stream |
     +---------+   +--------+   +--------+   +--------+   +---------+

                Figure 6: Example of Media Source FEC Flows
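
   The associations in Figure 6 amount to a mapping from each repair
   stream to the set of primary Media Streams it protects.  A minimal
   sketch of that mapping follows; the stream labels are hypothetical
   and the code says nothing about how any particular FEC scheme
   encodes or signals the association.

```python
from typing import Dict, FrozenSet, List

# Hypothetical labels.  Each RTP FEC Stream maps to the primary Media
# Streams it protects; "fec-ab" protects both (the FEC group).
PROTECTS: Dict[str, FrozenSet[str]] = {
    "fec-a": frozenset({"primary-a"}),
    "fec-b": frozenset({"primary-b"}),
    "fec-ab": frozenset({"primary-a", "primary-b"}),
}


def repair_streams_for(primary: str) -> List[str]:
    """Return the repair streams that cover the given primary stream."""
    return sorted(fec for fec, covered in PROTECTS.items()
                  if primary in covered)


print(repair_streams_for("primary-a"))
print(repair_streams_for("primary-b"))
```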

3.3.4.  SDP FID Semantics

   RFC5888 [RFC5888] defines an m=line grouping mechanism called "FID"
   for establishing the equivalence of Media Streams across the grouped
   m=lines.

   RFC5576 [RFC5576] extends the above mechanism to the case where
   multiple media sources are described by a single m=line.
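
   As an illustration, the two forms can be compared side by side.  The
   sketch below embeds two SDP fragments as Python strings and extracts
   the FID groups from each.  The "a=group:FID" and "a=mid" attributes
   follow [RFC5888] and the "a=ssrc" and "a=ssrc-group:FID" attributes
   follow [RFC5576]; the addresses, ports, payload types, SSRC values,
   and the helper function name are made up for this example.

```python
from typing import List

# Two m=lines grouped by FID at the session level (RFC 5888 style).
SDP_MLINE_GROUP = """\
v=0
o=- 0 0 IN IP4 192.0.2.1
s=-
c=IN IP4 192.0.2.1
t=0 0
a=group:FID 1 2
m=video 30000 RTP/AVP 96
a=mid:1
m=video 30002 RTP/AVP 97
a=mid:2
"""

# Two sources grouped by FID inside a single m=line (RFC 5576 style).
SDP_SSRC_GROUP = """\
m=video 30000 RTP/AVP 96 97
a=ssrc:11111 cname:user@example.com
a=ssrc:22222 cname:user@example.com
a=ssrc-group:FID 11111 22222
"""


def fid_groups(sdp: str, prefix: str) -> List[List[str]]:
    """Return the identifier lists of every line starting with prefix."""
    return [line[len(prefix):].split()
            for line in sdp.splitlines()
            if line.startswith(prefix)]


print(fid_groups(SDP_MLINE_GROUP, "a=group:FID "))
print(fid_groups(SDP_SSRC_GROUP, "a=ssrc-group:FID "))
```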

3.4.  Session Context

   There are different ways to construct a Communication Session.  The
   general relation in UML notation between a Communication Session,
   Participants, Multimedia Sessions and RTP Sessions is outlined below.

                            +---------------+
                            | Communication |
                            |    Session    |
                            +---------------+
                         0..* < >       < > 1..*
                               |         |
                    +----------+         +--------+
               1..* |                             | 1..*
             +-------------+ 1     0..* +--------------------+
             | Participant |<>----------| Multimedia Session |
             +-------------+            +--------------------+
                   < > 1                         < > 1
                    |                             | 0..*
                    |                      +-------------+
                    |                      | RTP Session |
                    |                      +-------------+
                    |                            < > 1
                    | 0..*                        | 0..*
             +-----------------+ 1   0..* +--------------+
             | Media Transport |--------<>| Media Stream |
             +-----------------+          +--------------+

                        Figure 7: Session Relations

   Several different flavors of Session are possible.  A few typical
   examples are listed in the sub-sections below, but many others can
   be constructed.

3.4.1.  Point-to-Point Session

   In this example, a single Multimedia Session is shared between the
   two Participants.  That Multimedia Session contains a single RTP
   Session with two Media Streams from each Participant.  Each
   Participant has only a single Media Transport, carrying those Media
   Streams, which is the main reason why there is only a single RTP
   Session.

                                 +----------------+
                                 | Point-to-Point |
                                 |    Session     |
                                 +----------------+
                                  < >   < >   < >
                                   |     |     |
          +------------------------+     |     +------------------------+
          |                              |                              |
   +-------------+            +--------------------+            +-------------+
   | Participant |<>----------| Multimedia Session |----------<>| Participant |
   +-------------+            +--------------------+            +-------------+
         < >                            < >                            < >
          |                              |                              |
          | +--------------+      +-------------+      +--------------+ |
          | | Media Stream |----<>| RTP Session |<>----| Media Stream | |
          | +--------------+      +-------------+      +--------------+ |
          |     < >                 < >     < >                 < >     |
          |      |                   |       |                   |      |
   +-----------------+   +--------------+ +--------------+   +-----------------+
   | Media Transport |-<>| Media Stream | | Media Stream |<>-| Media Transport |
   +-----------------+   +--------------+ +--------------+   +-----------------+

                 Figure 8: Example Point-to-Point Session
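
   The shape of this example can be written down as a small object
   graph.  The following sketch is an assumption-laden illustration,
   not a definition: the class names mirror the boxes in Figure 8, while
   the participant names and stream identifiers are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class MediaStream:
    sender: str      # Participant that originates the stream
    ssrc: int        # hypothetical stream identifier


@dataclass
class RtpSession:
    streams: List[MediaStream]


@dataclass
class MultimediaSession:
    participants: List[str]
    rtp_sessions: List[RtpSession]


def point_to_point(a: str, b: str) -> MultimediaSession:
    """Build the Figure 8 shape: one RTP Session, two streams per side."""
    streams = [
        MediaStream(a, 1), MediaStream(a, 2),
        MediaStream(b, 3), MediaStream(b, 4),
    ]
    return MultimediaSession([a, b], [RtpSession(streams)])


session = point_to_point("alice", "bob")
per_sender = Counter(s.sender for s in session.rtp_sessions[0].streams)
print(len(session.rtp_sessions), dict(per_sender))
```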

3.4.2.  Full Mesh Session

   In this example, the Full Mesh Session has three Participants, each
   of which has the same characteristics as the example in the previous
   section: a single Media Transport per peer Participant, resulting in
   a single RTP Session between each pair of Participants.

   +-----------+                  +-------------+                 +-----------+
   |   Media   |----------------<>| Participant |<>---------------|   Media   |
   | Transport |                  +-------------+                 | Transport |
   +-----------+                         |                        +-----------+
       |      |         +------------+   |   +------------+         |      |
      < >    < >        | Multimedia |   |   | Multimedia |        < >    < >
   +--------++--------+ |  Session   |   |   |  Session   | +--------++--------+
   | Media  || Media  | +------------+   |   +------------+ | Media  || Media  |
   | Stream || Stream |  < >        |    |    |        < >  | Stream || Stream |
   +--------++--------+   |         |    |    |         |   +--------++--------+
       |           |      |         |    |    |         |      |          |
       |          < >     |        < >  < >  < >        |     < >         |
       |         +---------+     +---------------+     +---------+        |
       +-------<>|   RTP   |     |   Full Mesh   |     |   RTP   |<>------+
       +-------<>| Session |     |    Session    |     | Session |<>------+
       |         +---------+     +---------------+     +---------+        |
       |          < >             < >   < >   < >             < >         |
       |           |               |     |     |               |          |
   +--------++--------+            |     |     |            +--------++--------+
   | Media  || Media  |            |     |     |            | Media  || Media  |
   | Stream || Stream |            |     |     |            | Stream || Stream |
   +--------++--------+            |     |     |            +--------++--------+
      < >    < >                   |     |     |                   < >    < >
       |      |                    |     |     |                    |      |
   +-----------+                   |     |     |                   +-----------+
   |   Media   |                   |     |     |                   |   Media   |
   | Transport |                   |     |     |                   | Transport |
   +-----------+ +-----------------+     |     +-----------------+ +-----------+
                 |                       |                       |
   +-------------+             +--------------------+            +-------------+
   | Participant |<>-----------| Multimedia Session |----------<>| Participant |
   +-------------+             +--------------------+            +-------------+
         < >                            < >                            < >
          |                              |                              |
          |   +--------+            +---------+            +--------+   |
          |   | Media  |----------<>|   RTP   |<>----------| Media  |   |
          |   | Stream |            | Session |            | Stream |   |
          |   +--------+            +---------+            +--------+   |
          |    < >                  < >     < >                 < >     |
          |     |                    |       |                   |      |
       +-----------+           +--------+ +--------+           +-----------+
       |   Media   |---------<>| Media  | | Media  |<>---------|   Media   |
       | Transport |           | Stream | | Stream |           | Transport |
       +-----------+           +--------+ +--------+           +-----------+

                    Figure 9: Example Full Mesh Session
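
   The counts in Figure 9 follow directly from the pairing rule: one
   RTP Session per unordered pair of Participants, and one Media
   Transport per Participant for each of its peers.  The sketch below
   enumerates both for an arbitrary list of Participants; the function
   names and participant labels are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def rtp_sessions(participants: List[str]) -> List[Tuple[str, str]]:
    """One RTP Session per unordered pair of Participants."""
    return list(combinations(participants, 2))


def media_transports(participants: List[str]) -> Dict[str, List[str]]:
    """For each Participant, the peers it holds a Media Transport to."""
    return {p: [q for q in participants if q != p] for p in participants}


mesh = ["A", "B", "C"]
sessions = rtp_sessions(mesh)
transports = media_transports(mesh)
print(sessions)
print(sum(len(peers) for peers in transports.values()))
```

   For n Participants this gives n(n-1)/2 RTP Sessions and n(n-1) Media
   Transports, which is three and six respectively in the three-party
   case shown in the figure.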

3.4.3.  Centralized Conference Session

   Text to be provided

                                    TBD

             Figure 10: Example Centralized Conference Session

4.  Security Considerations

   This document simply tries to clarify the confusion prevalent in RTP
   taxonomy because of inconsistent usage by the multiple technologies
   and protocols making use of RTP.  It does not introduce any new
   security considerations beyond those already well documented in the
   RTP specification [RFC3550] and in the respective specifications of
   the various protocols making use of it.

   Hopefully having a well-defined common terminology and understanding
   of the complexities of the RTP architecture will help lead us to
   better standards, avoiding security problems.

5.  Acknowledgement

   This document borrows many concepts from several documents, such as
   WebRTC [I-D.ietf-rtcweb-overview], CLUE [I-D.ietf-clue-framework],
   and Multiplexing Architecture
   [I-D.westerlund-avtcore-transport-multiplexing].  The authors would
   like to thank all the authors of each of those documents.

   The authors would also like to acknowledge the insights, guidance and
   contributions of Magnus Westerlund, Roni Even, Colin Perkins, Keith
   Drage, and Harald Alvestrand.

6.  Open Issues

   Much of the terminology is still a matter of dispute.

   It might be useful to distinguish a single endpoint's view of a
   source, RTP session, or multimedia session from the full set of
   sessions and every endpoint communicating in them, together with the
   signaling that established them.

   (Sure to be many more...)

7.  IANA Considerations

   This document makes no request of IANA.

8.  References

8.1.  Normative References

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [UML]      Object Management Group, "OMG Unified Modeling Language
              (OMG UML), Superstructure, V2.2", OMG formal/2009-02-02,
              February 2009.

8.2.  Informative References

   [I-D.ietf-avtcore-clksrc]
              Williams, A., Gross, K., Brandenburg, R., and H. Stokking,
              "RTP Clock Source Signalling", draft-ietf-avtcore-
              clksrc-05 (work in progress), July 2013.

   [I-D.ietf-clue-framework]
              Duckworth, M., Pepperell, A., and S. Wenger, "Framework
              for Telepresence Multi-Streams", draft-ietf-clue-
              framework-11 (work in progress), July 2013.

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C., Alvestrand, H., and C. Jennings,
              "Multiplexing Negotiation Using Session Description
              Protocol (SDP) Port Numbers", draft-ietf-mmusic-sdp-
              bundle-negotiation-04 (work in progress), June 2013.

   [I-D.ietf-rtcweb-overview]
              Alvestrand, H., "Overview: Real Time Protocols for Brower-
              based Applications", draft-ietf-rtcweb-overview-06 (work
              in progress), February 2013.

   [I-D.westerlund-avtcore-transport-multiplexing]
              Westerlund, M. and C. Perkins, "Multiple RTP Sessions on a
              Single Lower-Layer Transport", draft-westerlund-avtcore-
              transport-multiplexing-05 (work in progress), February
              2013.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264, June
              2002.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
              Media Attributes in the Session Description Protocol
              (SDP)", RFC 5576, June 2009.

   [RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session Description
              Protocol (SDP) Grouping Framework", RFC 5888, June 2010.

   [RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network
              Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

   [RFC6222]  Begen, A., Perkins, C., and D. Wing, "Guidelines for
              Choosing RTP Control Protocol (RTCP) Canonical Names
              (CNAMEs)", RFC 6222, April 2011.

Appendix A.  Changes From Earlier Versions

   NOTE TO RFC EDITOR: Please remove this section prior to publication.

A.1.  Changes From Draft -00

   o  Too many to list

   o  Added new authors

   o  Updated content organization and presentation

Authors' Addresses

   Jonathan Lennox
   Vidyo, Inc.
   433 Hackensack Avenue
   Seventh Floor
   Hackensack, NJ  07601
   US

   Email: jonathan@vidyo.com

   Kevin Gross
   AVA Networks, LLC
   Boulder, CO
   US

   Email: kevin.gross@avanw.com

   Suhas Nandakumar
   Cisco Systems
   170 West Tasman Drive
   San Jose, CA  95134
   US

   Email: snandaku@cisco.com

   Gonzalo Salgueiro
   Cisco Systems
   7200-12 Kit Creek Road
   Research Triangle Park, NC  27709
   US

   Email: gsalguei@cisco.com

   Bo Burman
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 13 11
   Email: bo.burman@ericsson.com
