RTP Payload Format for AC-3 Audio
RFC 4184

Document Type RFC - Proposed Standard (October 2005; No errata)
Authors Jason Flaks  , Todd Hager  , Brian Link 
Last updated 2013-03-02
Replaces draft-flaks-avt-rtp-ac3
Stream IETF
Stream WG state (None)
Document shepherd No shepherd assigned
IESG IESG state RFC 4184 (Proposed Standard)
Consensus Boilerplate Unknown
Responsible AD Allison Mankin
Send notices to csp@csperkins.org, magnus.westerlund@ericsson.com
Network Working Group                                            B. Link
Request for Comments: 4184                                      T. Hager
Category: Standards Track                             Dolby Laboratories
                                                                J. Flaks
                                                   Microsoft Corporation
                                                            October 2005

                   RTP Payload Format for AC-3 Audio

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2005).


Abstract

   This document describes an RTP payload format for transporting audio
   data using the AC-3 audio compression standard.  AC-3 is a
   high-quality, multichannel audio coding system that is used for
   United States HDTV, DVD, cable television, satellite television, and
   other media.  The RTP payload format presented in this document
   includes support for data fragmentation.

1.  Introduction

   AC-3 [ATSC] is a high-quality audio codec (audio coding format)
   designed to encode multiple channels of audio into a low bit-rate
   format.  AC-3 achieves its high compression ratios by encoding
   multiple channels as a single entity.  Dolby Digital, which is a
   branded version of AC-3, encodes up to 5.1 channels of audio.

   AC-3 has been adopted as an audio compression scheme for many
   consumer and professional applications.  It is a mandatory audio
   codec for DVD-video, Advanced Television Systems Committee (ATSC)
   digital terrestrial television and Digital Living Network Alliance
   (DLNA) home networking, as well as an optional multichannel audio
   format for DVD-audio.

   There is a need to stream AC-3 data over IP networks.  The Real-time
   Transport Protocol (RTP) provides a mechanism for stream
   synchronization and hence serves as the best transport solution for
   AC-3, which is primarily used in audio-for-video applications.
   Applications for streaming AC-3 include streaming movies from a home
   media server to a display, video on demand, and multichannel Internet
   radio.

   Section 2 gives a brief overview of the AC-3 algorithm.  Section 3
   specifies values for fields in the RTP header, while Section 4
   specifies the AC-3 payload format.  Section 5 discusses media types
   and SDP usage.  Security considerations are covered in Section 6,
   congestion control in Section 7, and IANA considerations in Section
   8.  References are given in Sections 9 and 10.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Overview of AC-3

   AC-3 can deliver up to 5.1 channels of audio at data rates
   approximately equal to half of one PCM channel [ATSC], [1994AC3],
   [1996AC3].  The ".1" refers to a band-limited, optional, low-
   frequency effects (LFE) channel.  AC-3 was designed for signals
   sampled at rates of 32, 44.1, or 48 kHz.  Data rates can vary between
   32 kbps and 640 kbps, depending on the number of channels and the
   desired quality.
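
   As a rough, non-normative illustration of the "half of one PCM
   channel" figure, the arithmetic can be sketched as follows.  The
   16-bit sample width and the 48 kHz rate chosen here are assumptions
   made for the example, and the function name is hypothetical.

```python
# Assumption (not stated in the text): 16-bit linear PCM sampled at 48 kHz.
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 16


def pcm_channel_kbps(sample_rate_hz: int, bits_per_sample: int) -> float:
    """Bit rate of one uncompressed PCM channel in kbps (1 kbps = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample / 1000


one_channel = pcm_channel_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE)
half_channel = one_channel / 2

print(f"one PCM channel:  {one_channel:.0f} kbps")   # 768 kbps
print(f"half of that:     {half_channel:.0f} kbps")  # 384 kbps
```

   Under these assumptions, half of one PCM channel is 384 kbps, which
   falls inside the 32 kbps to 640 kbps range given above.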

   AC-3 exploits psycho-acoustic phenomena that cause a significant
   fraction of the information contained in a typical audio signal to be
   inaudible.  Substantial data reduction occurs via the removal of
   inaudible information contained in an audio stream.  Source coding
   techniques are further used to reduce the data rate.

   Like most perceptual coders, AC-3 operates in the frequency domain.
   A 512-point time-domain aliasing cancellation (TDAC) transform is
   taken with 50% overlap, providing 256 new frequency samples.
   Frequency samples are then converted to exponents and mantissas.
   Exponents are differentially encoded.
   Mantissas are allocated a varying number of bits depending on the
   audibility of the associated spectral components.  Audibility is
   determined via a masking curve.  Bits for mantissas are allocated
   from a global bit pool.
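
   The block structure described above can be sketched informally as
   follows.  This is a simplified illustration and not the AC-3
   algorithm: the function names are hypothetical, no window or
   transform is applied, and the exponent/mantissa split uses ordinary
   floating-point decomposition as a stand-in for the coding defined in
   [ATSC].

```python
import math

BLOCK_LEN = 512  # transform length given in the text
HOP = 256        # 50% overlap: each block brings in 256 new samples


def overlapped_blocks(samples):
    """Yield 512-sample blocks, each starting 256 samples after the previous."""
    for start in range(0, len(samples) - BLOCK_LEN + 1, HOP):
        yield samples[start:start + BLOCK_LEN]


def split_exponent_mantissa(value):
    """Illustrative split so that value == mantissa * 2 ** -exponent.

    Assumption: a plain power-of-two exponent; the real AC-3 exponent
    format and its differential encoding are specified in [ATSC].
    """
    if value == 0.0:
        return 0, 0.0
    mantissa, exp2 = math.frexp(value)  # value == mantissa * 2 ** exp2
    return -exp2, mantissa


pcm = list(range(1536))                  # stand-in for 1536 PCM samples
blocks = list(overlapped_blocks(pcm))
exponent, mantissa = split_exponent_mantissa(0.1)
print(len(blocks), exponent, mantissa)
```

   Smaller values receive larger exponents in this sketch, so the
   mantissa always carries the significant bits of the value.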

2.1.  AC-3 Bit Stream

   AC-3 bit streams are organized into synchronization frames.  Each
   AC-3 frame contains a Synchronization Information (SI) field, a Bit
   Stream Information (BSI) field, and 6 audio blocks (ABs) that each
   represent 256 PCM samples for all channels.  The frame ends with an
