A More Loss-Tolerant RTP Payload Format for MP3 Audio
RFC 3119
Field | Value
---|---
Type | RFC - Proposed Standard (June 2001; Errata)
Obsoleted by | RFC 5219
Was | draft-ietf-avt-rtp-mp3 (avt WG)
Author | Ross Finlayson
Last updated | 2020-01-21
Stream | IETF
WG state | (None)
Document shepherd | No shepherd assigned
IESG state | RFC 3119 (Proposed Standard)
Consensus Boilerplate | Unknown
Responsible AD | (None)
Send notices to | (None)
Network Working Group                                       R. Finlayson
Request for Comments: 3119                                      LIVE.COM
Category: Standards Track                                      June 2001


         A More Loss-Tolerant RTP Payload Format for MP3 Audio

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2001).  All Rights Reserved.

Abstract

   This document describes an RTP (Real-time Transport Protocol) payload
   format for transporting MPEG (Moving Picture Experts Group) 1 or 2,
   layer III audio (commonly known as "MP3").  This format is an
   alternative to that described in RFC 2250, and performs better if
   there is packet loss.

1. Introduction

   While the RTP payload format defined in RFC 2250 [2] is generally
   applicable to all forms of MPEG audio or video, it is sub-optimal for
   MPEG 1 or 2, layer III audio (commonly known as "MP3").  The reason
   for this is that an MP3 frame is not a true "Application Data Unit" -
   it contains a back-pointer to data in earlier frames, and so cannot
   be decoded independently of these earlier frames.  Because RFC 2250
   defines that packet boundaries coincide with frame boundaries, it
   handles packet loss inefficiently when carrying MP3 data.  The loss
   of an MP3 frame will render some data in previous (or future) frames
   useless, even if they are received without loss.

   In this document we define an alternative RTP payload format for MP3
   audio.  This format uses a data-preserving rearrangement of the
   original MPEG frames, so that packet boundaries now coincide with
   true MP3 "Application Data Units", which can also (optionally) be
   rearranged in an interleaving pattern.  This new format is therefore
   more data-efficient than RFC 2250 in the face of packet loss.
2. The Structure of MP3 Frames

   In this section we give a brief overview of the structure of an MP3
   frame.  (For a more detailed description, see the MPEG 1 audio [3]
   and MPEG 2 audio [4] specifications.)

   Each MPEG audio frame begins with a 4-byte header.  Information
   defined by this header includes:

   -  Whether the audio is MPEG 1 or MPEG 2.
   -  Whether the audio is layer I, II, or III.
      (The remainder of this document assumes layer III, i.e., "MP3"
      frames)
   -  Whether the audio is mono or stereo.
   -  Whether or not there is a 2-byte CRC field following the header.
   -  (indirectly) The size of the frame.

   The following structures appear after the header:

   -  (optionally) A 2-byte CRC field
   -  A "side info" structure.  This has the following length:
      -  32 bytes for MPEG 1 stereo
      -  17 bytes for MPEG 1 mono, or for MPEG 2 stereo
      -  9 bytes for MPEG 2 mono
   -  Encoded audio data, plus optional ancillary data (filling out the
      rest of the frame)

   For the purpose of this document, the "side info" structure is the
   most important, because it defines the location and size of the
   "Application Data Unit" (ADU) that an MP3 decoder will process.  In
   particular, the "side info" structure defines:

   -  "main_data_begin": This is a back-pointer (in bytes) to the start
      of the ADU.  The back-pointer is counted from the beginning of the
      frame, and counts only encoded audio data and any ancillary data
      (i.e., ignoring any header, CRC, or "side info" fields).

   An MP3 decoder processes each ADU independently.  The ADUs will
   generally vary in length, but their average length will, of course,
   be that of the MP3 frames (minus the length of the header, CRC, and
   "side info" fields).  (In MPEG literature, this ADU is sometimes
   referred to as a "bit reservoir".)

3. A New Payload Format

   As noted in [5], a payload format should be designed so that packet
   boundaries coincide with "codec frame boundaries" - i.e., with ADUs.
   In the RFC 2250 payload format for MPEG audio [2], each RTP packet
   payload contains MP3 frames.  In this new payload format for MP3
   audio, however, each RTP packet payload contains "ADU frames", each
   preceded by an "ADU descriptor".

3.1 ADU frames

   An "ADU frame" is defined as:

   -  The 4-byte MPEG header (the same as the original MP3 frame, except
      that the first 11 bits are (optionally) replaced by an
      "Interleaving Sequence
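
   As an illustration of the MP3 frame layout described in Section 2,
   the following Python sketch reads the 4-byte header of an original
   MP3 frame and derives the fields listed there: MPEG version, CRC
   presence, mono or stereo, frame size, and "side info" length.  The
   names (parse_mp3_header, Mp3FrameInfo) are hypothetical and not part
   of this memo.  The bit positions, bit-rate and sampling-rate tables,
   and the frame-size formula are assumed from the MPEG audio
   specifications [3] [4] rather than taken from this document, and
   free-format bit rates are not handled.

```python
from dataclasses import dataclass

# Layer III bit rates (kbit/s) and sampling rates (Hz), indexed by the
# header's bitrate and sampling-rate fields.  Assumption: these tables
# come from the MPEG audio specifications, not from this memo.
BITRATES_KBPS = {
    1: [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320],
    2: [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160],
}
SAMPLE_RATES_HZ = {1: [44100, 48000, 32000], 2: [22050, 24000, 16000]}


@dataclass
class Mp3FrameInfo:
    """Hypothetical container for the header-derived fields."""
    mpeg_version: int    # 1 or 2
    has_crc: bool        # True if a 2-byte CRC follows the header
    is_mono: bool
    frame_size: int      # whole frame, in bytes
    side_info_size: int  # 32, 17, or 9 bytes

    @property
    def main_data_capacity(self) -> int:
        """Bytes left for encoded audio plus ancillary data."""
        crc_size = 2 if self.has_crc else 0
        return self.frame_size - 4 - crc_size - self.side_info_size


def parse_mp3_header(header: bytes) -> Mp3FrameInfo:
    """Parse the 4-byte header of an original (non-ADU) layer III frame."""
    if len(header) < 4:
        raise ValueError("need at least 4 header bytes")
    word = int.from_bytes(header[:4], "big")
    if word >> 21 != 0x7FF:
        raise ValueError("missing 11-bit sync pattern")

    version_bits = (word >> 19) & 0x3
    if version_bits == 0b11:
        version = 1
    elif version_bits == 0b10:
        version = 2
    else:
        raise ValueError("only MPEG 1 and MPEG 2 are handled here")
    if (word >> 17) & 0x3 != 0b01:
        raise ValueError("not a layer III frame")

    has_crc = ((word >> 16) & 0x1) == 0   # protection bit 0 means CRC
    bitrate_index = (word >> 12) & 0xF
    rate_index = (word >> 10) & 0x3
    padding = (word >> 9) & 0x1
    is_mono = ((word >> 6) & 0x3) == 0b11
    if bitrate_index in (0, 15) or rate_index == 3:
        raise ValueError("free-format or reserved field value")

    bitrate = BITRATES_KBPS[version][bitrate_index] * 1000
    sample_rate = SAMPLE_RATES_HZ[version][rate_index]
    # The frame size is only indirectly encoded in the header.
    factor = 144 if version == 1 else 72
    frame_size = factor * bitrate // sample_rate + padding

    # "Side info" lengths as listed in Section 2.
    if version == 1:
        side_info_size = 17 if is_mono else 32
    else:
        side_info_size = 9 if is_mono else 17
    return Mp3FrameInfo(version, has_crc, is_mono, frame_size,
                        side_info_size)
```

   For example, parse_mp3_header(bytes.fromhex("fffb9064")) describes an
   MPEG 1 stereo frame at 128 kbit/s and 44100 Hz with no CRC: the frame
   is 417 bytes long, of which 32 bytes are "side info" and 381 bytes
   remain for encoded audio and ancillary data.  The check on the sync
   pattern applies only to original MP3 frames, since an ADU frame may
   replace the first 11 header bits.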