Ambisonics in an Ogg Opus Container
RFC 8486
| Field | Value |
|---|---|
| Document type | RFC, Proposed Standard (October 2018) |
| Updates | RFC 7845 |
| Authors | Jan Skoglund, Michael Graczyk |
| Last updated | 2023-03-24 |
| RFC stream | Internet Engineering Task Force (IETF) |
| IESG responsible AD | Ben Campbell |
Skoglund & Graczyk           Standards Track                    [Page 4]

RFC 8486                     Opus Ambisonics                October 2018

   The fields in the channel mapping table have the following meaning:

   1.  Stream Count "N" (8 bits, unsigned):

       This is the total number of streams encoded in each Ogg packet.

   2.  Coupled Stream Count "M" (8 bits, unsigned):

       This is the number of the N streams whose decoders are to be
       configured to produce two channels (stereo).

   3.  Demixing Matrix (16*K*C bits, signed):

       The coefficients of the demixing matrix stored in column-major
       order as 16-bit, signed, two's complement fixed-point values with
       15 fractional bits (Q15), little endian.  If needed, the output
       gain field can be used for a normalization scale.  For mixed-order
       Ambisonic representations, the silent ACN channels are indicated
       by all zeros in the corresponding rows of the mixing matrix.  This
       also allows for mixed order with non-diegetic stereo as the number
       of columns implies the presence of non-diegetic channels.

   Note that [RFC7845] specifies that the identification header cannot
   exceed one "page", which is 65,025 octets.  This limits the Ambisonic
   order, which then MUST be lower than 12, if full order is utilized
   and the number of coded streams is the same as the Ambisonic order
   plus the two non-diegetic channels.

   The total output channel number, C, MUST be set in the third field of
   the identification header.

3.3.  Allowed Numbers of Channels

   For both channel mapping families 2 and 3, the allowed numbers of
   channels are (1 + n)^2 + 2j for n = 0, 1, ..., 14 and j = 0 or 1,
   where n denotes the (highest) Ambisonic order and j denotes whether
   or not there is a separate non-diegetic stereo stream.  This
   corresponds to periphonic Ambisonics from zeroth to fourteenth order
   plus potentially two channels of non-diegetic stereo.  Explicitly,
   the allowed numbers of channels are 1, 3, 4, 6, 9, 11, 16, 18, 25,
   27, 36, 38, 49, 51, 64, 66, 81, 83, 100, 102, 121, 123, 144, 146,
   169, 171, 196, 198, 225, and 227.
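
   As an illustration only, the channel-count rule above and the
   coefficient layout stated for the demixing matrix field (Q15,
   column-major, little endian) could be checked with a Python sketch
   along the following lines.  The names ambisonic_layout and
   parse_demixing_matrix and the rows/cols parameters are hypothetical
   and do not come from this document; the sketch leaves it to the
   caller to decide which of K and C counts rows and which counts
   columns.

```python
import math
import struct


def ambisonic_layout(channels):
    """Return (order, has_stereo) for an allowed channel count, else None.

    Allowed counts are (1 + n)^2 + 2j with n = 0..14 and j = 0 or 1.
    """
    if not 1 <= channels <= 227:
        return None
    for stereo in (0, 2):
        ambi = channels - stereo
        if ambi < 1:
            continue
        root = math.isqrt(ambi)  # integer square root (Python 3.8+)
        if root * root == ambi and root - 1 <= 14:
            return root - 1, stereo == 2
    return None


def parse_demixing_matrix(payload, rows, cols):
    """Decode rows*cols Q15 coefficients into a list of row lists.

    Assumes `payload` holds exactly the matrix octets: 16-bit signed,
    little-endian values stored in column-major order.
    """
    count = rows * cols
    if len(payload) != 2 * count:
        raise ValueError("expected %d octets, got %d" % (2 * count, len(payload)))
    raw = struct.unpack("<%dh" % count, payload)
    # Column-major storage: element (r, c) sits at index c * rows + r.
    # Q15 means 15 fractional bits, so divide by 2^15 = 32768.
    return [[raw[c * rows + r] / 32768.0 for c in range(cols)]
            for r in range(rows)]
```

   For example, ambisonic_layout(6) yields (1, True), that is, first
   order plus a non-diegetic stereo pair, while ambisonic_layout(2)
   yields None because 2 is not an allowed channel count.  The length
   check in parse_demixing_matrix rejects a truncated or oversized
   field before any coefficient is read.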

   Note again that if full Ambisonic order is used and the number of
   coded streams is the same as the Ambisonic order plus the two
   non-diegetic channels, the order must then be lower than 12, due to
   the identification header length limit.

4.  Downmixing

   The downmixing matrices in this section are only examples known to
   give acceptable results for stereo downmixing from Ambisonics, but
   other mixing strategies will be allowed, e.g., to emphasize a certain
   panning.

   An Ogg Opus player MAY use the matrix in Figure 5 to implement
   downmixing from multichannel files using channel mapping families 2
   and 3 when there is no non-diegetic stereo.  The first and second
   Ambisonic channels are known as "W" and "Y", respectively.  The
   omitted coefficients in the matrix in the figure have the value 0.0.

      /   \   /                     \ /     \
      | L |   | 0.5   0.5  0.0  ... | |  W  |
      | R | = | 0.5  -0.5  0.0  ... | |  Y  |
      \   /   \                     / | ... |
                                      \     /

     Figure 5: Stereo Downmixing Matrix for Channel Mapping Families 2
                     and 3 - Only Ambisonic Channels

   The first Ambisonic channel (W) is a mono audio stream that
   represents the average audio signal over all directions.  Since W is
   not directional, Ogg Opus players MAY use W directly for mono
   playback.

   If a non-diegetic stereo track is present, the player MAY use the
   matrix in Figure 6 for downmixing.  Ls and Rs denote the two
   non-diegetic stereo channels.

      /   \   /                                 \ /     \
      | L |   | 0.25   0.25  0.0  ...  0.5  0.0 | |  W  |
      | R | = | 0.25  -0.25  0.0  ...  0.0  0.5 | |  Y  |
      \   /   \                                 / | ... |
                                                  | Ls  |
                                                  | Rs  |
                                                  \     /

     Figure 6: Stereo Downmixing Matrix for Channel Mapping Families 2
       and 3 - Ambisonic Channels Plus a Non-Diegetic Stereo Stream

5.  Updates to RFC 7845

5.1.  Format of the Channel Mapping Table

   The language in Section 5.1.1 of [RFC7845] (copied below) implies
   that the channel mapping table, when present, has a fixed format for
   all channel mapping families:

      The order and meaning of these channels are defined by a channel
      mapping, which consists of the 'channel mapping family' octet and,
      for channel mapping families other than family 0, a 'channel
      mapping table', as illustrated in Figure 3.

   This document updates [RFC7845] to clarify that the format of the
   channel mapping table may depend on the channel mapping family:

      The order and meaning of these channels are defined by a channel
      mapping, which consists of the 'channel mapping family' octet and
      for channel mapping families other than family 0, a 'channel
      mapping table'.

      The format of the channel mapping table depends on the channel
      mapping family.  Unless the channel mapping family requires a
      custom format for its channel mapping table, the RECOMMENDED
      channel mapping table format for new mapping families is
      illustrated in Figure 3.

   The change above is not meant to change how families 1 and 255
   currently work.  To ensure that, the first paragraph of
   Section 5.1.1.2 is changed from:

      Allowed numbers of channels: 1...8.  Vorbis channel order (see
      below).

   to:

      Allowed numbers of channels: 1...8, with the mapping specified
      according to Figure 3.  Vorbis channel order (see below).

   Similarly, the first paragraph of Section 5.1.1.3 is changed from:

      Allowed numbers of channels: 1...255.  No defined channel meaning.

   to:

      Allowed numbers of channels: 1...255, with the mapping specified
      according to Figure 3.  No defined channel meaning.

5.2.  Unknown Mapping Families

   The treatment of unknown mapping families is changed slightly.
   Section 5.1.1.4 of [RFC7845] states:

      The remaining channel mapping families (2...254) are reserved.  A
      demuxer implementation encountering a reserved 'channel mapping
      family' value SHOULD act as though the value is 255.

   This is changed to:

      The remaining channel mapping families (2...254) are reserved.  A
      demuxer implementation encountering a 'channel mapping family'
      value that it does not recognize SHOULD NOT attempt to decode the
      packets and SHOULD NOT use any information except for the first 19
      octets of the ID header packet (Figure 2) and the comment header
      (Figure 10).

6.  Experimental Mapping Families

   To make development of new mapping families easier while reducing the
   risk of creating compatibility issues with non-final versions of
   mapping families, mapping families 240 through 254 (inclusively) are
   now reserved for experiments and implementations of in-development
   families.  Note that these mapping-family experiments are not
   restricted to Ambisonics.  Implementers SHOULD attempt to use
   experimental family numbers that have not recently been used and
   SHOULD advertise what experimental numbers they use (e.g., for
   Internet-Drafts).

   The Ambisonics mapping experiments that led to this document used
   experimental family 254 for family 2 and experimental family 253 for
   family 3.

7.  Security Considerations

   Implementations of the Ogg container need to take appropriate
   security considerations into account, as outlined in Section 8 of
   [RFC7845].  The extension defined in this document requires that
   semantic meaning be assigned to more channels than the existing Ogg
   format requires.  Since more allocations will be required to encode
   and decode these semantically meaningful channels, care should be
   taken in any new allocation paths.  Implementations MUST NOT overrun
   their allocated memory nor read from uninitialized memory when
   managing the Ambisonic channel mapping.

8.  IANA Considerations

   IANA has added 17 new assignments to the "Opus Channel Mapping
   Families" registry.

   +---------+----------------------+----------------------------------+
   | Value   | Description          | Reference                        |
   +---------+----------------------+----------------------------------+
   | 0       | Mono, L/R stereo     | Section 5.1.1.1 of [RFC7845],    |
   |         |                      | Section 5 of this document       |
   |         |                      |                                  |
   | 1       | 1-8 channel surround | Section 5.1.1.2 of [RFC7845],    |
   |         |                      | Section 5 of this document       |
   |         |                      |                                  |
   | 2       | Ambisonics as        | Section 3.1 of this document     |
   |         | individual channels  |                                  |
   |         |                      |                                  |
   | 3       | Ambisonics with      | Section 3.2 of this document     |
   |         | demixing matrix      |                                  |
   |         |                      |                                  |
   | 240-254 | Experimental use     | Section 6 of this document       |
   |         |                      |                                  |
   | 255     | Discrete channels    | Section 5.1.1.3 of [RFC7845],    |
   |         |                      | Section 5 of this document       |
   +---------+----------------------+----------------------------------+

9.  References

9.1.  Normative References

   [ambix]    Nachbar, C., Zotter, F., Deleflie, E., and A. Sontacchi,
              "AMBIX - A SUGGESTED AMBISONICS FORMAT", Ambisonics
              Symposium, June 2011,
              <http://iem.kug.ac.at/fileadmin/media/iem/projects/2011/
              ambisonics11_nachbar_zotter_sontacchi_deleflie.pdf>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
              Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
              September 2012,
              <https://www.rfc-editor.org/info/rfc6716>.

   [RFC7845]  Terriberry, T., Lee, R., and R. Giles, "Ogg Encapsulation
              for the Opus Audio Codec", RFC 7845,
              DOI 10.17487/RFC7845, April 2016,
              <https://www.rfc-editor.org/info/rfc7845>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

9.2.  Informative References

   [daniel04] Daniel, J. and S. Moreau, "Further Study of Sound Field
              Coding with Higher Order Ambisonics", Audio Engineering
              Society Convention Paper, May 2004,
              <https://www.researchgate.net/publication/
              277841868_Further_Study_of_Sound_Field_Coding
              _with_Higher_Order_Ambisonics>.

   [fellgett75]
              Fellgett, P., "Ambisonics. Part one: General system
              description", Studio Sound vol. 17, no. 8, pp. 20-22,
              August 1975,
              <http://www.michaelgerzonphotos.org.uk/articles/
              Ambisonics%201.pdf>.

Acknowledgments

   Thanks to Timothy Terriberry, Jean-Marc Valin, Mark Harris, Marcin
   Gorzel, and Andrew Allen for their guidance and valuable
   contributions to this document.

Authors' Addresses

   Jan Skoglund
   Google LLC
   345 Spear Street
   San Francisco, CA  94105
   United States of America

   Email: jks@google.com


   Michael Graczyk

   Email: michael@mgraczyk.com