Xbox One (Durango): Sound of Tomorrow


One of the few components of Xbox One (Durango) that remain to be unveiled is the sound block. This article describes this important part of the system.

The Xbox One (Durango) audio architecture seeks a balance between the successes and tradeoffs of previous-generation platforms while anticipating the increasing technical needs of next-generation implementations. It provides hardware-accelerated pathways for the most common aspects of audio rendering (compression, mixing, filtering, and so on) on a large number of concurrent voices. The architecture also provides a shared resource model for software processing, allowing each title to choose which custom signal manipulation to apply and how much CPU time to spend on it.

Audio Architectural Overview

In addition to general CPU power (which can be used for decoding, synthesis, rendering, and so on), Durango provides several hardware components dedicated to audio processing. The audio hardware components can address the entire unified memory space. Here you can see the hardware-accelerated audio components:

[Figure: Durango hardware-accelerated audio components]

The SHAPE (Scalable Hardware Audio Processing Engine) block comprises the majority of audio functionality, although the other processors also contribute significant features.

SHAPE (Scalable Hardware Audio Processing Engine)

The core hardware dedicated to audio processing is SHAPE. It is designed to perform many of the basic operations commonly required on a per-voice basis. This hardware allows a developer to reduce CPU impact, even for high polyphony and complex signal routings, and still provides the flexibility of SHAPE/CPU data interchange if a title chooses to perform custom digital signal processing, analysis, or software synthesis.

SHAPE operates on blocks of 128 samples, where each sample supports 24-bit integer resolution (or 32-bit float when used by the CPU). At 48 kHz, this represents an audio frame of approximately 2.67 ms, providing increased timing resolution and decreased latency compared with the 256-sample block size of Xbox 360. SHAPE offers six fixed-function blocks focused on common audio tasks:

1. XMA Decoder: Concurrent decoding of 512 XMA-format voices. XMA is a perceptual codec developed for Xbox 360 that offers user-tunable quality and typically provides between 6:1 and 14:1 compression.

2. SRC: A dedicated polyphase sample rate conversion block that provides high-performance, high-quality resampling of 512 mono channels of audio data (whether for format conversion, Doppler effect, or pitch variation).

3. Mix Buffers: Dedicated accumulators for 128 in-place mix channels without needing to access memory, and with additional channels available virtually. These mix buffers also provide coarse metering and clipping detection for debugging and monitoring.

4. FLT/VOL: A module providing both volume scaling and a state variable filter implementation for more than 2,500 voices/mixes, analogous to the software-exposed XAudio2 per-voice filter available on Windows and Xbox 360. The filter can provide low pass, high pass, band pass, or notch filtering, and exposes Q and cutoff/center frequency parameters. It is used most commonly for distance and occlusion modeling.

5. EQ/CMP: A module providing up to 512 channels of 3-band equalization and dynamic range compression. The EQ consists of three serially cascaded biquad filters. The compressor has a hard-knee response and supports both side chain and expander functionality.

6. DMA: SHAPE has dedicated DMA hardware for transferring audio data to and from the unified memory space. This enables scenarios that include transfer without a sample-rate converter, transferring final mix channels, and CPU-based processing in the middle of a SHAPE-based audio graph.
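
The frame timing quoted for the 128-sample block can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name is made up for this example and is not part of any Durango API.

```python
# Duration of one audio processing block at a given sample rate.
# frame_duration_ms is a hypothetical helper used only for illustration.

def frame_duration_ms(block_samples: int, sample_rate_hz: int) -> float:
    """Return the length of one audio block in milliseconds."""
    return 1000.0 * block_samples / sample_rate_hz


SAMPLE_RATE_HZ = 48_000

shape_ms = frame_duration_ms(128, SAMPLE_RATE_HZ)    # SHAPE block size
xbox360_ms = frame_duration_ms(256, SAMPLE_RATE_HZ)  # Xbox 360 block size
frames_per_second = SAMPLE_RATE_HZ / 128

print(f"128-sample frame: {shape_ms:.2f} ms")    # 2.67 ms
print(f"256-sample frame: {xbox360_ms:.2f} ms")  # 5.33 ms
print(f"frames per second at 128 samples: {frames_per_second:.0f}")  # 375
```

Halving the block size halves the frame duration, which is where the finer timing resolution comes from.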

Playback of a typical audio graph is expected to use each of these processors extensively.
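
To make the FLT/VOL description more concrete, here is a minimal sketch of a textbook (Chamberlin) digital state variable filter, which yields low pass, high pass, band pass, and notch outputs from the same two state values. This is an assumption chosen for illustration: the article does not say how SHAPE or XAudio2 implement their filters, and the function and parameter names are hypothetical.

```python
import math


def state_variable_filter(samples, cutoff_hz, q, sample_rate_hz=48_000):
    """Run a Chamberlin state variable filter over a mono signal.

    Returns a dict holding the low pass, high pass, band pass, and notch
    outputs, each the same length as the input.
    """
    # Frequency coefficient. This simple form is only stable for cutoffs
    # well below Nyquist (roughly sample_rate / 6 or lower).
    f = 2.0 * math.sin(math.pi * cutoff_hz / sample_rate_hz)
    damping = 1.0 / q  # higher Q means less damping and a sharper peak

    low = 0.0
    band = 0.0
    outputs = {"low": [], "high": [], "band": [], "notch": []}
    for x in samples:
        low += f * band
        high = x - low - damping * band
        band += f * high
        outputs["low"].append(low)
        outputs["high"].append(high)
        outputs["band"].append(band)
        outputs["notch"].append(high + low)
    return outputs


# A constant (DC) input should pass through the low pass output and be
# rejected by the high pass and band pass outputs once the filter settles.
dc_signal = [1.0] * 2000
result = state_variable_filter(dc_signal, cutoff_hz=1_000.0, q=0.707)
print(result["low"][-1], result["high"][-1], result["band"][-1])
```

Lowering the cutoff of the low pass output is the usual way such a filter is used for distance and occlusion effects, since muffled high frequencies read as "farther away" or "behind something".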

ACP (Audio Control Processor)

The ACP provides state management and scheduling of all other audio hardware components on the North Bridge. As a result, the CPU does not need to be involved in intra-frame processing, which avoids the synchronization overhead and latency that such involvement might introduce.

ASP (Audio Scalar Processor)

The ASP supports scalar float and vector integer operations. Voice chat codecs are provided in hardware, both those that manage wireless communication between a voice chat headset and the console and those that compress and decompress voice data for networked voice communication. Additionally, this processor supports xWMA format decompression in hardware; on Xbox 360, xWMA was solely a CPU-side decode option.

AVP (Audio Vector Processor)

The AVP supports vector float operations, and is designed primarily for MEC (multichannel echo cancellation) and other noise reduction for the next-generation Kinect audio input. It supports both speech recognition and chat/arbitrary audio input use. MEC and other noise reduction processing allow for a more intelligible stream of the player’s spoken audio data, even from a far-talk microphone that is typically positioned closer to the output speakers than to the player.

Audio and Durango Hardware

Durango’s audio output pipeline eliminates the DAC (digital-to-analog converter) found in previous generation consoles. All audio is output strictly in the digital realm either through HDMI 1.4a or as S/PDIF optical output. HDMI 1.4a allows for high-fidelity linear 7.1-channel PCM to be transmitted from the console; titles default to an output sampling rate of 48 kHz and a bit depth of 24 bits. Durango is also designed to support up to four simultaneous stereo headset outputs, each of which can represent unique multichannel mixes that are downmixed as required by the output format (for instance, a headset or the S/PDIF output).
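
As a rough sense of scale for the default output format, the raw payload rate of a 7.1 linear PCM stream follows directly from the channel count, sampling rate, and bit depth. The sketch below counts audio payload only and ignores any HDMI framing or packing overhead; the function name is hypothetical.

```python
# Raw payload rate of uncompressed linear PCM (audio samples only).

def pcm_bit_rate(channels: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Return the PCM payload rate in bits per second."""
    return channels * sample_rate_hz * bits_per_sample


# 7.1 means eight discrete channels; 48 kHz and 24 bits are the title defaults.
surround_bps = pcm_bit_rate(channels=8, sample_rate_hz=48_000, bits_per_sample=24)
stereo_bps = pcm_bit_rate(channels=2, sample_rate_hz=48_000, bits_per_sample=24)

print(f"7.1 PCM:    {surround_bps / 1e6:.3f} Mbit/s")  # 9.216 Mbit/s
print(f"Stereo PCM: {stereo_bps / 1e6:.3f} Mbit/s")    # 2.304 Mbit/s
```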

Durango accepts audio input from a variety of sources: the next-generation Kinect microphone array, voice chat headsets, other audio input peripherals, and storage media (whether HDD, flash or from cloud storage). Audio also can be algorithmically generated through CPU-based computation and manipulated in real time on a CPU, through the aforementioned SHAPE hardware components, or both.

Compression Formats

Durango offers hardware decompression support for both XMA2 and xWMA, both of which provide significant storage, bandwidth, and memory reductions over uncompressed PCM. XAudio2 also offers software support for ADPCM (Adaptive Differential Pulse Code Modulation). Although ADPCM decoding has low computational overhead, as a non-perceptual codec it can exhibit noticeable artifacts at lower sampling rates.


Format  Compression (approximate)  Durango                            Xbox 360             Loop capability
------  -------------------------  ---------------------------------  -------------------  ------------------------
PCM     None                       Yes                                Yes                  Arbitrary
ADPCM   3.5-4:1                    (Software)                                              Block aligned
XMA2    6-14:1                     Yes (512 Hardware)                 Yes (320 Hardware)   128 sample-aligned
xWMA    20-40:1                    Yes (Hardware + Software support)  Yes (Software only)  End to end only, may gap
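
To see what the approximate ratios above mean in practice, the following sketch estimates the size of one minute of audio in each compressed format. The source format (mono, 16-bit, 48 kHz PCM) is an assumption made for this example, and real encoder output varies with content and quality settings.

```python
# Estimated compressed size of one minute of audio, using the approximate
# compression ratios quoted in this article. The uncompressed source format
# (mono, 16-bit, 48 kHz) is an assumption for illustration.

PCM_BYTES_PER_SECOND = 48_000 * 2  # 48,000 samples/s * 2 bytes per sample

APPROX_RATIOS = {
    "ADPCM": (3.5, 4.0),
    "XMA2": (6.0, 14.0),
    "xWMA": (20.0, 40.0),
}


def compressed_size_range(seconds: float, ratio_range):
    """Return (smallest, largest) estimated size in bytes."""
    raw_bytes = seconds * PCM_BYTES_PER_SECOND
    low_ratio, high_ratio = ratio_range
    return raw_bytes / high_ratio, raw_bytes / low_ratio


print(f"PCM: {60 * PCM_BYTES_PER_SECOND / 1024:.0f} KiB per minute")
for name, ratios in APPROX_RATIOS.items():
    smallest, largest = compressed_size_range(60, ratios)
    print(f"{name}: {smallest / 1024:.0f} to {largest / 1024:.0f} KiB per minute")
```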

Additional audio formats for game assets (for instance, MP3 or OGG) can be provided through title or middleware software codecs running on a CPU.

Audio and the Durango App Model

While in the foreground, an application has full access to the SHAPE hardware. When that application is pushed to the background (pinned, picture-in-picture, or other scenarios), it relinquishes hardware control. By default, its hardware state is suspended, and resumes when the title returns to the foreground. This also is true for Exclusive Resource Applications [ERAs] where the software graph is suspended.

A title may optionally choose to tear down its audio graph and reconstruct it upon resume. Some titles, particularly Shared Resource Applications [SRAs] that play background music such as streaming radio, may choose to have some aspects of audio continue to play even while paused. For these scenarios, titles should closely evaluate whether to attempt a seamless transition from hardware to software rendering, or to always play audio intended for background playback via a software-only pipeline. This has implications for compression formats and CPU costs. XMA-compressed assets, for example, require the use of SHAPE hardware, and thus will not be decodable for a background application.
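
The planning question described above can be sketched as a small decision function. Everything here is hypothetical: the names are invented for illustration and do not correspond to any Durango or XAudio2 API. The format sets simply restate what this article says about which formats have hardware and software decode paths.

```python
from enum import Enum


class AppState(Enum):
    FOREGROUND = "foreground"
    BACKGROUND = "background"


# Decode paths per format as described in this article (illustrative only).
HARDWARE_FORMATS = {"XMA2", "xWMA"}          # decoded by dedicated hardware
SOFTWARE_FORMATS = {"PCM", "ADPCM", "xWMA"}  # playable through a CPU pipeline


def choose_decode_path(fmt: str, state: AppState):
    """Return "hardware", "software", or None if the asset cannot play."""
    # Hardware is only assumed to be available while in the foreground.
    if state is AppState.FOREGROUND and fmt in HARDWARE_FORMATS:
        return "hardware"
    if fmt in SOFTWARE_FORMATS:
        return "software"
    return None  # for example, XMA2 while the app is in the background


for fmt in ("PCM", "ADPCM", "XMA2", "xWMA"):
    print(fmt,
          choose_decode_path(fmt, AppState.FOREGROUND),
          choose_decode_path(fmt, AppState.BACKGROUND))
```

Under these assumptions, audio meant to keep playing in the background would need to be shipped in a format with a software path rather than as XMA.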

The XAudio2 audio engine does provide software pathways for many functions if a title chooses to allocate CPU resources. Where practical, these functions mimic hardware capabilities, but some compute-intensive processing is either unavailable or implemented differently in software. Titles that transition from hardware processing to software processing based on an app’s state may want to consider these differences when planning their audio pipelines.


Feature                 Durango Hardware Capability                                Durango Software Capability                                    Equivalent?
----------------------  ---------------------------------------------------------  -------------------------------------------------------------  ----------------
Sample Rate Conversion  (SRC) polyphase                                            XAudio2 linear interpolation                                   No
Parametric EQ           (EQ/CMP) 3-band EQ                                         3-band EQ, simple one-band, or custom DSP                      Yes
Compressor/Limiter      (EQ/CMP) Hard-knee, side chain, and expander capabilities  Hard-knee, side chain, and expander capabilities               Yes
Filtering               (FLT/VOL) State variable filter                            XAudio2 state variable filter, single-pole LPF, or custom DSP  Yes
Mixing                  (Mix Buffers) Includes clip detection, metering            Software mixing; custom DSP for clip detection or metering     Yes (for mixing)
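
The one row marked as not equivalent is sample rate conversion. As a rough illustration of the software side, here is a generic linear-interpolation resampler. It is a sketch of the general technique only, not XAudio2's actual code, and the function name is hypothetical. A polyphase converter instead applies a bank of low pass filter phases, which generally rejects aliasing and imaging better than straight-line interpolation between neighbouring samples.

```python
def resample_linear(samples, ratio):
    """Resample a mono signal by linear interpolation.

    ratio is input_rate / output_rate: values above 1.0 produce fewer
    output samples (pitch up), values below 1.0 produce more (pitch down).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if len(samples) < 2:
        return list(samples)

    last = len(samples) - 1
    output = []
    position = 0.0  # fractional read position in the input
    while position <= last:
        index = int(position)
        fraction = position - index
        following = samples[min(index + 1, last)]
        # Weighted average of the two nearest input samples.
        output.append(samples[index] * (1.0 - fraction) + following * fraction)
        position += ratio
    return output


print(resample_linear([0.0, 1.0, 2.0], 0.5))            # [0.0, 0.5, 1.0, 1.5, 2.0]
print(resample_linear([0.0, 1.0, 2.0, 3.0, 4.0], 2.0))  # [0.0, 2.0, 4.0]
```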

Durango Audio Libraries

Durango supports two audio rendering APIs for typical game use along with a variant of the Windows 8 Media Foundation API for playback of user music:

1.       XAudio2, a game-focused audio library already available on Xbox 360 and Windows operating systems (Windows XP to Windows 8), is generally recommended for most title development.

2.       WASAPI (Windows Audio Session API) can be used for any custom, exclusively software-implemented pipeline. WASAPI provides audio endpoint functionality only. Decompression, sample-rate conversion, mixing, and digital-signal processing, as well as interactions with Durango’s audio hardware components, must be implemented by the client. WASAPI is most typically used by audio middleware solutions.

The Microsoft Cross-Platform Audio Creation Tool (XACT) and DirectSound are not supported in the Durango environment. Titles that previously used these technologies should consider the solutions identified above, or use approved Durango audio middleware options.


  • PLAYSTATION

    It’s all depending on the PS4 hype that our sheeps are working for 24/7, we at Sony gauge their reactions on gaming sites for data mining. The more hype, the better PS4 will be. The banks will lend us enough money to produce the PS4 on the basis of potential ‘crowds’.

    So if you’d like to see the PS4 winning, then go ahead and start typing the usual “Day 1 buy for me” comments to create a perception of high demand. Comments section is our best marketing tool, so why spend tons of money on Ads? We at Sony believe that the sheep squad will voluntarily do the marketing for us like 1.000.000x better in hyping.

    Also, start making more articles with tactical questions (e.g: 399 reasons why the Playstation® 4 will rule the competition) or controlled questions (e.g: Why faster RAM in the PS4 is key to win the next generation war?)

    Sony want the Playstation® to be the focus of attention, to eclipse all competitors.

    SONY
    Make.Believe

    • Thedon82z1

      I am not a technical expert( although I am very well knowleaged on the subject) but this just sounds like more complicated tricks devlopers on the next xbox are going to have to learn, to keep there games from bottlenecking up due to the limited bandwith that the next xbox has because of the ddr3 ram (if the rumors are true)!!!!! Anyway PS4 DAY1 PURCHASE FOR ME!!!!!

  • Sigrid Nilsen

    Looks like VGLeaks have some decent source.

  • John Galt

    What I want to know is if it will support bitstreaming of TrueHD and DTS-HD so that we can finally have non-sucky sound in movies, because if they put it into the xbox it will likely come to Windows 8 since there is 0 cost to bitstreaming them because you’re not decoding.

    • bickle2

      It’s already in Windows in Blu-ray playback programs, and has been pretty much since the format’s inception and since 720 will play Blu-ray it will bitstream it when it’s playing Blu

      If you’re referring to doing it over optical or coax, that’s not going to happen because the transceivers in those connections don’t support it. This document specifically is discussing game audio, and not movie playback. Since it contains HDMI 1.4a, it supports all lossless codecs. Encoding game audio to THD or DTSMA would be a waste of system resources and a huge burden

      • John Galt

        I’m talking about native support built in so that all apps in the store can take advantage. Same as MKV should be natively supported as well.

        • bickle2

          It’s in the hardware. Anyone can write an app to support it. They may need licenses to get the keys to the DRM system that is outside of MS’s control.

          MKV is a piracy fomat, there is no reason for them to support it.

          So in other words, it’s what I suspected initially. You want them to enable you to watch rips of Blu-rays on their console with the same quality as those who paid for it. Watch legitimate Blu-rays on legitimate hardware/software, and you can stream just fine. And if you’re too lazy to change a disc every 2 hours (the most common reason people claim they totally ripped it themselves), you have bigger problems.

          • John Galt

            No, I want them to enable me ripping my own blu rays that I own and playing them back instead of having to pay over and over again for the privilege. As it stands right now there are no file based players that support 7.1 sound and Windows 8 doesn’t support playback at all unless you write custom media foundation splitters and decoders. Given that Windows 7 and legacy apps in Windows 8 all support bitstreaming TrueHD and DTS-HD out of the box, there is no excuse.

            As for MKV being a piracy format, think again. Google’s video standard is just a subset of MKV. It just happens that MKV is a free, open source container format that works very well, better than Apple’s MP4 format by a long shot, and thus is excellent at containing my legal rips of my own blurays so that I don’t have to stick the damn discs in the drive all of the time.

          • bickle2

            You are not charged to play your own Blu-rays. Insert the disc and it plays.

            Access to the hardware to bitstream lossless audio requires a license for the keys to the protected pathways, so yes, you cannot bitstream your ripped movies. I don’t feel bad for pirates at all.

            MKV is a file container, the contents inside yes, share many pieces with other formats. However, only pirates use MKV

            If you live in the United States, there is no such thing as a “legal rip”. The second you bypass DRM it’s not legal.If you are that lazy that you cannot change discs, then please seek professional help. This is not music where you play a song or two and have to change CDs. This is a two-hour experience we’re looking at here. And besides, it costs you multiple dollars of hard drive space to even store a Blu-ray, what a waste

          • John Galt

            Incorrect. The right of fair use allows me to rip, it also allows me to use whatever means necessary to make that rip, and it allows others to disseminate that information under the 1st amendment so long as they do not profit from doing so. (which is why all companies that provide the tools to circumvent the DRM on BluRay and DVD give away the decryption tools). The Supreme court ruled as such in 2009 and the library of congress whom is responsible for administering copyright has reflected as much repeatedly.

            And no, you do not need a licence to bitstream audio because bitstreaming is no different than copying a file. You only need a licence for DECODING said bitstream into audio. (Also already settled in court, as Dolby had their case thrown out against MS for doing so in Windows XP.)

            Further, your statement about MKV is wrong, as I’ve stated, as Google’s Video Codec (VP8) IS MKV and is used thousands of times a second by people all over the world to watch video encoded by google (including streams from youtube served to Firefox and Chrome browsers).

            Thus you’re completely wrong in every respect of the law and use.

            Anyone that legally rips their DVD or Blu Ray would be a fool to choose any format other than MKV because of MP4’S massive limitations. And remember, the only reason why MP4 is used is because apple created it, patented it, makes money off of it and it includes DRM capabilities to protect itunes and other stores like Windows 8’s movie crap.

            Thus it is entirely my right to rip my blu rays at my convenience to make backup copies and to use them whenever I wish so long as I don’t’ distribute them. I exercise this right to alleviate the inconvenience of putting in Blu Rays constantly and waiting 5-10 minutes to get through the load process and all of the crap at the beginning of the movies that I’m otherwise not allowed to skip.

            If you don’t want to do so, that’s up to you, but save your sanctimonious crap for someone else.

            In the meantime, I repeat my request for native MKV media foundation splitters and the ability to bitstream high definition (and even AC-3!) audio to my receiver for decoding. Here’s hoping the next xbox has it, and thus Windows Blue will also likely have it.

          • bickle2

            I’m going to explain this again, very slowly

            1- The Digital Millenium Copyright Act makes illegal the breaking of any encryption to access copyrighted material, even for “fair use”.

            2- The first amendment guaranteees that “Congress shall make no law abridging freedom of speech or the press” Preventing you from ripping your discs in no way violates any law. There is no “non-profit” exemption for free speech. You can tell people how to do it all you want. The second you make a tool to do so, it’s no different than a lockpicking set in the eyes of the law, and you can be sued, since IP is an civil matter.

            3- All companies do not give away the decryption tools. SlySoft and DVDFab both have proprietary means of bypassing security like BD+ they don’t share wit hthe other children. Both sell their products, and at least SlySoft is financed by the Chinese Triads to keep their bootlegging operation going.

            4-What Supreme Court case are you citing “in 2009”?

            5-Yes, you do need a license to bitstream lossless audio, because otherwise you cannot access the protected layer of HDMI. Since no receiver will accept said formats bitstreamed any other way, if you don’t have the keys, you’re boned.

            6- You do not understand digital video formats or their owners. MKV is a container format, not a video codec. MP4 is the video codec. QuickTime is Apple’s container format. WMV/AVI are Microsoft’s. Blu-ray is a container format.

            7- Distribution is immaterial. See DMCA

            8-You will not get your pirate format on MS’s new console, which is highly focused on selling you overpriced movie downloads. In the event that say, VLC is ported to the 720, then they STILL won’t let you bitstream your audio

            You don’t know what you’re talking about.

          • John Galt

            1. Wrong. The digital millenium copyright act makes it a felony for you to share your technique for doing so. It does not make it illegal for you to do it so long as you’re not the one doing the reverse engineering of the encryption yourself.

            2. I never said it did.

            3. http://www.dvdfab.com/hd-decrypter.htm Notice that it’s free to download. Thus you’re wrong.

            4. I don’t care to bother to prove you wrong again, you can do the foot work.

            5. You need a single licence at point of decoding. If you have a 7.1 DTS-MA, DTS, TrueHD, AC-3 (DD+) receiver, it already has the licence. You do not need a licence to bitstream (which is to say copy the file or part of the m2ts file that contains the audio track in the case of bluray) from one device or disc to a device that can decode it. If you have questions, see FFMPEG that didn’t include AAC, which is not bitstreamed and needs to be decoded until they wrote their own non-ip encumbered decoder, and even that is still not anything but beta quality. But they do fully support TrueHD and DTS-MA out of the box without legal issues.

            6. I sure do. VP8 is the codec developed by google, and their “free open source” container is a subset of MKV. BluRay is NOT a container format. It is a physic transport layer. BluRay utilizes m2ts as it’s container.

            7. No it isn’t. See previous proof.

            8. I could, if I had time write my own MF splitter for Windows 8 and give it to companies to include in their apps right now so that they could support MKV. I suspect that VLC will include this and we’ll see native MKV support soon in essentially all apps as a result.

            I could write my own bitstreamer for Windows 8 if MS would open up bitstream support to the HDMI channel instead of how it works now which requires all audio to be decoded to PCM or LPCM before transport over HDMI. It is this single limitation that prevents the bitstreaming of audio without licence. As soon as MS opens up the bitstream channel for HDMI, which Windows 8 supports, just Windows 8 apps don’t, then I could give away my own bitstream functionality to all devs and they could include it and not have to pay royalties.

            But of course you don’t know what you’re talking about, so you don’t realize that the reason why only Bluray playing software works is because MS is forcing decode of audio right now, and thus no one is bitstreaming anything in Windows 8 other than PCM/LPCM and as soon as that’s fixed, there will be no limitation to bitstreaming to receivers that are licenced to decode.

          • John Galt

            As for your cost BS, it costs about $1.50 to store a 14 gb H264 encoded movie with High def audio tracks. For the sake of having all of the meta-data, and convenience for my child that I wouldn’t trust to handle Blu Rays, and my wife who hates all of the discs being around, it’s more than worth it.

            So get yourself educated before you open your mouth. You look like more than a fool.

          • Christopher Walsh

            Ummmmm: http://www.pivosgroup.com/aios.html

            I have that player and it bitstreams TrueHD and DTSHD MA.

          • John Galt

            The bluray players in the store all decode TrueHD and DTS MA in Windows 8 and send it lossless over HDMI. Thus they have to pay for a licence to do so because they decode the audio.

            What I’m asking for is the ability to bitstream the audio file/track from the m2ts, mkv, or mp4 file over the wire just like copying a file and have the receiver do the decoding. This isn’t possible at this time in Windows 8 store apps. (it is possible in legacy apps of course).

            What I really want them to do is include the Media Foundation splitters for MKV (or what MS is thinking about doing at my urging which is create an open source project on code project for it) and then enable bitstream over HDMI without decode. Once they do that, Windows 8 modern apps will have the same power as desktop apps with LAV installed.

  • GregJG

    PS4 Day one for Me!

    • poupou

      well no one is perfect

  • Tony

    It’s funny that Sony fanboys feel the need to come to a Durango post to comment how they’re only going to buy the PS4.. Worried much :)

    • Joe Weider

      Well.. Now that the Xbox One is revealed , which one are you gonna’ buy ?

      • lolx19

        The Xbox

  • Maynard_VGL

    Derek, that’s not the kind of comment we want in Vgleaks.

  • wint3rmute

    This is an article about audio tech, whats wrong with you.

  • iDon’tWantToShareMyDetails

    Great, about damn time to see a movement in the Audio department. Too bad its proprietary and thus the titles supporting it would be a lot less.

  • scrapplejoe

    Stoked for the new Xbox…. Ill buy it first, then the ps4 maybe a year later.. fanyboys are funny… thank god i have $$ to buy all systems =)

  • Shulk

    I’ll stick to my PS3 mostly since my Xbox 360 is RROD but will pick up another one in due time.

  • Hairee Pothead

    The PC-Twins destroy the Wii U in sound as well. The Wii U has a 120MHZ 32 Bit version of the 16 bit 81MHZ Macronix DSP found in the GameCube and Wii. The Wii U can only do around 100 voices plus effects which is only slightly better than the 64 voices that GameCube and Wii could do. The PC-Twins have more powerful sound chips that can both exceed 256 voices,