Audio details of DVD-Video

The following details are for audio tracks in DVD-Video. Some DVD manufacturers such as Pioneer are developing audio-only players using the DVD-Video format. Some DVD-Video discs contain mostly audio with only still pictures.

A DVD-Video disc can have up to 8 audio tracks (streams) associated with each video track (or each video angle). Each audio track can be in one of three formats:

  • Dolby Digital (AC-3): 1 to 5.1 channels

  • MPEG-2 audio: 1 to 5.1 or 7.1 channels

  • PCM: 1 to 8 channels

Two additional optional formats are provided: DTS and SDDS. Both require the appropriate decoders and are not supported by all players.

The ".1" refers to a low-frequency effects (LFE) channel that connects to a subwoofer. This channel carries an emphasized bass audio signal.

Linear PCM is uncompressed (lossless) digital audio, the same format used on CDs and most studio masters. It can be sampled at 48 or 96 kHz with 16, 20, or 24 bits/sample. (Audio CD is limited to 44.1 kHz at 16 bits.) There can be from 1 to 8 channels. The maximum bit rate is 6.144 Mbps, which limits sample rates and bit sizes when there are 5 or more channels. It's generally felt that the 120 dB dynamic range of 20 bits combined with a frequency response of around 22,000 Hz from 48 kHz sampling is adequate for high-fidelity sound reproduction. However, additional bits and higher sampling rates are useful in audiophile applications, studio work, noise shaping, advanced digital processing, and three-dimensional sound field reproduction. DVD players are required to support all the variations of LPCM, but many subsample 96 kHz down to 48 kHz, and some may not use all 20 or 24 bits. The signal provided on the digital output for external digital-to-analog converters may be limited to less than 96 kHz and less than 24 bits.
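
The 6.144 Mbps ceiling is easy to check with arithmetic: the raw LPCM bit rate is simply channels × sample rate × bits per sample. The Python sketch below lists the largest channel count that fits for each combination mentioned above. The function and constant names are illustrative, not from the DVD spec, and the sketch ignores any additional rules the spec may impose, so treat it as a budget check only.

```python
# Sketch: raw LPCM bit rate versus the 6.144 Mbps DVD-Video limit.
# Only the limit, sample rates, and bit depths come from the text above;
# all names are illustrative.
MAX_LPCM_BPS = 6_144_000
SAMPLE_RATES_HZ = (48_000, 96_000)
BIT_DEPTHS = (16, 20, 24)


def lpcm_bitrate(channels: int, sample_rate_hz: int, bits: int) -> int:
    """Raw (uncompressed) bit rate in bits per second."""
    return channels * sample_rate_hz * bits


def max_channels(sample_rate_hz: int, bits: int) -> int:
    """Largest channel count (capped at 8) that fits under the limit."""
    return min(8, MAX_LPCM_BPS // (sample_rate_hz * bits))


if __name__ == "__main__":
    for rate in SAMPLE_RATES_HZ:
        for bits in BIT_DEPTHS:
            n = max_channels(rate, bits)
            mbps = lpcm_bitrate(n, rate, bits) / 1e6
            print(f"{rate // 1000} kHz / {bits} bit: "
                  f"up to {n} channels ({mbps:.3f} Mbps)")
```

Only 48 kHz at 16 bits leaves room for all 8 channels; every other combination has to give up channels to stay under the limit.
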
Dolby Digital is multi-channel digital audio, using lossy AC-3 coding technology from a PCM source with a sample rate of 48 kHz at up to 24 bits. The bitrate is 64 kbps to 448 kbps, with 384 or 448 being the normal rate for 5.1 channels and 192 being the typical rate for stereo (with or without surround encoding). (Most Dolby Digital decoders support up to 640 kbps, so non-standard discs with 640 kbps tracks play on many players.) The channel combinations are (front/surround): 1/0, 1+1/0 (dual mono), 2/0, 3/0, 2/1, 3/1, 2/2, and 3/2. The LFE channel is optional with all 8 combinations. For details see ATSC document A/52. Dolby Digital is the format used for audio tracks on almost all DVDs.
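
To see how the front/surround notation maps onto the familiar "5.1"-style labels, here is a small Python sketch. The list of combinations comes from the paragraph above; the helper names are illustrative only.

```python
# Sketch: turn Dolby Digital front/surround combinations into "N.1" labels.
# The combinations are the ones listed in the text; names are illustrative.
COMBOS = ["1/0", "1+1/0", "2/0", "3/0", "2/1", "3/1", "2/2", "3/2"]


def full_range_channels(combo: str) -> int:
    """Count the full-bandwidth channels in a front/surround string."""
    front, surround = combo.split("/")
    # "1+1" (dual mono) is two independent mono channels.
    return sum(int(n) for n in front.split("+")) + int(surround)


def label(combo: str, lfe: bool) -> str:
    """Shorthand label, e.g. 3/2 with LFE becomes "5.1"."""
    return f"{full_range_channels(combo)}.{1 if lfe else 0}"


if __name__ == "__main__":
    for combo in COMBOS:
        print(combo, "->", label(combo, lfe=False), "or", label(combo, lfe=True))
```

With the LFE channel optional on each of the 8 combinations, that makes 16 possible layouts in all.
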
MPEG audio is multi-channel digital audio, using lossy compression from original PCM format with a sample rate of 48 kHz at 16 or 20 bits. Both MPEG-1 and MPEG-2 formats are supported. The variable bit rate is 32 kbps to 912 kbps, with 384 being the normal average rate. MPEG-1 is limited to 384 kbps. Channel combinations are (front/surround): 1/0, 2/0, 2/1, 2/2, 3/0, 3/1, 3/2, and 5/2. The LFE channel is optional with all combinations. The 7.1 channel format adds left-center and right-center channels, but is rare for home use. MPEG-2 surround channels are in an extension stream matrixed onto the MPEG-1 stereo channels, which makes MPEG-2 audio backwards compatible with MPEG-1 hardware (an MPEG-1 system will only see the two stereo channels). MPEG Layer 3 (MP3) and MPEG-2 AAC (also known as NBC or unmatrix) are not supported by the DVD-Video standard. MPEG audio is not used much on DVDs, although some inexpensive DVD recording software programs use MPEG audio, even on NTSC discs, which goes against the DVD standard and is not supported by all NTSC players.
DTS (Digital Theater Systems) Digital Surround is an optional multi-channel digital audio format, using lossy compression from PCM at 48 kHz at up to 24 bits. The data rate is from 64 kbps to 1536 kbps, with typical rates of 754.5 and 1509.25 kbps for 5.1 channels and 377 or 754 kbps for 2 channels. (The DTS Coherent Acoustics format supports up to 4096 kbps variable data rate for lossless compression, but this isn't supported by DVD. DVD also does not allow DTS sampling rates other than 48 kHz.) Channel combinations are (front/surround): 1/0, 2/0, 3/0, 2/1, 2/2, 3/2. The LFE channel is optional with all combinations. DTS-ES supports 6.1 channels in two ways: 1) a Dolby Surround EX compatible matrixed rear center channel, 2) a discrete 7th channel. DTS also has a 7.1-channel mode (8 discrete channels), but no DVDs have used it yet. The 7-channel and 8-channel modes require a new decoder. The DVD standard includes an audio stream format reserved for DTS, but many older players ignore it. The DTS format used on DVDs is different from the one used in theaters (which is Audio Processing Technology's apt-X, an ADPCM coder, not a psychoacoustic coder). All DVD players can play DTS audio CDs, since the standard PCM stream holds the DTS code. See 1.32 for general DTS information. For more information, read Adam Barratt's article.
SDDS (Sony Dynamic Digital Sound) is an optional multi-channel (5.1 or 7.1) digital audio format, compressed from PCM at 48 kHz. The data rate can go up to 1280 kbps. SDDS is a theatrical film soundtrack format based on the ATRAC compression format that is also used by MiniDisc. Sony has not announced any plans to support SDDS on DVD.
THX (Tomlinson Holman Experiment) is not an audio format. It's a certification and quality control program that applies to sound systems and acoustics in theaters, home equipment, and digital mastering processes. The LucasFilm THX Digital Mastering program uses a patented process to track video quality through the multiple video generations needed to make a final format disc or tape, setup of video monitors to ensure that the filmmaker is seeing a precise rendition of what is on tape before approval of the master, and other steps along the way. THX-certified "4.0" amplifiers enhance Dolby Pro Logic in the following ways: a crossover that sends bass from front channels to subwoofer; re-equalization on front channels (to compensate for high-frequency boost in theater mix designed for speakers behind the screen); timbre matching on rear channels; decorrelation of rear channels; a bass curve that emphasizes low frequencies. THX-certified "5.1" amplifiers enhance Dolby Digital and improve on 4.0 in the following ways: rear speakers are full range, so the crossover sends bass from both front and rear to the subwoofer; decorrelation is turned on automatically when rear channels have the same audio, but not during split-surround effects, which don't need to be decorrelated. More info at Home THX Program Overview.

Discs containing 525/60 (NTSC) video must use PCM or Dolby Digital on at least one track. Discs containing 625/50 (PAL/SECAM) video must use PCM or MPEG audio or Dolby Digital on at least one track. Additional tracks may be in any format. A few first-generation players, such as those made by Matsushita, can't output MPEG-2 audio to external decoders.

The original DVD-Video spec required either MPEG audio or PCM on 625/50 (PAL) discs. There was a brief scuffle led by Philips when early discs came out with only two-channel MPEG and multi-channel Dolby Digital, but the DVD Forum clarified in May of 1997 that only stereo MPEG audio was mandatory for 625/50 discs. In December 1997 the lack of MPEG-2 encoders (and decoders) was a big enough problem that the spec was revised to allow Dolby Digital audio tracks to be used on 625/50 discs without MPEG audio tracks.

Because of the 4% speedup from 24 fps film to 25 fps PAL display, the audio must be adjusted to match before it is encoded. Unless the audio is digitally processed to shift the pitch back to normal it will be slightly high (about 0.7 of a semitone).
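
The size of the pitch shift follows directly from the frame-rate ratio: playing material 25/24 times faster raises every frequency by the same factor, and a factor of r corresponds to 12 × log2(r) equal-tempered semitones. The Python sketch below works this out (names are illustrative); the result is a little over half a semitone.

```python
import math

# Sketch: pitch and running-time change from the 24 fps -> 25 fps speedup.
FILM_FPS = 24
PAL_FPS = 25

speedup = PAL_FPS / FILM_FPS                      # roughly 1.04
pitch_shift_semitones = 12 * math.log2(speedup)   # semitones above original


def pal_running_time(film_minutes: float) -> float:
    """Running time after speedup, in minutes."""
    return film_minutes / speedup


if __name__ == "__main__":
    print(f"speedup: {(speedup - 1) * 100:.2f}%")
    print(f"pitch shift: {pitch_shift_semitones:.2f} semitones")
    print(f"a 100-minute film runs {pal_running_time(100):.1f} minutes")
```
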

For stereo output (analog or digital), all players have a built-in 2-channel Dolby Digital decoder that down-mixes from 5.1 channels (if present on the disc) to Dolby Surround stereo. That is, 5 channels are phase matrixed into 2 channels to be decoded to 4 channels by a Dolby Pro Logic processor or 5 channels by a Pro Logic II processor. PAL players also have an MPEG or MPEG-2 audio decoder. Both Dolby Digital and MPEG-2 support 2-channel Dolby Surround as the source in cases where the disc producer can't or doesn't want to remix the original onto discrete channels. This means that a DVD labeled as having Dolby Digital sound may only use the L/R channels for surround or "plain" stereo. Even movies with old monophonic soundtracks may use Dolby Digital with only 1 or 2 channels. Some players can optionally downmix to non-surround stereo. If surround audio is important to you, you will hear significantly better results from multichannel discs if you have a Dolby Digital system.

The new Dolby Digital Surround EX format (DD-EX), which adds a rear center channel, is compatible with DVD discs and players, and with existing Dolby Digital decoders. The new DTS-ES Matrix format, which likewise adds a rear center channel, works with existing DTS decoders and with DTS-compatible DVD players. However, for full use of either new format you need a new decoder to extract the rear center channel, which is phase matrixed into the two standard rear channels in the same way Dolby Surround is matrixed into standard stereo channels. Without a new decoder you'll get the same 5.1-channel audio you get now. Because the additional rear channel isn't a full-bandwidth discrete channel, it's appropriate to call the new formats "5.2-channel" digital surround. There is also DTS-ES Discrete, which adds a full-bandwidth discrete rear center channel in an extension stream that is used by DTS-ES Discrete decoders but ignored by older DTS decoders. DTS-ES decoders include DTS Neo:6, which is not an encoding format but a matrix decoding process that provides 5 or 6 channels.

The Dolby Digital downmix process does not usually include the LFE channel and may compress the dynamic range in order to improve dialog audibility and keep the sound from becoming "muddy" on average home audio systems. This can result in reduced sound quality on high-end audio systems. The downmix is auditioned when the disc is prepared, and if the result is not acceptable the audio may be tweaked or a separate L/R Dolby Surround track may be added. Experience has shown that minor tweaking is sometimes required to make the dialog more audible within the limited dynamic range of a home stereo system. Some disc producers include a separately mixed stereo track rather than fiddle with the surround mix.

The Dolby Digital dynamic range compression (DRC) feature, often called midnight mode, reduces the difference between loud and soft sounds so that you can turn the volume down to avoid disturbing others yet still hear the detail of quiet passages. Some players have the option to turn off DRC.

Dolby Digital also includes a feature called dialog normalization (DN), which should more accurately be called volume standardization. DN is designed to keep the sound level the same when switching between different sources. This will become more important as additional Dolby Digital sources (digital satellite, DTV, etc.) become common. Each Dolby Digital track contains loudness information so that the receiver can automatically adjust the volume, turning it down, for example, on a loud commercial. (Of course the commercial makers can cheat and set an artificially low DN level, causing your receiver to turn up the volume during the commercial.) Turning DN on or off on your receiver has no effect on dynamic range or sound quality; its effect is no different than turning the volume control up or down.

All five DVD-Video audio formats support karaoke mode, which has two channels for stereo (L and R) plus an optional guide melody channel (M) and two optional vocal channels (V1 and V2).

A DVD-5 with only one surround stereo audio stream (at 192 kbps) can hold about 54 hours of audio. A DVD-18 can hold nearly 200 hours.
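
These playing times are just disc capacity divided by bit rate. The Python sketch below shows the arithmetic. It assumes nominal capacities of 4.7 billion bytes for DVD-5 and 17.08 billion bytes for DVD-18 (figures not stated in this section) and ignores file system and multiplexing overhead, so the results are only a rough guide and land close to the figures quoted above.

```python
# Sketch: hours of audio on a disc at a constant bit rate.
# ASSUMPTION: nominal capacities in billions of bytes, not taken from
# this section; overhead is ignored.
DVD5_BYTES = 4.7e9
DVD18_BYTES = 17.08e9
STEREO_BPS = 192_000   # 192 kbps stream, from the text


def hours_of_audio(capacity_bytes: float, bitrate_bps: float) -> float:
    """Playing time in hours for a single stream filling the disc."""
    return capacity_bytes * 8 / bitrate_bps / 3600


if __name__ == "__main__":
    print(f"DVD-5:  {hours_of_audio(DVD5_BYTES, STEREO_BPS):.1f} hours")
    print(f"DVD-18: {hours_of_audio(DVD18_BYTES, STEREO_BPS):.1f} hours")
```
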

For more information about multichannel surround sound, see Bobby Owsinski's FAQ.

DVD-Audio and SACD

LPCM is mandatory in DVD-Audio discs, with up to 6 channels at sample rates of 48/96/192 kHz (also 44.1/88.2/176.4 kHz) and sample sizes of 16/20/24 bits. This allows theoretical frequency response of up to 96 kHz and dynamic range of up to 144 dB. Multichannel PCM is downmixable by the player, although at 192 and 176.4 kHz only two channels are available. Sampling rates and sizes can vary for different channels by using a predefined set of groups. The maximum data rate is 9.6 Mbps.
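
The "theoretical" figures come from two textbook rules of thumb: the highest representable frequency is half the sampling rate (the Nyquist limit), and each bit of sample size adds about 6 dB of dynamic range. A minimal Python sketch, with illustrative function names:

```python
import math


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2


def dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in dB (~6.02 dB/bit)."""
    return 20 * math.log10(2 ** bits)


if __name__ == "__main__":
    for rate in (44_100, 48_000, 96_000, 192_000):
        print(f"{rate} Hz sampling -> up to {nyquist_hz(rate):.0f} Hz")
    for bits in (16, 20, 24):
        print(f"{bits} bits -> about {dynamic_range_db(bits):.0f} dB")
```

Real converters and analog electronics fall well short of the 24-bit figure, which is why the text calls these limits theoretical.
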

The DVD Forum's Working Group for audio (WG4) decided to include lossless compression, and on August 5, 1998 approved Meridian's MLP (Meridian Lossless Packing) scheme, licensed by Dolby. MLP removes redundancy from the signal to achieve a compression ratio of about 2:1 while allowing the PCM signal to be completely recreated by the MLP decoder that's required in all DVD-Audio players. MLP allows playing times of about 74 to 135 minutes of 6-channel 96-kHz/24-bit audio on a single layer (compared to 45 minutes without packing). Two-channel 192-kHz/24-bit playing times are about 120 to 140 minutes (compared to 67 minutes without packing).
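
The "without packing" times can be roughly reproduced from the raw PCM bit rate. The Python sketch below assumes a nominal single-layer capacity of 4.7 billion bytes (not stated in this section) and ignores overhead, so expect answers within a minute or so of the figures above. It also shows that 6-channel 96-kHz/24-bit PCM exceeds the 9.6 Mbps maximum data rate on its own, which is one reason packing matters.

```python
# Sketch: unpacked PCM playing time on a single-layer disc.
# ASSUMPTION: 4.7e9 bytes nominal capacity; overhead is ignored.
SINGLE_LAYER_BYTES = 4.7e9
MAX_DVD_AUDIO_BPS = 9_600_000   # 9.6 Mbps limit, from the text


def raw_bitrate(channels: int, sample_rate_hz: int, bits: int) -> int:
    """Uncompressed PCM bit rate in bits per second."""
    return channels * sample_rate_hz * bits


def playing_minutes(capacity_bytes: float, bitrate_bps: float) -> float:
    """Playing time in minutes at a constant bit rate."""
    return capacity_bytes * 8 / bitrate_bps / 60


six_ch = raw_bitrate(6, 96_000, 24)     # 6-channel 96 kHz / 24 bit
two_ch = raw_bitrate(2, 192_000, 24)    # 2-channel 192 kHz / 24 bit

# Minimum packing ratio needed just to fit under the data-rate limit.
min_ratio_six_ch = six_ch / MAX_DVD_AUDIO_BPS

if __name__ == "__main__":
    print(f"6 ch: {six_ch / 1e6:.3f} Mbps, "
          f"{playing_minutes(SINGLE_LAYER_BYTES, six_ch):.0f} min unpacked")
    print(f"2 ch: {two_ch / 1e6:.3f} Mbps, "
          f"{playing_minutes(SINGLE_LAYER_BYTES, two_ch):.0f} min unpacked")
    print(f"6 ch needs at least {min_ratio_six_ch:.2f}:1 packing")
```
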

Other audio formats of DVD-Video (Dolby Digital, MPEG audio, and DTS, described above) are optional on DVD-Audio discs, although Dolby Digital is required for audio content that has associated video. A subset of DVD-Video features (no angles, no seamless branching, etc.) is allowed. Most DVD-Audio players are also "universal" players that play DVD-Video discs as well.

DVD-Audio includes specialized downmixing features for PCM channels. Unlike DVD-Video, where the decoder determines how to mix from 6 channels down to 2, DVD-Audio includes coefficient tables to control mixdown and avoid volume buildup from channel aggregation. Up to 16 tables can be defined by each Audio Title Set (album), and each track can be identified with a table. Coefficients range from 0 dB to 60 dB. This feature goes by the horribly contrived name of SMART (system-managed audio resource technique). (Dolby Digital, supported in both DVD-Audio and DVD-Video, also includes downmixing information that can be set at encode time.)
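
To give a feel for what a coefficient table does, here is a rough Python sketch of a table-driven 6-to-2 downmix. Everything specific in it is an assumption for illustration: the table layout, the channel names, the values, and the reading of each coefficient as an attenuation in dB are not taken from the DVD-Audio spec.

```python
# Sketch of a table-driven 6-to-2 downmix. The table values are made up
# for illustration; they are NOT taken from the DVD-Audio spec.
# ASSUMPTION: each coefficient is an attenuation in dB (0 = full level).


def db_to_gain(attenuation_db: float) -> float:
    """Convert an attenuation in dB to a linear gain factor."""
    return 10 ** (-attenuation_db / 20)


# channel -> (attenuation into left, attenuation into right)
# None means the channel is not routed to that output.
HYPOTHETICAL_TABLE = {
    "L": (8, None),
    "R": (None, 8),
    "C": (11, 11),
    "LFE": (None, None),
    "Ls": (11, None),
    "Rs": (None, 11),
}


def downmix(frame: dict, table: dict) -> tuple:
    """Mix one multichannel sample frame down to a (left, right) pair."""
    left = right = 0.0
    for channel, sample in frame.items():
        to_left, to_right = table[channel]
        if to_left is not None:
            left += sample * db_to_gain(to_left)
        if to_right is not None:
            right += sample * db_to_gain(to_right)
    return left, right


if __name__ == "__main__":
    full_scale = {channel: 1.0 for channel in HYPOTHETICAL_TABLE}
    print(downmix(full_scale, HYPOTHETICAL_TABLE))
```

With these made-up values, even a frame with every channel at full scale stays just under full scale after mixing, which is the kind of volume buildup the tables are meant to prevent.
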

DVD-Audio can provide up to 99 still images per track (at typical compression levels about 20 images fit into the 2 MB buffer in the player), with a set of limited transitions (cut in/out, fade in/out, dissolve, and wipe). Unlike DVD-Video, the user can move at will through the slides without interrupting the audio as it plays: this is called a browsable slideshow. On-screen displays can be used for synchronized lyrics and navigation menus. A special simplified navigation mode can be used on players without a video display.

Sony and Philips are promoting SACD, a competing DVD-based format using Direct Stream Digital (DSD) encoding with a sampling rate of 2.8224 MHz. DSD is based on the pulse-density modulation (PDM) technique that uses single bits to represent the incremental rise or fall of the audio waveform. This supposedly improves quality by removing the brick wall filters required for PCM encoding. It also makes downsampling more accurate and efficient. DSD provides a frequency response from DC to over 100 kHz with a dynamic range of over 120 dB. DSD includes a lossless encoding technique that produces approximately 2:1 data reduction by predicting each sample and then run-length encoding the error signal. The maximum data rate is 2.8 Mbps.
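
The 2.8224 MHz figure is 64 times the 44.1 kHz CD sampling rate, and at one bit per sample that works out to about 2.8 Mbps per channel. The Python sketch below shows that arithmetic along with a toy pulse-density encoder. The encoder is a first-order sigma-delta modulator, a textbook way to produce a 1-bit stream whose density of ones follows the input level. It is only an illustration of the idea, not the modulator used by real DSD encoders, and all names are illustrative.

```python
# Sketch: DSD rate arithmetic plus a toy 1-bit pulse-density encoder.
CD_SAMPLE_RATE_HZ = 44_100
DSD_OVERSAMPLING = 64
DSD_SAMPLE_RATE_HZ = DSD_OVERSAMPLING * CD_SAMPLE_RATE_HZ  # one bit per sample


def pdm_encode(samples):
    """First-order sigma-delta modulator for inputs in the range -1..1.

    Returns a list of bits (0 or 1). The integrator accumulates the
    difference between the input and the previous output, so the
    density of 1 bits tracks the input level.
    """
    bits = []
    integrator = 0.0
    feedback = 0.0
    for x in samples:
        integrator += x - feedback
        bit = 1 if integrator >= 0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pdm_mean(bits):
    """Average level of a bit stream, mapping 1 -> +1 and 0 -> -1."""
    return sum(2 * b - 1 for b in bits) / len(bits)


if __name__ == "__main__":
    print(f"DSD sampling rate: {DSD_SAMPLE_RATE_HZ} Hz")
    stream = pdm_encode([0.5] * 1000)
    print(f"mean of encoded constant 0.5: {pdm_mean(stream):.3f}")
```

Averaging (low-pass filtering) the bit stream recovers the original level, which is essentially what a DSD playback chain does.
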

SACD includes a physical watermarking feature, pit signal processing (PSP), which modulates the width of pits on the disc to store a digital watermark (data is stored in the pit length). The optical pickup must contain additional circuitry to read the PSP watermark, which is then compared to information on the disc to make sure it's legitimate. Because of the requirement for specialized watermark detection circuitry, protected SACD discs are not playable in standard DVD-ROM drives.

SACD includes text and still graphics, but no video. Sony says the format is aimed at audiophiles and is not intended to replace the audio CD format.

