It's a dirty job to go from high-res audio to 44/16, but someone's got to do it.
The ultimate form of digital audio used to have a 16-bit word length and 44.1 kHz sampling rate. Early systems even did their internal processing at 16/44.1, which was a problem—every time you did an operation (such as change levels, or apply EQ), the result was always rounded off to 16 bits. If you did enough operations, these roundoff errors would accumulate, creating a sort of “fuzziness” in the sound.
The next step forward was increasing the internal resolution of digital audio systems. If a mathematical operation produced a result that required more than 16 bits, no problem: 24, 32, 64, and even 128-bit internal processing became commonplace. As long as the audio stayed within the system, running out of resolution wasn't an issue.
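To see how those roundoff errors pile up, here's a rough sketch (my own illustration, not part of the original article) that applies the same chain of level changes two ways: rounding back to 16 bits after every operation, and keeping the math in 64-bit floats and rounding only once at the end. The signal, gain values, and number of operations are arbitrary choices for the demonstration.

```python
# Accumulated roundoff error: 16-bit rounding at every step versus
# 64-bit floats rounded only once at the end.
import numpy as np

rng = np.random.default_rng(0)
sr = 44100
t = np.arange(sr) / sr
signal = 0.25 * np.sin(2 * np.pi * 440.0 * t)        # a quiet 440 Hz tone

gains = rng.uniform(0.7, 1.4, size=20)               # twenty arbitrary level changes
gains /= np.prod(gains) ** (1 / len(gains))          # normalize so the net gain is ~1

# 16-bit internal processing: round to a 16-bit integer after every operation
x16 = np.round(signal * 32767.0)
for g in gains:
    x16 = np.round(x16 * g)
out_16bit_path = x16 / 32767.0

# High-resolution internal processing: stay in 64-bit floats, round once at the end
x64 = signal.copy()
for g in gains:
    x64 = x64 * g
out_float_path = np.round(x64 * 32767.0) / 32767.0

print("RMS error, 16-bit internal path:", np.sqrt(np.mean((out_16bit_path - signal) ** 2)))
print("RMS error, float internal path: ", np.sqrt(np.mean((out_float_path - signal) ** 2)))
```

Running it shows the 16-bit path drifting noticeably further from the original than the float path, which is the accumulated "fuzziness" described above.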
Nowadays, your hard disk recorder most likely records and plays back at 24, 32, or 64 bits, and the rest of your gear (digital mixer, digital synth, etc.) probably has fairly high internal resolution as well. But currently, although there are some high-resolution audio formats, your mix usually ends up in the world’s most popular delivery medium: a 16-bit, 44.1kHz CD.
What happens to those “extra” bits? Before the advent of dithering, they were simply discarded (just imagine how those poor bits felt, especially after being called the “least significant bits” all their lives). This meant that, for example, decay tails below the 16-bit limit just stopped abruptly. Maybe you’ve heard a “buzzing” sort of sound at the end of a fade out or reverb tail; that’s the sound of extra bits being ruthlessly “downsized.”
Dithering to the Rescue
Dithering is a concept that, in its most basic form, adds noise at very low signal levels, thus using the data in those least significant bits to influence the sound of the more significant bits. It's almost as if, even though the least significant bits are gone, their spirit lives on in the sound of the recording.
Cutting off bits is called truncation, and some proponents of dithering believe that dithering somehow sidesteps the truncation process. But that’s a misconception. Dithered or not, when a 24-bit signal ends up on a 16-bit CD, eight bits are truncated and never heard from again. Nonetheless, there’s a difference between flat-out truncation and truncation with dithering.
The Trouble with Truncation
The reason you hear buzzing at the end of fades with truncated signals is that the least significant bit, which tries to follow the audio signal, switches back and forth between 0 and 1. This buzzing is called quantization noise, because the noise occurs during the process of quantizing the audio into discrete steps. In a 24-bit recording, the extra eight bits provide 256 intermediate levels between the 16-bit LSB's “on” and “off” states, so the fade can keep stepping down smoothly; once the recording has been truncated, that resolution is no longer there to reproduce those changes.
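As a quick illustration (my own sketch, not from the article), here's what truncation does to a decaying tone: the 24-bit version keeps stepping down smoothly, while the truncated 16-bit version toggles between a handful of coarse levels, which is the buzz described above. The tone, decay rate, and lengths are arbitrary.

```python
# Truncating a 24-bit fade to 16 bits and counting how much resolution
# is left near the end of the decay.
import numpy as np

sr = 44100
t = np.arange(sr * 2) / sr
tone = np.sin(2 * np.pi * 200.0 * t) * np.exp(-4 * t)     # a two-second decaying 200 Hz tone

x24 = np.round(tone * (2**23 - 1)).astype(np.int32)       # 24-bit samples
x16_truncated = (x24 >> 8).astype(np.int16)               # discard the lowest 8 bits

tail = slice(-sr // 10, None)                              # the last 100 ms of the fade
print("distinct 24-bit levels in the tail:", np.unique(x24[tail]).size)
print("distinct 16-bit levels in the tail:", np.unique(x16_truncated[tail]).size)
```

The printed counts make the loss of resolution obvious: the truncated tail has only a few dozen distinct levels left to work with.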
Bear in mind, though, that these are very low-level signals. For that punk rock-industrial-dance mix where all the meters are in the red, you probably don’t need even 16 bits of resolution. But when you’re trying to record the ambient reverb tail of an acoustic space, you need good low-level resolution.
How Dithering Works
Let’s assume a 24-bit recorded signal so we can work with a practical example. The dithering process adds random noise to the lowest eight bits of the 24-bit signal. This noise is different for the two channels in order not to degrade stereo separation.
It may seem odd that adding noise can improve the sound, but one analogy is the bias signal used in analog tape. Analog tape is linear (distortionless) only over a very narrow range. We all know that distortion occurs if you hit tape too hard, but signals below a certain level can also sound horribly distorted. Bias is a constant ultrasonic signal (so we don't hear it) whose level sits at the lower threshold of the linear region. Any low-level signals get added to the bias signal, which boosts them into the linear region, where they can be heard without distortion.
Adding noise to the lower eight bits increases their amplitude and pushes some of the information contained in those bits into the higher bits. Therefore, the lowest part of the dynamic range no longer correlates directly to the original signal, but to a combination of the noise source and information present in the lowest eight bits. This reduces the quantization noise, providing in its place a smoother type of hiss modulated by the lower-level information. The most obvious audible benefit is that fades become smoother and more realistic, but there’s also more sonic detail.
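Here's what that looks like in practice, as a minimal sketch of the generic textbook approach (triangular, or TPDF, dither at roughly ±1 LSB of the 16-bit target); it illustrates the principle and is not any particular product's algorithm. Note that each channel gets its own independent noise, per the stereo-separation point above.

```python
# TPDF dither added before re-quantizing a 24-bit signal to 16 bits.
import numpy as np

def dither_to_16bit(x24, rng):
    """Reduce 24-bit integer samples (one channel) to 16 bits with TPDF dither."""
    lsb = 256                                   # one 16-bit step, measured in 24-bit units
    # TPDF dither: the sum of two independent uniform sources, spanning +/-1 LSB
    tpdf = rng.uniform(-0.5, 0.5, x24.shape) + rng.uniform(-0.5, 0.5, x24.shape)
    dithered = x24 + tpdf * lsb
    # Re-quantize to the nearest 16-bit step; this is where the lower 8 bits go away
    return np.clip(np.round(dithered / lsb), -32768, 32767).astype(np.int16)

rng = np.random.default_rng(1)
sr = 44100
t = np.arange(sr * 2) / sr
tone = np.sin(2 * np.pi * 200.0 * t) * np.exp(-4 * t)     # the same decaying tone as before
x24 = np.round(tone * (2**23 - 1)).astype(np.int32)

left = dither_to_16bit(x24, rng)
# For stereo, call again so the right channel gets its own independent noise:
# right = dither_to_16bit(right_channel_24, rng)
```

Because the noise straddles each rounding decision, level changes smaller than one 16-bit step still shift the odds of rounding up versus down, which is how their spirit survives in the 16-bit data.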
Although adding noise seems like a bad idea, psychoacoustics is on our side. Because any noise added by the dithering process has a constant level and frequency content, our ears have an easy time picking out the content (signal) from the noise. We've lived with noise long enough that a little bit hanging around at –90 dB or so is tolerable, particularly if it allows us to hear a subjectively extended dynamic range.
However, there are different types of dithering noise, which exhibit varying degrees of audibility. The dither may be wideband, thus trading off the lowest possible distortion for slightly higher perceived noise. A narrower band of noise will sound quieter, but lets some extremely low-level distortion remain.
Shape that Noise
iZotope's Ozone mastering plug-in has a dithering section with multiple types of dithering, noise-shaping options, the ability to “downsize” to bit depths from 8 to 24, and a choice of dither amount.
To render dithering even less problematic, noise shaping distributes the noise across the spectrum so that the bulk of it lies where the ear is least sensitive (i.e., the higher frequencies). Some noise-shaping curves are quite complex: rather than following a straight line, they dip down in the regions of maximum sensitivity (typically the midrange).
Again, this recalls the analogy of analog tape’s bias signal, which is usually around 100kHz to keep it out of the audible range. We can’t get away with those kinds of frequencies in a system that samples at 44.1 or even 96kHz, but several noise-shaping algorithms push the signal as high as possible, short of hitting the Nyquist frequency (i.e., half the sample rate, which is the highest frequency that can be recorded and played back at a given sample rate).
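Here's a bare-bones sketch of the simplest version of the idea, a first-order error-feedback shaper; commercial curves are far more elaborate and psychoacoustically tuned, so treat this only as an illustration of the principle. Each sample's quantization error is subtracted from the next sample before rounding, which pushes the quantization noise toward the high end of the spectrum.

```python
# First-order error-feedback noise shaping with TPDF dither.
import numpy as np

def noise_shaped_quantize(x, rng):
    """Quantize float samples (-1..1) to 16 bits with TPDF dither and
    first-order error-feedback noise shaping."""
    out = np.empty(len(x), dtype=np.int16)
    error = 0.0                                    # quantization error from the previous sample
    scale = 32767.0
    for i, sample in enumerate(x):
        target = sample * scale - error            # feed the previous error back
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)   # TPDF, +/-1 LSB
        q = int(np.clip(round(target + dither), -32768, 32767))
        error = q - target                         # error to subtract from the next sample
        out[i] = q
    return out

rng = np.random.default_rng(2)
sr = 44100
t = np.arange(sr) / sr
quiet_tone = 0.001 * np.sin(2 * np.pi * 1000.0 * t)   # a very low-level 1 kHz tone
shaped = noise_shaped_quantize(quiet_tone, rng)
```

The feedback loop gives the quantization noise a rising, high-frequency-weighted spectrum; it's the simplest relative of the dipped-midrange curves mentioned above.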
Different manufacturers use different noise-shaping algorithms; judging these is a little like wine-tasting. Sometimes you'll have a choice of dithering and noise-shaping algorithms so you can choose the combination that works best for specific types of program material. Not all these algorithms are created equal, nor do they sound equal.
Dithering Rules
The First Law of dithering is don't dither a signal more than once. Dithering should happen only when converting a high bit-depth source to its final, 16-bit, mixed-for-CD format (and in the years to come, we'll probably be dithering our 32 or 64-bit internal processing systems down to 24 bits for whatever high-resolution format finally takes off).
For example, if you are given an already dithered 16-bit file to edit in a high-resolution waveform editor, that file already contains dithered data, and the higher-resolution editor should preserve it. When it's time to bring the edited version back down to 16 bits, simply save the existing file without dithering it again.
Another possible problem occurs if you give a mastering or duplication facility two dithered 16-bit files that are meant to be crossfaded. Crossfading the dithered sections could lead to artifacts; you’re better off crossfading the two, then dithering the combination.
Also, check any programs you use to see if dithering is enabled by default, or enabled accidentally and saved as a preference. In general, you want to leave dithering off, and enable it only as needed.
Or consider Cubase SX, which has an Apogee-designed UV22 dithering plug-in. Suppose you add this to the final output, and then add another plug-in, such as the Waves L1 Ultramaximizer+. The L1 also includes dithering, which defaults to being enabled when inserted. So, check carefully to make sure you're not “doubling up” on dithering, and disable it in one plug-in or the other.
If you insert dithering in Cubase SX, it defaults to being enabled. So if you use it, make sure that any other master effects plug-ins you add do not have dithering enabled (in this screenshot, the Waves dithering has been turned off). Or, disable Cubase's dithering section and use the other plug-in's dithering instead.
The best way to experience the benefits of dithering is to crank up some really low-level audio and compare different dithering and noise-shaping algorithms. If your music has any natural dynamics in it, proper dithering can indeed give a sweeter, smoother sound free of digital quantization distortion when you downsize to 16 bits.
Originally published on Harmony Central. Reprinted with permission.