How to Control Vocal Sibilance

Vocal sibilance is an unpleasant tonal harshness that can happen during consonant syllables (like S, T, and Z), caused by disproportionate audio dynamics in upper midrange frequencies.

Sibilance is often centered between 5 kHz and 8 kHz, but can occur well above that range.

This problem usually originates with the vocalist's own articulation, but can also be exaggerated by microphone placement and technique. This article discusses some ways to control vocal sibilance and keep it from becoming a musical distraction.

Sibilance at the Source (best read with sibilant whistle)

In phonetic terms, sibilance comes from a type of speech sound called a fricative consonant. During these sorts of utterances, the airway (usually the mouth) is drastically constricted by two anatomical features, like the teeth, tongue, or palate.

This pressurization causes some amount of noise that forms the consonant sounds we would recognize from a phrase like, “Sally sits sideways on the tennis trolley.” Sibilance is a necessary feature of human speech, but when there’s (subjectively) too much noise created during these consonants, we get a very distracting harshness.

It isn’t really practical or productive to address micro-muscular vocal technique during a session, so your best bet to mitigate sibilance at the source is microphone selection and placement. Here are a few suggestions:

Every vocalist is remarkably different, so don’t pre-suppose that anything you’ve tried before will or will not work again.
Be sure to leave some space between your vocalist and the microphone. Twelve to eighteen inches would be a nice starting point.
A pop filter won’t do anything to help with sibilance.
Once you find a microphone and distance combination that helps, try angling the microphone downward 10 to 15 degrees to place the 0-degree axis toward the throat instead of the sibilant source.

Audio Dynamics Processing

Vocal sibilance is a phenomenon of disproportionate dynamics within an isolated frequency range. In other words, it is a problem of too much loudness contrast within a small frequency range of a waveform that has a dynamic profile of its own.

‘De-essing’ is the classic compressor technique used to address vocal sibilance through processing. In fact, de-essing is just one example of many uses for compression that is conditioned on a limited frequency band, or a modified harmonic profile.

De-esser Signal Flow

Audio dynamics processors like compressors and expanders contain two signal paths:

The audio path, which is subject to conditional gain reduction; and
The sidechain or ‘key’ path, which the gain reduction is conditioned on.

In short, gain reduction happens (or not) in the audio path based on the interaction between the sidechain signal and the detector settings (e.g. threshold and time constants). By placing an EQ in the sidechain path, we can further condition gain reduction on user-definable frequency conditions.
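To make the two-path idea concrete, here is a minimal sketch of a compressor whose gain reduction is driven by a separate sidechain signal. It is only an illustration, not how any particular hardware or plug-in works: the function name, the one-pole envelope follower, and the default threshold, ratio, and time constants are all assumptions chosen for the example.

```python
import math

def compress_with_sidechain(audio, sidechain, sample_rate,
                            threshold_db=-20.0, ratio=4.0,
                            attack_ms=1.0, release_ms=50.0):
    """Reduce the gain of `audio` based on the level of `sidechain`.

    Hypothetical helper: all default settings are illustrative assumptions.
    """
    # One-pole smoothing coefficients derived from the time constants.
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))

    envelope = 0.0
    out = []
    for x, key in zip(audio, sidechain):
        # Detector: follow the rectified sidechain (key) signal.
        level = abs(key)
        coeff = attack if level > envelope else release
        envelope = coeff * envelope + (1.0 - coeff) * level

        # Gain computer: reduce only the portion above the threshold.
        level_db = 20.0 * math.log10(max(envelope, 1e-9))
        over_db = level_db - threshold_db
        gain_db = -over_db * (1.0 - 1.0 / ratio) if over_db > 0.0 else 0.0

        # Apply the gain to the audio path, not to the sidechain.
        out.append(x * 10.0 ** (gain_db / 20.0))
    return out

# Same audio, two different key signals.
sr = 48000
audio = [0.5] * 4800
untouched = compress_with_sidechain(audio, [0.0] * 4800, sr)
ducked = compress_with_sidechain(audio, [1.0] * 4800, sr)
print(untouched[-1], ducked[-1])
```

The audio samples are identical in both calls. Only the key signal changes, and with it the amount of gain reduction.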

The de-esser technique typically uses a narrow peak EQ in the sidechain path to boost the most offensive sibilant frequencies. This EQ exaggerates the dynamic difference between the sibilant band and the rest of the vocal waveform, making it much easier to achieve gain reduction during those consonants (and only then).
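As a rough sketch of that sidechain EQ, the snippet below builds a narrow peaking filter using the widely published Audio EQ Cookbook biquad formulas and measures how much it boosts a tone inside the sibilant band compared with a tone in the body of the voice. The 6.5 kHz center, 12 dB boost, and Q of 4 are assumed starting values for illustration, not recommendations, and the helper names are hypothetical.

```python
import math

def peaking_eq_coeffs(center_hz, gain_db, q, sample_rate):
    """Peaking-EQ biquad coefficients (Audio EQ Cookbook form), a0 normalized to 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha / amp
    b = ((1.0 + alpha * amp) / a0,
         -2.0 * math.cos(w0) / a0,
         (1.0 - alpha * amp) / a0)
    a = (-2.0 * math.cos(w0) / a0,
         (1.0 - alpha / amp) / a0)
    return b, a

def biquad(signal, b, a):
    """Run a direct-form I biquad over a list of samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def rms_db(signal):
    """RMS level of a signal in dB relative to full scale."""
    mean_square = sum(s * s for s in signal) / len(signal)
    return 10.0 * math.log10(max(mean_square, 1e-18))

def sine(freq_hz, sample_rate, num_samples, amplitude=0.25):
    """Generate a test tone."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]

def sidechain_boost_db(freq_hz, sample_rate=48000,
                       center_hz=6500.0, gain_db=12.0, q=4.0):
    """Level change the sidechain EQ applies to a tone at `freq_hz`.

    The center, gain, and Q defaults are assumptions for this example.
    """
    b, a = peaking_eq_coeffs(center_hz, gain_db, q, sample_rate)
    dry = sine(freq_hz, sample_rate, 4800)
    wet = biquad(dry, b, a)
    # Measure the second half only, after the filter has settled.
    return rms_db(wet[2400:]) - rms_db(dry[2400:])

print("sibilant band:", round(sidechain_boost_db(6500.0), 2), "dB")
print("vocal body:   ", round(sidechain_boost_db(300.0), 2), "dB")
```

The tone at the filter's center frequency comes out close to 12 dB hotter in the sidechain, while the 300 Hz tone is left almost untouched. That extra level in the key path is what lets the detector cross the threshold on sibilant consonants without reacting to the rest of the performance.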

A pre-configured de-esser may provide an interface as simple as a compressor threshold and the peak EQ center frequency. These often work just fine. For more detailed control, one could patch an EQ into the sidechain of a relatively fast compressor, or use any number of compressor plug-ins that provide detailed EQ in the sidechain path.

There are lots of great techniques based on this signal flow, so spend some time with it. Frankly, de-essing is the least of what you can do by adding frequency conditions to your gain reduction.

Other Precautions

When you’re recording a vocal performance that may have a sibilance problem, resist the urge to compress the signal in the channel path. Over-compression can exaggerate sibilance. Instead, try using a fader to level the vocal performance, or just record with an adequate amount of headroom.

The same applies to the mixing process. Once you’ve done your best to control vocal sibilance, try using a fader and automation to maintain a consistent vocal volume in the mix. If you simply must instantiate a compressor on every vocal track, keep the attack time slow (longer than 30 ms) and the ratio low.

Finally, don’t listen too loudly when you mix. That’s good general advice, but quality control issues like sibilance highlight its importance. Try a control room volume of 78–83 dB(C) SPL. You might be surprised how much detail you’re suddenly able to hear.

For more articles about recording, mixing, and production visit The Pro Audio Files, brought to you by Dan Comerchero.