Loudness Normalization

Normalization Technique

Normalization is the active level matching of file-based audio content, such as a record album or radio program, to a defined “target loudness.” Normalizing avoids listener annoyance by matching loudness from one content asset to the next but it does not affect the quality of the content. At its most basic, file-based audio content is normalized by these steps:

  • The full length of the audio content is measured for its Integrated Loudness, in LUFS.
  • The amplitude of the entire audio content is then corrected so that the Integrated Loudness matches the target loudness. For example, if the target loudness is -24 LUFS (which is often used for audio content creation), and the content measures -27 LUFS, a gain offset of +3 LU applied to the content produces the target loudness.
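
The two steps above reduce to a subtraction and a constant multiplier. The following is a minimal sketch, assuming the Integrated Loudness has already been measured elsewhere; the function names are illustrative and do not come from any standard or library.

```python
def normalization_gain_lu(measured_lufs: float, target_lufs: float) -> float:
    """Gain offset in LU (numerically the same as dB) that moves the
    measured Integrated Loudness onto the target loudness."""
    return target_lufs - measured_lufs


def lu_to_linear(gain_lu: float) -> float:
    """Convert a gain in LU/dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_lu / 20.0)


def apply_gain(samples, gain_lu: float):
    """Scale every sample by the same factor; the mix itself is unchanged."""
    factor = lu_to_linear(gain_lu)
    return [s * factor for s in samples]


# Example from the text: content at -27 LUFS, target of -24 LUFS.
gain = normalization_gain_lu(measured_lufs=-27.0, target_lufs=-24.0)
print(gain)  # 3.0
```

A positive result means gain is added and a negative result means the content is attenuated. Because one factor is applied to the whole file, the internal dynamics of the content are unchanged.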

For more information on loudness measurement see the Loudness Basics section.

When applying normalization, the loudness of the original audio (shown on the left in the figure) might be above or below the desired Distribution Loudness. In the example shown, the Integrated Loudness is -13 LUFS, which is greater than the “target” of -18 LUFS. (See AES TD1008 for specific recommendations on target loudness.) Normalization attenuates the audio by 5 LU to reach the target value. This process is commonly called Downward Normalization. The reverse case, in which gain is added to content that measures below the target, is Upward Normalization.
Figure 1 - Illustration of Downward Normalization and Upward Normalization to a Target Loudness of -18 LUFS

Normalization Workflows

In the distribution of file-based audio for consumers, loudness normalization is often an automatic procedure. This ensures that the content receives the optimal handling and plays at the loudness generally desired by listeners. In the diagram, content is classified as music-only or as assorted – a mix of music, speech, interstitials etc. (Music-only content is normalized in a special manner, as discussed below in Album and Track Normalization.)
Figure 2 - Example of normalization workflow for content distribution
If assorted content has speech elements that are measurable (usually with automatic software) then the gain of the overall content is adjusted so that the dialog Integrated Loudness is ‑18 LUFS. Note that the gain is set only once: no gain-riding is performed, so the content producer’s mix is preserved when encoded for distribution.

If the speech elements are not reliably measurable, such as when the speech is combined with music beds, sound effects, etc., TD1008 recommends that the overall content be normalized to -18 LUFS before encoding and distribution.
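
The choice between dialog loudness and overall loudness for assorted content can be sketched as a single function. This is an illustration only: it assumes both measurements come from a separate loudness meter, and the function and parameter names are hypothetical.

```python
from typing import Optional

TARGET_LUFS = -18.0  # distribution target used in the text


def assorted_content_gain(program_lufs: float,
                          dialog_lufs: Optional[float]) -> float:
    """Return the single static gain (in LU) for assorted content.

    dialog_lufs is None when speech cannot be measured reliably, for
    example when it is mixed with music beds or sound effects. In that
    case the overall program loudness is used as the anchor instead.
    """
    anchor = dialog_lufs if dialog_lufs is not None else program_lufs
    return TARGET_LUFS - anchor


# Hypothetical measurements: program at -20 LUFS, dialog at -23 LUFS.
print(assorted_content_gain(-20.0, -23.0))  # 5.0 (dialog anchored)
print(assorted_content_gain(-20.0, None))   # 2.0 (program anchored)
```

Either way the gain is computed once and applied to the whole asset, which is what keeps the producer's mix intact.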

When the content type is music, such as with on-demand music playback services, normalization is performed differently for full “albums” than for individual “tracks.”

Album and Track Normalization of Music for Distribution

Much recorded content is mastered in album format, where the loudness, order, and spacing of the tracks are adjusted for artistic balance or intent. If the loudness of each track were measured separately and the tracks normalized to a common target level, the original loudness intent of the mastering would be lost. Album normalization prevents this by adjusting the loudness of all tracks in the album by the same gain value, maintaining the “album loudness” of the tracks relative to one another.

It is recommended that this album loudness value be that of the loudest track on the album instead of a loudness measurement of the entire album.

Additionally, if album loudness is available, studies indicate it is preferable to normalize the playback of individual tracks to the album loudness, even if they are played outside an album context.

The Album and Track Normalization Techniques animation begins with the Original Album section on the left, with four songs in an album group: each has a different Integrated Loudness, varying from -11 LUFS to -16 LUFS (shown by the dashed bars).

Album Normalization is shown next in this animation: watch as the gains of all four songs are reduced together so that the loudest track lands at -14 LUFS and the others stay below it. Adjusting the gains in unison maintains the loudness differences from song to song, as intended by the content creator. This process is well-suited to on-demand music playback systems, which work with audio files.

Last, the Track Normalization is shown: each track gain is adjusted separately, as is common in radio-style production. In this case, TD1008 recommends adjusting each track to -16 LUFS for best matching with other online content. The drawback is that every track ends up at the same loudness, whether the song was intended to be loud or soft.
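
The difference between the two techniques is easy to see with numbers. The sketch below uses four hypothetical track measurements spanning the range described for the animation; the -14 LUFS and -16 LUFS targets are the ones given in the text, and the function names are illustrative.

```python
ALBUM_TARGET_LUFS = -14.0  # target for the loudest track of an album
TRACK_TARGET_LUFS = -16.0  # target for individually normalized tracks


def album_gain(track_lufs):
    """One shared gain (LU) that puts the loudest track on the album target."""
    return ALBUM_TARGET_LUFS - max(track_lufs)


def track_gains(track_lufs):
    """A separate gain (LU) per track, each landing on the track target."""
    return [TRACK_TARGET_LUFS - lufs for lufs in track_lufs]


# Hypothetical Integrated Loudness values for a four-song album.
album = [-11.0, -13.0, -14.5, -16.0]

shared = album_gain(album)
after_album = [lufs + shared for lufs in album]
after_track = [lufs + g for lufs, g in zip(album, track_gains(album))]

print(shared)       # -3.0
print(after_album)  # [-14.0, -16.0, -17.5, -19.0]
print(after_track)  # [-16.0, -16.0, -16.0, -16.0]
```

After album normalization the 5 LU spread between the loudest and softest song is still there. After track normalization it is gone.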

Normalization Targets

The focus of this website is loudness for online audio content, but a discussion of target levels would not be complete without some mention of other multimedia services.

Broadcast television began using integrated loudness (or average dialogue loudness) and normalization of audio worldwide in the 2005-2010 time frame. This was driven by viewer complaints about loud commercial content. The leading standards, listed in AES71, specify a similar -24 or -23 LUFS (or LKFS) target for both production and distribution to consumers.

Online video services, known as “Over the Top Television” (OTT), quickly became a major form of distribution and have adopted guidelines based on the recommendations documented in ATSC A/85, EBU R128 and AES71, in some cases with customization to help support wide-dynamic-range content and viewing on mobile devices.

On-demand music services distribute mostly popular commercial music, which is still driven by album production styles with very high integrated loudness. Fortunately, these services are adopting moderate loudness targets, not only to help match music with other content but also to provide more headroom and so improve dynamic quality.

Many distributors of audiovisual or audio-only content are now using loudness normalization, but with different targets. Clearly, this is a challenging spread when audio is played on shared devices. These differing operating loudness values are why the AES Broadcast and Online Delivery Technical Committee put forth updated recommendations in TD1008. Eventually, all content will carry metadata to identify its loudness target, which will allow playback devices to correct output to a common Integrated Loudness.

Active and Real Time Loudness Correction

For live content, or when file-based normalization is not available, hardware and software devices can provide automated real-time correction to a target loudness. These devices can meet long-term loudness goals and may even avoid sudden increases in loudness, but they should be used advisedly: real-time processes cannot “predict the future” for the remainder of a live program or stream, or know the artist’s intent to be louder or softer at times. Automatic correction may at times mistakenly adjust loudness and override the intended changes in a program’s dynamic range.

Active loudness controllers have algorithms to make aesthetically acceptable adjustment decisions. Most controllers work on these basic principles:

  • Audio is analyzed so that the integrated LUFS value is a rolling average over a user-defined period, from several seconds to many minutes.
  • The analysis period balances effective control of average loudness against preserving most of the “natural dynamics” of the program source material.
  • Longer integration times usually yield natural-sounding loudness normalization results with fewer audible artifacts.
  • Peak limiting is provided because the controller may add gain to the audio.
  • The resulting normalized audio is then available at the output of the “LUFS normalizer” for general distribution.
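
The principles above can be illustrated with a deliberately simplified controller. This is a toy sketch and not how any particular product works: it measures a plain mean-square level in dB over a rolling window (no K-weighting or gating, so the value is not a true LUFS reading), it uses a hard clamp as a crude stand-in for a real peak limiter, and the class name and the `max_gain_db` cap are assumptions added for the example.

```python
import math
from collections import deque


class RollingLoudnessController:
    """Toy real-time leveler: rolling mean-square level, one gain, hard ceiling."""

    def __init__(self, target_db, window_samples, ceiling=1.0, max_gain_db=12.0):
        self.target_db = target_db
        self.squares = deque(maxlen=window_samples)  # rolling analysis window
        self.sum_sq = 0.0
        self.ceiling = ceiling          # absolute sample limit after gain
        self.max_gain_db = max_gain_db  # assumed cap so silence is not boosted without bound

    def process(self, sample):
        # Drop the oldest squared sample from the running sum once the window is full.
        if len(self.squares) == self.squares.maxlen:
            self.sum_sq -= self.squares[0]
        sq = sample * sample
        self.squares.append(sq)
        self.sum_sq += sq

        # Rolling level in dB (unweighted; a real meter would follow the LUFS method).
        mean_sq = max(self.sum_sq / len(self.squares), 1e-12)
        level_db = 10.0 * math.log10(mean_sq)

        # Gain that moves the rolling level toward the target, capped for safety.
        gain_db = min(self.target_db - level_db, self.max_gain_db)
        out = sample * 10.0 ** (gain_db / 20.0)

        # Crude stand-in for a peak limiter: clamp to the ceiling.
        return max(-self.ceiling, min(self.ceiling, out))


controller = RollingLoudnessController(target_db=-14.0, window_samples=100)
outputs = [controller.process(0.1) for _ in range(200)]
print(round(outputs[-1], 3))
```

A steady input at about -20 dB is raised by roughly 6 dB here. A longer `window_samples` makes the gain move more slowly, which corresponds to the longer integration times described above.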

Real-time correction has been in use in TV and OTT video for several years and is now appearing in products for online audio services.
