Normalization is the active level matching of file-based audio content, such as a record album or radio program, to a defined “target loudness.” Normalizing avoids listener annoyance by matching loudness from one content asset to the next, but it does not affect the quality of the content. At its most basic, file-based audio content is normalized by these steps:
For more information on loudness measurement see the Loudness Basics section.
If the speech elements are not reliably measurable, such as when the speech is combined with music beds, sound effects, etc., TD1008 recommends that the overall content be normalized to ‑18 LUFS before encoding and distribution.
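The gain arithmetic behind file-based normalization can be sketched as follows. This is a minimal illustration, not a TD1008 reference implementation: it assumes the integrated loudness of the file has already been measured by a separate loudness meter, and the function names are hypothetical. The ‑18 LUFS default is the overall-content target mentioned above.

```python
def normalization_gain_db(measured_lufs: float, target_lufs: float = -18.0) -> float:
    """Gain in dB that moves content from its measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def normalize(samples, measured_lufs: float, target_lufs: float = -18.0):
    """Scale every sample by the same factor; the measurement itself is assumed done elsewhere."""
    factor = db_to_linear(normalization_gain_db(measured_lufs, target_lufs))
    return [s * factor for s in samples]
```

For example, content measured at -12 LUFS needs a gain of -6 dB to reach -18 LUFS, which is a linear factor of roughly 0.5. Because one constant gain is applied to the whole file, the internal dynamics of the content are unchanged.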
When the content type is music, such as with on-demand music playback services, normalization is performed differently for full “albums” than for individual “tracks.”
Much recorded content is mastered in album format, where the loudness, order, and spacing of the tracks are adjusted for artistic balance or intent. If the loudness of each track were measured separately and the tracks normalized to a common target level, the original loudness intent of mastering would be lost. Album normalization prevents this by adjusting the loudness of all tracks in the album by the same gain value, maintaining the “album loudness” of the tracks relative to one another.
It is recommended that this album loudness value be that of the loudest track on the album instead of a loudness measurement of the entire album.
Additionally, if album loudness is available, studies indicate it is preferable to normalize the playback of individual tracks to the album loudness, even if they are played outside an album context.
The Album and Track Normalization Techniques animation begins with the Original Album section on the left, with four songs in an album group: each has a different Integrated Loudness, varying from ‑11 LUFS to ‑16 LUFS (shown by the dashed bars).
Album Normalization is shown next in this animation: watch as the gains of all four songs are reduced together so that the loudest track reaches ‑14 LUFS and the others remain proportionally quieter. Adjusting the gains in unison maintains the loudness differences from song to song, as intended by the content creator. This process is well-suited to on-demand music playback systems, which work with audio files.
Last, the Track Normalization is shown: each track gain is adjusted separately, as is common in radio-style production. In this case, TD1008 recommends adjusting each track to ‑16 LUFS for best matching with other online content. The drawback is that every track ends up at the same loudness, whether the song was intended to be loud or soft.
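The difference between the two techniques comes down to how the gain is computed, which a short sketch can make concrete. The function names are hypothetical, and the four track values are illustrative: only the ‑11 and ‑16 LUFS extremes and the ‑14 and ‑16 LUFS targets come from the description above, while the two middle values are assumed.

```python
def album_gains_db(track_lufs, loudest_target_lufs=-14.0):
    """One shared gain for the whole album, set so the loudest track hits the target."""
    shared_gain = loudest_target_lufs - max(track_lufs)
    return [shared_gain] * len(track_lufs)


def track_gains_db(track_lufs, target_lufs=-16.0):
    """A separate gain per track, so every track lands on the same target."""
    return [target_lufs - loudness for loudness in track_lufs]


# Hypothetical album: the loudest and softest values match the animation's range.
album = [-11.0, -13.0, -14.5, -16.0]

album_result = [l + g for l, g in zip(album, album_gains_db(album))]
track_result = [l + g for l, g in zip(album, track_gains_db(album))]
```

With these numbers, album normalization applies -3 dB to every track, giving -14, -16, -17.5 and -19 LUFS, so the 5 dB spread between loudest and softest survives. Track normalization applies -5, -3, -1.5 and 0 dB, and all four tracks end at -16 LUFS.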
The focus of this website is loudness for online audio content, but a discussion of target levels would not be complete without some mention of other multimedia services.
Broadcast television began using integrated loudness (or average dialogue loudness) and normalization of audio worldwide between 2005 and 2010. This was driven by viewer complaints of loud commercial content. The leading standards, listed in AES71, specify a similar ‑24 or ‑23 LUFS (or LKFS) target for both production and distribution to consumers.
Online video services, known as “Over the Top Television” (OTT), quickly became a major form of distribution and have adopted guidelines based on the recommendations documented as ATSC A/85, EBU R128 and AES71, in some cases with customization to help support wide-dynamic range content and viewing on mobile devices.
On-demand music services distribute mostly popular commercial music, which is still driven by album production styles with very high integrated loudness. Fortunately, these services are adopting moderate loudness targets, not only to help match music and other content but also to provide more headroom and improve dynamic quality.
Many distributors of audiovisual or audio-only content are now using loudness normalization, but with different targets. Clearly, this is a challenging spread when audio is played on shared devices. These differing operating loudness values are why the AES Broadcast and Online Delivery Technical Committee put forth updated recommendations in TD1008. Eventually, all content will carry metadata to identify its loudness target, which will allow playback devices to correct output to a common Integrated Loudness.
For live content or when file-based normalization is not available, hardware and software devices are available to provide automated real-time correction to a target loudness. These devices can meet long-term loudness goals and may even avoid sudden increases in loudness, but they should be used advisedly: real-time processes cannot “predict the future” for the remainder of a live program or stream, or know the artist’s intent to be louder or softer at times. Automatic correction may at times mistakenly adjust loudness and override the intended changes in a program’s dynamic range.
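The causal nature of real-time correction can be illustrated with a deliberately simple sketch. It is not the algorithm of any particular product: the function name, the smoothing constant, and the per-block gain limit are all assumptions chosen for illustration. The input is assumed to be a stream of short-term loudness readings, one per audio block, supplied by a separate meter.

```python
def realtime_gains_db(block_lufs, target_lufs, smoothing=0.1, max_step_db=0.5):
    """Return one gain (dB) per block using only past and current readings."""
    gains = []
    estimate = None  # running loudness estimate, built from what has been heard so far
    gain = 0.0
    for loudness in block_lufs:
        # Update the running estimate; future blocks are never consulted.
        if estimate is None:
            estimate = loudness
        else:
            estimate += smoothing * (loudness - estimate)
        # Move toward the gain that would put the estimate on target,
        # but limit how far the gain may change in a single block.
        desired = target_lufs - estimate
        step = max(-max_step_db, min(max_step_db, desired - gain))
        gain += step
        gains.append(gain)
    return gains
```

Because the gain follows a running estimate, a passage that is intentionally loud is gradually pulled toward the target as well, which is the override of dynamic intent described above. Slower smoothing and smaller steps preserve more of the dynamics at the cost of a slower response.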
Active loudness controllers have algorithms to make aesthetically acceptable adjustment decisions. Most controllers work on these basic principles:
Real-time correction has been in use in TV and OTT video for several years and is now appearing in products for online audio services.