Loudness Normalization
Normalization is the active level matching of file-based audio content, such as a record album or radio program, to a defined “target loudness.” Normalizing avoids listener annoyance by matching loudness from one content asset to the next but it does not affect the quality of the content. At its most basic, file-based audio content is normalized by these steps:
For more information on loudness measurement see the Loudness Basics section.
When applying normalization, the loudness of the original audio (shown on the left in the figure) might be above or below the desired Distribution Loudness. In this case its Integrated Loudness is -13 LUFS, which is greater than the “target” of -18 LUFS. (See AES TD1008 for specific recommendations on target loudness.) Normalization attenuates the audio by 5 LU to the target value. This process is commonly called Downward Normalization.
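The arithmetic in this example can be written out as a minimal sketch. The function names are illustrative and do not come from any standard or tool; the sketch assumes the Integrated Loudness has already been measured and that samples are plain floating-point values.

```python
def normalization_gain_db(measured_lufs: float, target_lufs: float) -> float:
    """Gain in dB (numerically equal to LU) that moves the measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Scale every sample by the same static gain (illustrative helper)."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]


# Values from the example above: -13 LUFS measured, -18 LUFS target.
gain = normalization_gain_db(measured_lufs=-13.0, target_lufs=-18.0)
print(gain)  # -5.0, i.e. 5 LU of attenuation (downward normalization)

scaled = apply_gain([0.5, -0.25, 1.0], gain)
```

Because the gain is a single static value for the whole file, the internal dynamics of the content are unchanged; only its overall level moves.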
In the distribution of file-based audio for consumers, loudness normalization is often an automatic procedure. Automated loudness normalization utilities are designed to not only be useful on single (one-off) media files, but also as a batch process that can normalize a large number of files.
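As a rough sketch of the batch case, the same calculation can be applied to a whole set of files at once. The file names and measured values below are hypothetical, and the measurements are assumed to come from a separate loudness meter; this is not the interface of any particular normalization utility.

```python
def plan_batch(measured_lufs_by_file, target_lufs=-18.0):
    """Return the static gain (dB) each file needs to reach the target loudness."""
    plan = {}
    for name, measured in measured_lufs_by_file.items():
        gain_db = round(target_lufs - measured, 2)
        if gain_db < 0:
            direction = "downward"
        elif gain_db > 0:
            direction = "upward"
        else:
            direction = "none"
        plan[name] = (gain_db, direction)
    return plan


# Hypothetical measurements for three files in a batch.
measurements = {
    "episode_01.wav": -13.0,
    "episode_02.wav": -21.5,
    "promo.wav": -9.2,
}

for name, (gain_db, direction) in plan_batch(measurements).items():
    print(f"{name}: {gain_db:+.1f} dB ({direction})")
```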
Normalization ensures that the content receives the optimal handling and plays at the loudness generally desired by listeners. In the diagram, content is classified as music-only or as assorted – a mix of music, speech, interstitials etc. (Music-only content is normalized in a special manner, as discussed below in Album and Track Normalization.)
If the speech elements are not reliably measurable, such as when the speech is combined with music beds, sound effects, etc., TD1008 recommends that the overall content be normalized to -18 LUFS before encoding and distribution.
When the content type is music, such as with on-demand music playback services, normalization is performed differently for full “albums” than for individual “tracks.”
Much recorded content is mastered in album format, where the loudness, order, and spacing of the tracks are adjusted for artistic balance or intent. If the loudness of each track were measured separately and the tracks normalized to a common target level, the original loudness intent of the mastering would be lost. Album normalization prevents this by adjusting the loudness of all tracks in the album by the same gain value, maintaining the “album loudness” of the tracks relative to one another.
It is recommended that this album loudness value be that of the loudest track on the album instead of a loudness measurement of the entire album.
Additionally, if album loudness is available, studies indicate it is preferable to normalize the playback of individual tracks to the album loudness, even if they are played outside an album context.
The Album and Track Normalization Techniques animation begins with the Original Album section on the left, with four songs in an album group: each has a different Integrated Loudness, varying from -11 LUFS to -16 LUFS (shown by the dashed bars).
Album Normalization is shown next in this animation: watch as the gains of all four songs are reduced together so that the loudest track reaches -14 LUFS and the others remain below it. Adjusting the gains in unison maintains the loudness differences from song to song, as intended by the content creator. This process is well-suited to on-demand music playback systems, which work with audio files.
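A minimal sketch of album normalization follows, assuming the per-track Integrated Loudness values are already known. The track names and the two middle loudness values are made up for illustration; only the -11 to -16 LUFS range and the -14 LUFS value for the loudest track follow the description above.

```python
def album_gain_db(track_lufs, loudest_target_lufs=-14.0):
    """One shared gain (dB) that places the loudest track at the target."""
    loudest = max(track_lufs.values())
    return loudest_target_lufs - loudest


# Hypothetical four-track album spanning -11 to -16 LUFS.
album = {
    "Track 1": -11.0,
    "Track 2": -13.0,
    "Track 3": -16.0,
    "Track 4": -14.5,
}

gain = album_gain_db(album)  # -3.0 dB, applied to every track
normalized = {title: lufs + gain for title, lufs in album.items()}

for title, lufs in normalized.items():
    print(f"{title}: {album[title]} -> {lufs} LUFS")
```

Since every track receives the same gain, the 5 LU difference between the loudest and softest track is still present after normalization.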
Last, Track Normalization is shown: each track gain is adjusted separately, as is common in radio-style production. In this case, TD1008 recommends adjusting each track to -16 LUFS for best matching with other online content. The drawback here is that the loudness of all tracks is the same, whether the song was intended to be loud or soft.
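For comparison, track normalization can be sketched the same way. The track names and loudness values are hypothetical, and the function name is illustrative only.

```python
def track_gains_db(track_lufs, target_lufs=-16.0):
    """Independent gain (dB) per track so that each one lands on the target."""
    return {title: target_lufs - lufs for title, lufs in track_lufs.items()}


# Hypothetical per-track Integrated Loudness measurements.
tracks = {
    "Track 1": -11.0,
    "Track 2": -13.0,
    "Track 3": -16.0,
    "Track 4": -14.5,
}

gains = track_gains_db(tracks)
after = {title: tracks[title] + gains[title] for title in tracks}

for title in tracks:
    print(f"{title}: gain {gains[title]:+.1f} dB -> {after[title]} LUFS")
```

Every track ends at the same loudness, which is exactly the drawback noted above: the original loud-versus-soft relationship between songs is no longer audible.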
The focus of this website is loudness for online audio content, but a discussion of target levels would not be complete without some mention of other multimedia services.
Broadcast television began using integrated loudness (or average dialogue loudness) and normalization of audio worldwide in the 2005-2010 time frame. This was driven by viewer complaints about loud commercial content. The leading standards, listed in AES71, specify a similar -24 or -23 LUFS (or LKFS) target for both production and distribution to consumers.
Online video services, known as “Over the Top Television” (OTT), quickly became a major form of distribution and have adopted guidelines based on the recommendations documented in ATSC A/85, EBU R128 and AES71, in some cases with customization to help support wide-dynamic-range content and viewing on mobile devices.
On-demand music services distribute mostly popular commercial music, which is still driven by album production styles with very high integrated loudness. Fortunately, these services are adopting moderate loudness targets, not only to help match music and other content but also to provide more headroom and improve dynamic quality.
Many distributors of audiovisual or audio-only content are now using loudness normalization, but with different targets. Clearly, this is a challenging spread when audio is played on shared devices. These differing operating loudness values are why the AES Broadcast and Online Delivery Technical Committee put forth recommendations in technical document TD1008 and helped develop the AES71 recommended practice document, which focuses on loudness normalization practices for audiovisual online distribution. The Consumer Technology Association (CTA) aligned with that standard to create ANSI/CTA-2075 to cover playback on devices. TD1008 is an evolving AES recommendation focusing on audio-only online distribution. Eventually, all content will carry metadata to identify its loudness target, which will allow playback devices to correct output to a common Integrated Loudness.
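How a playback device might use such metadata can be sketched as follows. The `Asset` class, its `target_lufs` field, the asset titles, and the device's common playback value are all assumptions made for illustration; they do not describe the metadata format of AES71, ANSI/CTA-2075, or any real device.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    title: str
    target_lufs: float  # loudness target carried as metadata (hypothetical field)


def playback_gain_db(asset: Asset, device_lufs: float) -> float:
    """Gain (dB) a device would apply to bring an asset to its common loudness."""
    return device_lufs - asset.target_lufs


# Hypothetical queue mixing content normalized to different targets.
queue = [
    Asset("TV programme", -24.0),
    Asset("Podcast episode", -18.0),
    Asset("Music album", -14.0),
]

DEVICE_LUFS = -18.0  # assumed common playback loudness for this sketch

for asset in queue:
    print(f"{asset.title}: {playback_gain_db(asset, DEVICE_LUFS):+.1f} dB")
```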
Major success has been achieved in loudness consistency by the worldwide TV broadcast industry, in European radio broadcasting, and by individual streaming audio providers using file-based normalization. For live content, or when file-based normalization is not available, hardware and software devices are available to provide automated real-time correction to a target loudness. These devices can meet long-term loudness goals and may even avoid sudden increases in loudness, but they should be used advisedly: real-time processes cannot “predict the future” for the remainder of a live program or stream, or know the artist’s intent to be louder or softer at times. Automatic correction may at times mistakenly adjust loudness and override the intended changes in a program’s dynamic range.
Active loudness controllers have algorithms to make aesthetically acceptable adjustment decisions. Most controllers work on these basic principles:
Real-time correction has been in use in TV and OTT video for several years and is now appearing in products for online audio services.
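The limitation of real-time correction, that it can only react to loudness it has already heard, can be illustrated with a deliberately simple sketch. This is not the algorithm of any actual loudness controller: it assumes a stream of short-term loudness readings is available, and the `max_step_db` rate limit is a hypothetical parameter chosen for illustration.

```python
def realtime_gains(short_term_lufs, target_lufs=-18.0, max_step_db=2.0):
    """Follow the target with a rate-limited gain, using only past readings."""
    gain = 0.0
    history = []
    for loudness in short_term_lufs:
        desired = target_lufs - loudness
        # Move toward the desired gain, but never faster than max_step_db per reading.
        step = max(-max_step_db, min(max_step_db, desired - gain))
        gain += step
        history.append(round(gain, 2))
    return history


# Hypothetical readings: content at target, then a sudden 6 LU jump.
readings = [-18.0, -18.0, -12.0, -12.0, -12.0]
print(realtime_gains(readings))  # [0.0, 0.0, -2.0, -4.0, -6.0]
```

The gain only catches up several readings after the jump, and the sketch would attenuate an intentionally loud passage just as readily as an unintended one, which is why such devices are best used with care.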