Loudness Normalization

Normalization Technique

Normalization is the active level matching of file-based audio content, such as a record album or radio program, to a defined “target loudness.” Normalizing avoids listener annoyance by matching loudness from one content asset to the next but it does not affect the quality of the content. At its most basic, file-based audio content is normalized by these steps:

The full length of the audio content is measured for its Integrated Loudness, in LUFS. See the Loudness Basics section for more information on loudness measurement. You can also view a video on loudness measurement, presented at the AES New York 2024 Convention by Cornelius Gould.
The amplitude of the entire audio content is then corrected so that the Integrated Loudness matches the target loudness. For example, if the target loudness is -24 LUFS (which is often used for audio content creation) and the content measures -27 LUFS, applying a gain offset of +3 LU to the content produces the target loudness.
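The two steps above reduce to a single subtraction once the Integrated Loudness is known. The following is a minimal sketch, assuming the measurement has already been made by a loudness meter; the function name `normalization_gain` is hypothetical and not taken from any AES document or library.

```python
def normalization_gain(measured_lufs: float, target_lufs: float) -> tuple[float, float]:
    """Return the static gain offset in LU (equivalent to dB) and the
    matching linear amplitude multiplier for the whole file."""
    gain_db = target_lufs - measured_lufs      # positive = boost, negative = attenuate
    linear = 10.0 ** (gain_db / 20.0)          # dB to amplitude ratio
    return gain_db, linear


# Example from the text: content at -27 LUFS, target of -24 LUFS.
gain_db, factor = normalization_gain(measured_lufs=-27.0, target_lufs=-24.0)
print(f"{gain_db:+.1f} LU, x{factor:.3f}")     # +3.0 LU, x1.413
```

The same gain is applied to every sample of the file, so the internal dynamics of the content are unchanged.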

When applying normalization, the loudness of the original audio (shown on the left in the figure) might be above or below the desired Distribution Loudness. In this case its Integrated Loudness is -13 LUFS, which is greater than the “target” of -18 LUFS. (See AES TD1008 for specific recommendations on target loudness.) Normalization attenuates the audio by 5 LU to the target value. This process is commonly called Downward Normalization.

Normalization Workflows

In the distribution of file-based audio for consumers, loudness normalization is often an automatic procedure. Automated loudness normalization utilities are designed to work not only on single (one-off) media files but also as a batch process that normalizes a large number of files.

Normalization ensures that the content receives the optimal handling and plays at the loudness generally desired by listeners. In the diagram, content is classified as music-only or as assorted – a mix of music, speech, interstitials etc. (Music-only content is normalized in a special manner, as discussed below in Album and Track Normalization.)

If assorted content has speech elements that are measurable (usually with automatic software) then the gain of the overall content is adjusted so that the dialog Integrated Loudness is ‑18 LUFS. Note that the gain is set only once: no gain-riding is performed, so the content producer’s mix is preserved when encoded for distribution.

If the speech elements are not reliably measurable, such as when the speech is combined with music beds, sound effects, etc., TD1008 recommends that the overall content be normalized to -18 LUFS before encoding and distribution.
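The choice between dialog-anchored and overall normalization for assorted content can be sketched as a small decision function. This is an illustration only: the name `assorted_content_gain` and the convention of passing `None` when speech cannot be measured reliably are assumptions, and the loudness values are expected to come from a separate measurement tool.

```python
from typing import Optional

TARGET_LUFS = -18.0  # target quoted in the text for assorted content


def assorted_content_gain(overall_lufs: float, dialog_lufs: Optional[float]) -> float:
    """Return the single static gain (dB) for a piece of assorted content.

    dialog_lufs is the Integrated Loudness of the speech elements, or None
    when speech is not reliably measurable (assumed convention).
    """
    # Anchor on dialog when it can be measured, otherwise on the full mix.
    anchor = dialog_lufs if dialog_lufs is not None else overall_lufs
    return TARGET_LUFS - anchor


print(assorted_content_gain(overall_lufs=-20.0, dialog_lufs=-22.0))  # 4.0
print(assorted_content_gain(overall_lufs=-20.0, dialog_lufs=None))   # 2.0
```

In both branches the gain is computed once for the whole file, which matches the point above that no gain-riding is performed.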

When the content type is music, such as with on-demand music playback services, normalization is performed differently for full “albums” than for individual “tracks.”

Album and Track Normalization of Music for Distribution

Much recorded content is mastered in album format, where the loudness, order, and spacing of the tracks are adjusted for artistic balance or intent. If the loudness of each track were measured separately and the tracks normalized to a common target level, the original loudness intent of mastering would be lost. Album normalization prevents this by adjusting the loudness of all tracks in the album by the same gain value, maintaining the “album loudness” of the tracks relative to one another.

It is recommended that this album loudness value be that of the loudest track on the album instead of a loudness measurement of the entire album.

Additionally, if album loudness is available, studies indicate it is preferable to normalize the playback of individual tracks to the album loudness, even if they are played outside an album context.

The Album and Track Normalization Techniques animation begins with the Original Album section on the left, with four songs in an album group: each has a different Integrated Loudness, varying from -11 LUFS to -16 LUFS (shown by the dashed bars).

Album Normalization is shown next in this animation: watch as the gains of all four songs are reduced together so that the loudest track sits at -14 LUFS. Adjusting the gains in unison maintains the loudness differences from song to song, as intended by the content creator. This process is well-suited to on-demand music playback systems, which work with audio files.

Last, the Track Normalization is shown: each track gain is adjusted separately, as is common in radio-style production. In this case, TD1008 recommends adjusting each track to -16 LUFS for best matching with other online content. The drawback here is that the loudness of all tracks is the same, whether the song was intended to be loud or soft.
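The difference between the two techniques can be sketched numerically. The four track values below are hypothetical (the text gives only the range of the example album), and the function names are illustrative; the targets of -14 LUFS for the loudest album track and -16 LUFS per track are the ones described above.

```python
ALBUM_TARGET_LUFS = -14.0   # loudest track of the album lands here
TRACK_TARGET_LUFS = -16.0   # every track lands here in track mode


def album_gains(track_lufs: list[float]) -> list[float]:
    """One shared gain (dB) for all tracks, set by the loudest track."""
    gain = ALBUM_TARGET_LUFS - max(track_lufs)
    return [gain] * len(track_lufs)


def track_gains(track_lufs: list[float]) -> list[float]:
    """A separate gain (dB) per track, each moved to the same target."""
    return [TRACK_TARGET_LUFS - lufs for lufs in track_lufs]


album = [-11.0, -13.0, -14.5, -16.0]   # hypothetical measurements in LUFS

print(album_gains(album))   # [-3.0, -3.0, -3.0, -3.0]
print(track_gains(album))   # [-5.0, -3.0, -1.5, 0.0]

# Resulting loudness: album mode keeps the 5 LU spread, track mode removes it.
print([l + g for l, g in zip(album, album_gains(album))])  # [-14.0, -16.0, -17.5, -19.0]
print([l + g for l, g in zip(album, track_gains(album))])  # [-16.0, -16.0, -16.0, -16.0]
```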

Normalization Targets

The focus of this website is loudness for online audio content, but a discussion of target levels would not be complete without some mention of other multimedia services.

Broadcast television began using integrated loudness (or average dialogue loudness) and normalization of audio worldwide in the 2005-2010 time frame. This was driven by viewer complaints about loud commercial content. The leading standards, listed in AES71, specify a similar -24 or -23 LUFS (or LKFS) target for both production and distribution to consumers.

Online video services, known as “Over the Top Television” (OTT), quickly became a major form of distribution and have adopted guidelines based on the recommendations documented in ATSC A/85, EBU R128 and AES71, in some cases with customization to help support wide-dynamic-range content and viewing on mobile devices.

On-demand music services distribute mostly popular commercial music, which is still driven by album production styles with very high integrated loudness. Fortunately, these services are adopting moderate loudness targets, not only to help match music and other content but also to provide more headroom and improve dynamic quality.

Many distributors of audiovisual or audio-only content are now using loudness normalization, but with different targets. Clearly, this spread is a challenge when audio is played on shared devices. These differing operating loudness values are why the AES Broadcast and Online Delivery Technical Committee put forth recommendations in technical document TD1008 and helped develop the AES71 recommended practice document, which focuses on loudness normalization practices for audiovisual online distribution. The Consumer Technology Association aligned with that standard to create ANSI/CTA-2075 to cover playback on devices. TD1008 is an evolving AES recommendation focusing on audio-only online distribution. Eventually, all content will carry metadata to identify its loudness target, which will allow playback devices to correct output to a common Integrated Loudness.

Active and Real Time Loudness Correction

Major success has been achieved in loudness consistency by the worldwide TV broadcast industry, in European radio broadcasting, and by individual streaming audio providers using file-based normalization. For live content, or when file-based normalization is not available, hardware and software devices are available to provide automated real-time correction to a target loudness. These devices can meet long-term loudness goals and may even avoid sudden increases in loudness, but they should be used advisedly: real-time processes cannot “predict the future” for the remainder of a live program or stream, or know the artist’s intent to be louder or softer at times. Automatic correction may at times mistakenly adjust loudness and override the intended changes in a program’s dynamic range.

Active loudness controllers have algorithms to make aesthetically acceptable adjustment decisions. Most controllers work on these basic principles:

Audio is analyzed so that the integrated LUFS value is based on a rolling average over a user-defined period, from several seconds to many minutes.
It is recommended that this period be between 30 seconds and two minutes. This allows for effective control of average loudness levels while maintaining most of the “natural dynamics” of the program source material.
Longer integration times usually yield natural-sounding loudness normalization results with fewer audible artifacts.
Peak limiting is provided since gain may be applied to the controlled audio.
Resulting normalized audio is then available at the output of the “LUFS normalizer” to be used for general distribution.
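The principles in this list can be illustrated with a deliberately simplified sketch. It assumes that an upstream meter already supplies one loudness reading (in LUFS) and one peak reading (in dBFS) per block of audio; K-weighting, gating, gain smoothing, and a true sample-level limiter are all omitted. The class name `RollingLoudnessController`, its parameters, and the -1 dBFS ceiling are hypothetical and do not describe any particular product.

```python
import math
from collections import deque


class RollingLoudnessController:
    """Toy loudness controller driven by per-block meter readings."""

    def __init__(self, target_lufs: float = -18.0, window_blocks: int = 60,
                 ceiling_dbfs: float = -1.0) -> None:
        self.target_lufs = target_lufs
        self.ceiling_dbfs = ceiling_dbfs
        # With 1-second blocks, 60 blocks gives a 60-second rolling window.
        self._energies: deque = deque(maxlen=window_blocks)

    def process(self, block_lufs: float, block_peak_dbfs: float) -> float:
        """Return the gain (dB) to apply to the current block."""
        # Average in the energy domain, then convert back to a LUFS-like value.
        self._energies.append(10.0 ** (block_lufs / 10.0))
        mean_energy = sum(self._energies) / len(self._energies)
        rolling_lufs = 10.0 * math.log10(mean_energy)

        gain_db = self.target_lufs - rolling_lufs
        # Crude stand-in for peak limiting: never push the block peak
        # above the ceiling when gain is applied.
        max_gain_db = self.ceiling_dbfs - block_peak_dbfs
        return min(gain_db, max_gain_db)


ctrl = RollingLoudnessController(target_lufs=-18.0, window_blocks=60)
print(round(ctrl.process(block_lufs=-24.0, block_peak_dbfs=-10.0), 2))  # 6.0
print(round(ctrl.process(block_lufs=-24.0, block_peak_dbfs=-3.0), 2))   # 2.0 (capped by the ceiling)
```

A longer window makes the computed gain move more slowly, which is the trade-off the list describes between effective control and preserving natural dynamics.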

Real-time correction has been in use in TV and OTT video for several years and is now appearing in products for online audio services.
