Journal of the Audio Engineering Society

2004 March - Volume 52 Number 3


What do we mean by high resolution? The recording and replay chain is reviewed from the viewpoints of digital audio engineering and human psychoacoustics. An attempt is made to define high resolution and to identify the characteristics of a transparent digital audio channel. The theory and practice of selecting high sample rates such as 96 kHz and word lengths of up to 24 bit are examined. The relative importance of sampling rate and word size at various points in the recording, mastering, transmission, and replay chain is discussed. Encoding methods that can achieve high resolution are examined and compared, and the advantages of schemes such as lossless coding, noise shaping, oversampling, and matched preemphasis with noise shaping are described.

The major issues in decisions about the architecture of analog-to-digital converters (ADCs) for audio are discussed. In particular, some of the theoretical and practical issues associated with noise-shaping oversampling ADCs, both single-bit and multibit, are considered. The approach taken is to look at ADCs in general, and then to discuss how the requirements of audio allow the use of one approach and do not allow the use of another. Multibit oversampling noise-shaping ADCs are discussed in some detail because, at the time of writing, this architecture is increasingly dominant. Noise and signal-to-noise ratio (SNR) as well as other circuit-specific issues are not covered, except where they are affected significantly by ADC architecture choices.

Future Design Challenges for Audio Converter Products

Authors: Hayes, Julian; Pennock, John; Magrath, Anthony J.

Purveyors of ultrahigh-performance converter solutions are faced with an increasingly challenging technical and commercial environment. New modeling techniques and novel silicon structures are described that may provide potential solutions to some of the design and semiconductor process problems.

One-Bit Audio: An Overview

Authors: Reefman, Derk; Janssen, Erwin

An overview of 1-bit audio processing is presented. Several characteristics of the sigma-delta modulator (SDM), currently the most often used device to generate 1-bit code, are discussed, as well as some simple design methodologies of SDMs. It is shown that 1-bit audio is capable of carrying very high-quality audio. The total audio production chain, from recording to replay, is displayed and its feasibility demonstrated. Finally, some recent developments in the field of 1-bit audio codecs are summarized, which show a further improvement over the already excellent audio characteristics of the SDM.
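As a rough illustration of how a sigma-delta modulator turns a signal into 1-bit code, the following is a minimal first-order sketch in Python. It is not taken from the paper: the function name `sigma_delta_first_order` and the +1/-1 output convention are assumptions made for this example, and practical audio SDMs use higher-order loop filters and heavy oversampling.

```python
def sigma_delta_first_order(samples):
    """Encode samples in [-1.0, 1.0] as a list of +1/-1 values.

    Hypothetical first-order loop: a single integrator accumulates the
    difference between the input and the previous 1-bit output.
    """
    integrator = 0.0
    previous_output = 0.0
    bits = []
    for x in samples:
        integrator += x - previous_output
        # The 1-bit quantizer keeps only the sign of the integrator state.
        previous_output = 1.0 if integrator >= 0.0 else -1.0
        bits.append(int(previous_output))
    return bits


# A constant input of 0.5 gives a bitstream whose running average tracks 0.5.
stream = sigma_delta_first_order([0.5] * 1000)
average = sum(stream) / len(stream)
```

In this sketch the input is recovered by low-pass filtering (here, simply averaging) the bitstream, which is why the quantization error can be pushed out of the audio band.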

Lossless Compression of 1-Bit Audio

Authors: Knapen, Eric; Reefman, Derk; Janssen, Erwin; Bruekers, Fons

A coding technique that is used to losslessly compress 1-bit audio data is introduced. The individual steps in the encoding and decoding process are detailed, and an example illustrating the complete algorithm is provided. The lossless compression performance of the algorithm and its dependence on various genres of music are discussed. To circumvent the classical problem of playing time uncertainty, intimately connected to any lossless coding technique, the concept of a playing time estimator is introduced, and the feasibility of 1-bit compression is demonstrated.

Pulse-Code Modulation--An Overview

Authors: Lipshitz, Stanley P.; Vanderkooy, John

Pulse-code-modulation (PCM) encoding of digital audio signals has had a long and successful history in the era of the Compact Disc (CD). This brief survey paper argues that it forms the logical way to extend either the bandwidth or the signal-to-noise ratio of a digital audio system, or both, to encompass even higher resolution. Underpinning its operation are the iron-clad theorems that govern both the sampling-and-reconstruction and the dithered-quantizing processes that lie at its heart. It is adaptable enough to allow fully distortion-free noise shaping to be used if wordlength reduction is necessary, provided that the wordlength is not reduced so far as to cause quantizer overload when using proper dithering.
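The wordlength reduction with dithering mentioned above can be sketched roughly as follows. This is an illustrative example only, not the authors' method: it assumes triangular-PDF (TPDF) dither spanning ±1 LSB, uses the hypothetical function name `requantize_with_tpdf_dither`, and omits noise shaping entirely.

```python
import math
import random


def requantize_with_tpdf_dither(samples, bits, seed=0):
    """Reduce samples in [-1.0, 1.0) to signed integer codes of `bits` bits."""
    rng = random.Random(seed)
    full_scale = 2 ** (bits - 1)
    codes = []
    for x in samples:
        # Difference of two uniform variates: triangular PDF spanning +/-1 LSB.
        dither = rng.random() - rng.random()
        code = math.floor(x * full_scale + dither + 0.5)
        # Clamp to the representable range to model quantizer overload.
        codes.append(max(-full_scale, min(full_scale - 1, code)))
    return codes


# Half-scale sine wave reduced to 16-bit codes (assumed test signal).
signal = [0.5 * math.sin(2 * math.pi * n / 64) for n in range(256)]
codes16 = requantize_with_tpdf_dither(signal, bits=16)
```

The clamp in the loop corresponds to the overload condition in the abstract: if the wordlength is reduced too far, the signal plus dither no longer fits the quantizer range.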

Sample rates higher than 48 kHz allow freedom to tailor the audio response above 20 kHz in order to optimize the transient performance. A recording and reproduction chain may have pre- and postringing caused by brickwall band-limiting filters, but a single "apodizing" filter can substantially suppress the ringing and shorten the impulse response. The apodizing filter can be placed anywhere in the chain, but the mastering stage probably fits in best with current practice. The paper presents coefficients for a number of filters suitable for 96-kHz and 192-kHz sampled audio, and experimentation is encouraged. Some of the filters are symmetrical, but, taking into account the ear's sensitivity to preresponses, others have been optimized for a near-zero preresponse.

The MLP Lossless Compression System for PCM Audio

Authors: Gerzon, Michael A.; Craven, Peter G.; Stuart, J. Robert; Law, Malcolm J.; Wilson, Rhonda J.

Lossless compression provides bit-exact delivery of the original signal and is ideal where the highest possible confidence in the final sound quality is required. Meridian lossless packing (MLP) was adopted in 1999 as the lossless coding method used on DVD-Audio. MLP uses four principal strategies to reduce both the total quantity and the peak rate of encoded data. MLP can invert a matrix transformation losslessly; this allows a two-channel representation to be transmitted alongside a multichannel signal with a minimal increase in the data rate. It is illustrated how the characteristics of the incoming audio affect the coding performance, and MLP's versatility, achieved by the use of substreams and an open-ended metadata specification, is demonstrated.
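One general way a matrix transformation on integer samples can be made exactly invertible is to modify one channel at a time by a rounded mix of the others, so that the decoder can recompute and subtract the identical rounded value. The sketch below illustrates that idea only; it is not MLP's actual implementation, and the names `apply_lift` and `undo_lift`, the gains, and the sample values are all assumptions for this example.

```python
import math


def apply_lift(channels, target, coeffs):
    """Add a rounded mix of the other channels to channels[target].

    `coeffs` maps channel index -> gain and must not include `target`.
    """
    mix = sum(gain * channels[i] for i, gain in coeffs.items())
    out = list(channels)
    out[target] += math.floor(mix + 0.5)
    return out


def undo_lift(channels, target, coeffs):
    """Exactly invert apply_lift by subtracting the same rounded mix."""
    mix = sum(gain * channels[i] for i, gain in coeffs.items())
    out = list(channels)
    out[target] -= math.floor(mix + 0.5)
    return out


original = [1200, -345, 78]  # one integer PCM sample per channel (made-up values)
gains = {1: 0.5, 2: 0.25}    # hypothetical mix of channels 1 and 2 into channel 0
encoded = apply_lift(original, 0, gains)
decoded = undo_lift(encoded, 0, gains)
```

Because the channels used to compute the mix are left unchanged by the step, the decoder sees the same inputs as the encoder and the rounding error cancels exactly.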

[feature] The AES has recently formed a new technical committee, under the direction of Mark Sandler, concerned with semantic audio analysis. This committee held an inaugural workshop at the AES 115th Convention in New York last year, during which three key specialists presented their ideas on audio semantics. Dan Ellis of the Laboratory for Recognition and Organization of Speech and Audio (LabROSA), Columbia University, New York, provided a broad overview of the topic with examples of possible applications. Michael Casey, from the Centre for Computational Creativity at City University, London, described studio tools and techniques using MPEG-7. Jürgen Herre, of the Fraunhofer Institute for Integrated Circuits (IIS) in Germany, spoke on semantic audio analysis and metadata standards. (Recordings of this workshop are available on either MP3 CD-ROM or audio cassette from Conference Media Group via the AES website.)


Standards and Information Documents

AES Standards Committee News


116th Convention Preview, Berlin


     Exhibit Previews

Audio Gets Smart: A Workshop on Semantic Audio Analysis

Audio for Games: Let the Games Continue

117th Convention, San Francisco, Call for Papers

26th Conference, Baarn, Call for Papers


High-Resolution Audio


News of the Sections

Upcoming Meetings

Available Literature

Membership Information

Advertiser Internet Directory

AES Annual Report

Sections Contacts Directory

AES Conventions and Conferences


Cover & Sustaining Members List

VIP List & Editorial Staff
