Authors: Mcquillan, Jacob; van Walstijn, Maarten; Parker, Julian; Ortiz, Miguel
Reproducing the main features of a spring reverb tank impulse response across the hearing range with a physical model presents unique challenges because of the high levels of coupling between the spring’s vibrational polarizations. Previous attempts based on a model that includes helix angle can accurately simulate helical spring vibrations but show discrepancies in reproducing measured impulse responses because of heavy simplifications in specifying boundary conditions and input/output mechanisms. This paper presents an improved physical modeling approach that incorporates magnetic bead dynamics and frequency-dependent damping. The beads are modeled as coupled beams using a thin form of the spring equations that reduces to thin beam equations in the absence of curvature. With the correct geometric alignment between the beads and the spring also ensured, the model’s response to rotationally driving the input bead is shown to display the expected mixture of waves traveling along the different spring polarizations. To achieve a damping profile similar to that observed in measured impulse responses, different damping parameters are set for each polarization, leading to nonproportional damping and multiple decay rates within small frequency bands. With the new formulation, the main features of measured impulse responses are now reproduced well.
Download: PDF (46.36 MB)
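The link between per-polarization damping and nonproportional damping can be illustrated with a toy two-degree-of-freedom system. This is a minimal sketch, not the paper's spring model: the unit mass matrix, the stiffness matrix `K`, and the damping values in `C` are made-up numbers chosen only for illustration. It shows that giving each coupled coordinate its own damping coefficient makes the damping matrix fail to commute with the stiffness matrix, and that two modes close in frequency then decay at different rates.

```python
import numpy as np

# Hypothetical 2-DOF system: two "polarizations" with unit mass,
# nearly equal stiffness, and weak coupling (illustrative values only).
K = np.array([[100.0, 5.0],
              [5.0, 104.0]])
# A different damping coefficient per polarization (assumed values).
C = np.diag([0.2, 1.0])

def is_proportional(C, K):
    """With a unit mass matrix, damping is classical iff C and K commute."""
    return bool(np.allclose(C @ K, K @ C))

def modal_decay(C, K):
    """Return (frequencies in rad/s, decay rates in 1/s) of x'' + C x' + K x = 0."""
    n = K.shape[0]
    # First-order state-space form with state [x, x'].
    A = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-K, -C]])
    eig = np.linalg.eigvals(A)
    eig = eig[eig.imag > 0]            # keep one eigenvalue per conjugate pair
    eig = eig[np.argsort(eig.imag)]    # sort by oscillation frequency
    return eig.imag, -eig.real

freqs, rates = modal_decay(C, K)
print("proportional damping:", is_proportional(C, K))
print("frequencies (rad/s):", freqs)
print("decay rates (1/s):", rates)
```

In this toy case the two modal frequencies are close together while their decay rates differ, which is the kind of behavior the abstract describes as multiple decay rates within small frequency bands.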
Authors: Giudici, Gregorio Andrea; Caspe, Franco; Gabrielli, Leonardo; Squartini, Stefano; Turchet, Luca
This paper investigates the feasibility of running neural audio generative models on embedded systems, by comparing the performance of various models and evaluating their trade-offs in audio quality, inference speed, and memory usage. This work focuses on differentiable digital signal processing (DDSP) models, due to their hybrid architecture, which combines the efficiency and interoperability of traditional DSP with the flexibility of neural networks. In addition, the application of knowledge distillation (KD) is explored to improve the performance of smaller models. Two types of distillation strategies were implemented and evaluated: audio distillation and control distillation. These methods were applied to three foundation DDSP generative models that integrate Harmonic-Plus-Noise, FM, and Wavetable synthesis. The results demonstrate the overall effectiveness of KD: the authors were able to train student models that are up to 100× smaller than their teacher counterparts while maintaining comparable performance and significantly improving inference speed and memory efficiency. However, cases where KD failed to improve or even degraded student performance have also been observed. The authors provide a critical reflection on the advantages and limitations of KD, exploring its application in diverse use cases and emphasizing the need for carefully tailored strategies to maximize its potential.
Download: PDF (7.63 MB)
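The two distillation strategies named in the abstract can be contrasted with a toy example. The abstract does not define the losses, so this sketch rests on an assumed reading: control distillation matches the teacher's synthesizer control values, and audio distillation matches the audio rendered from them. The static harmonic synthesizer, the mean-squared-error losses, and all names and values below are hypothetical.

```python
import math

SAMPLE_RATE = 16000  # assumed sample rate for the toy example

def harmonic_synth(f0, amplitudes, n_samples=256):
    """Render a static sum of harmonics from control values (toy DDSP-style synth)."""
    return [
        sum(a * math.sin(2 * math.pi * f0 * (k + 1) * n / SAMPLE_RATE)
            for k, a in enumerate(amplitudes))
        for n in range(n_samples)
    ]

def mse(x, y):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def control_distillation_loss(student_amps, teacher_amps):
    """Student imitates the teacher's synthesizer controls directly."""
    return mse(student_amps, teacher_amps)

def audio_distillation_loss(student_amps, teacher_amps, f0=220.0):
    """Student imitates the audio rendered from the teacher's controls."""
    return mse(harmonic_synth(f0, student_amps),
               harmonic_synth(f0, teacher_amps))

teacher = [0.6, 0.3, 0.1]    # hypothetical teacher harmonic amplitudes
student = [0.5, 0.35, 0.1]   # hypothetical student prediction
print("control loss:", control_distillation_loss(student, teacher))
print("audio loss:", audio_distillation_loss(student, teacher))
```

The control loss compares a handful of parameters, while the audio loss compares rendered samples and therefore has to pass through the synthesizer.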
Authors: Lee, Sungho; Martínez-Ramírez, Marco A.; Liao, Wei-Hsiang; Uhlich, Stefan; Fabbro, Giorgio; Lee, Kyogu; Mitsufuji, Yuki
Reverse engineering of music mixes aims to uncover how dry source signals are processed and combined to produce a final mix. In this paper, prior works are extended to reflect the compositional nature of mixing and search for a graph of audio processors. First, a mixing console is constructed, applying all available processors to every track and subgroup. With differentiable processor implementations, their parameters are optimized with gradient descent. Next, the process of removing negligible processors and fine-tuning the remaining ones is repeated. This way, the quality of the full mixing console can be preserved while removing approximately two-thirds of the processors. The proposed method can be used not only to analyze individual music mixes but also to collect large-scale graph data for downstream tasks such as automatic mixing. Especially for the latter purpose, efficient implementation of the search is crucial. To this end, an efficient batch-processing method that computes multiple processors in parallel is presented. Also, the “dry/wet” parameter of each processor is exploited to accelerate the search. Extensive quantitative and qualitative analyses are conducted to evaluate the proposed method’s performance, behavior, and computational cost.
Download: PDF (42.15 MB)
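The role of the “dry/wet” parameter in removing negligible processors can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' implementation: the processors, the serial chain, the wet weights, and the pruning threshold are all hypothetical, and the fine-tuning step that the abstract describes after each pruning round is omitted.

```python
def apply_processor(x, effect, wet):
    """Blend dry and processed signals: (1 - wet) * x + wet * effect(x)."""
    y = effect(x)
    return [(1.0 - wet) * d + wet * p for d, p in zip(x, y)]

def run_chain(x, chain):
    """Apply (name, effect, wet) processors in series."""
    for _name, effect, wet in chain:
        x = apply_processor(x, effect, wet)
    return x

def prune(chain, threshold=0.05):
    """Drop processors whose wet weight is below a hypothetical threshold."""
    return [p for p in chain if p[2] >= threshold]

def gain(x):
    return [0.5 * s for s in x]

def clip(x):
    return [max(-0.8, min(0.8, s)) for s in x]

def invert(x):
    return [-s for s in x]

# Wet weights as they might look after optimization (made-up values).
chain = [("gain", gain, 0.9), ("clip", clip, 0.01), ("invert", invert, 0.0)]
pruned = prune(chain)
signal = [0.0, 0.5, 1.0, -1.0]

print([name for name, _, _ in pruned])
print(run_chain(signal, chain))
print(run_chain(signal, pruned))
```

A processor whose wet weight is near zero passes its input through almost unchanged, so removing it barely alters the mix.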
Authors: Baker, Thomas; Bennett, Christopher
Antiderivative antialiasing (ADAA) has proven to be an effective approach for reducing aliasing in mathematically defined nonlinear functions. This paper explores the application of ADAA to Chebyshev-based generalized Hammerstein models, which are used for black-box modeling of nonlinearities in digital audio effects. The Chebyshev-based model eliminates certain matrix operations and therefore offers advantages over polynomial-based models. By integrating ADAA, this enhanced Chebyshev model achieves substantial aliasing reductions, comparable to upsampling. Both explicit and recursive implementations of a Chebyshev model are developed and evaluated for alias reduction, waveshape fidelity, and computational efficiency. The results demonstrate the potential of ADAA to enhance Chebyshev polynomials for modeling of nonlinear systems, making it a valuable technique for real-time audio processing.
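As a rough illustration of how ADAA applies to a Chebyshev nonlinearity, the sketch below implements standard first-order ADAA for a single polynomial T_n. It is a minimal example under stated assumptions and not the paper's model: it covers only the static nonlinearity of one branch (no Hammerstein filters), it uses the known antiderivative of T_n for n ≥ 2, and the function names and the `eps` threshold are hypothetical.

```python
import math

def cheb_T(n, x):
    """Chebyshev polynomial of the first kind T_n(x) via the three-term recurrence."""
    t_prev, t_cur = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def cheb_T_antiderivative(n, x):
    """An antiderivative of T_n(x), valid for n >= 2."""
    return 0.5 * (cheb_T(n + 1, x) / (n + 1) - cheb_T(n - 1, x) / (n - 1))

def adaa1(signal, n, eps=1e-6):
    """First-order ADAA of y = T_n(x): divided difference of the antiderivative."""
    out, x_prev = [], 0.0
    for x in signal:
        dx = x - x_prev
        if abs(dx) < eps:
            # Ill-conditioned division: evaluate the nonlinearity at the midpoint.
            y = cheb_T(n, 0.5 * (x + x_prev))
        else:
            y = (cheb_T_antiderivative(n, x)
                 - cheb_T_antiderivative(n, x_prev)) / dx
        out.append(y)
        x_prev = x
    return out

# Usage: a 5 kHz sine at 44.1 kHz shaped by T_3 (illustrative values).
sine = [0.9 * math.sin(2 * math.pi * 5000 * k / 44100) for k in range(64)]
shaped = adaa1(sine, 3)
print(shaped[:4])
```

The divided difference equals the average of T_n over the interval between consecutive input samples, which is what attenuates the high-frequency content responsible for aliasing.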
Authors: Giampiccolo, Riccardo; Ravasi, Stefano; Bernardini, Alberto
Analog audio effects can be realized as digital audio effects according to the formalism of virtual analog modeling. Among white box techniques, wave digital filters (WDFs) have lately been shown to be instrumental for digitizing complex nonlinear analog circuits because they are characterized by interesting properties that allow for efficient implementations. However, building audio plug-ins requires manually designing WDFs and graphical user interfaces, tasks that are time consuming, thus hindering rapid prototyping. In this paper, the authors propose VIOLA, a framework for the automatic generation of audio plug-ins based on WDFs. Starting from a SPICE netlist, VIOLA generates an audio plug-in, taking advantage of the latest advancements in WDF theory and of the MATLAB Audio Toolbox. In this release, the authors take into account circuits containing only diodes as nonlinear elements, but VIOLA is already structured to accommodate other nonlinearities. The proposed framework is tested for the implementation of two famous audio effects, namely the Electro-Harmonix Op Amp Big Muff Pi and the Digitech Overdrive Preamp 250, paving the way toward the fast prototyping of virtual analog audio plug-ins.
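For readers unfamiliar with WDFs, the sketch below hand-codes the simplest possible case: a linear series R-C low-pass filter. It is not output from VIOLA and contains no diodes or other nonlinear elements; the component values, sample rate, and function name are hypothetical. The capacitor is an adapted one-port with port resistance 1/(2·fs·C), and the resistive voltage source is the root element that reflects the incoming wave.

```python
def rc_lowpass_wdf(source, R=1000.0, C=1e-6, fs=48000.0):
    """Simulate the capacitor voltage of a series R-C circuit with a WDF."""
    Rp = 1.0 / (2.0 * fs * C)    # capacitor port resistance (bilinear transform)
    k = (R - Rp) / (R + Rp)      # reflection coefficient of the resistive source
    state = 0.0                  # capacitor memory: its previous incident wave
    out = []
    for e in source:
        b_cap = state                      # capacitor reflects its stored wave
        a_cap = e + k * (b_cap - e)        # wave sent back by the resistive source
        out.append(0.5 * (a_cap + b_cap))  # port voltage v = (a + b) / 2
        state = a_cap                      # store incident wave for next sample
    return out

# Usage: step response over 2000 samples (about 41 time constants for these values).
step = rc_lowpass_wdf([1.0] * 2000)
print(step[0], step[-1])
```

Each element only exchanges incident and reflected waves with its neighbor, which is what makes the structure modular enough to be assembled automatically from a circuit description.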
Authors: Dzwonczyk, Luke; Cella, Carmine-Emanuele; Ban, David
This paper presents the first steps toward the creation of a tool that enables artists to create music visualizations using pretrained, generative, machine learning models. First, the authors investigate the application of network bending, the process of applying transforms within the layers of a generative network, to image generation diffusion models by utilizing a range of point-wise, tensor-wise, and morphological operators. A number of visual effects that result from various operators, including some that are not easily recreated with standard image editing tools, are identified. The authors find that this process allows for continuous, fine-grained control of image generation, which can be helpful for creative applications. Next, music-reactive videos are generated using Stable Diffusion by passing audio features as parameters to network bending operators. Finally, the authors comment on certain transforms that radically shift the image and the possibilities of learning more about the latent space of Stable Diffusion based on these transforms. This paper is an extended version of the paper “Network Bending of Diffusion Models,” which appeared in the 27th International Conference on Digital Audio Effects.
Download: PDF (39.89 MB)
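The idea of passing an audio feature as a parameter to a network bending operator can be shown on a toy network. This sketch does not use Stable Diffusion or any diffusion model: the two-layer generator, its random weights, the point-wise scaling operator, and the choice of RMS level as the audio feature are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))   # hypothetical first-layer weights
W2 = rng.standard_normal((3, 8))   # hypothetical second-layer weights

def rms(frame):
    """Root-mean-square level of one audio frame."""
    return float(np.sqrt(np.mean(np.square(frame))))

def generate(z, bend=None):
    """Tiny two-layer generator; `bend` transforms the hidden activations."""
    h = np.tanh(W1 @ z)
    if bend is not None:
        h = bend(h)          # network bending: edit activations inside the network
    return W2 @ h

def scale_op(amount):
    """Point-wise bending operator: multiply every activation by `amount`."""
    return lambda h: amount * h

z = rng.standard_normal(4)                                   # fixed latent input
audio_frame = 0.5 * np.sin(2 * np.pi * 440 * np.arange(512) / 44100)
level = rms(audio_frame)                                     # audio feature
plain = generate(z)
bent = generate(z, bend=scale_op(1.0 + level))               # feature drives the operator
print(level, plain, bent)
```

Recomputing `level` for each audio frame and regenerating with the same latent input would yield an output sequence that follows the loudness of the music.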