DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions

April 20, 2025
Authors: Chin-Yun Yu, Marco A. Martínez-Ramírez, Junghyun Koo, Ben Hayes, Wei-Hsiang Liao, György Fazekas, Yuki Mitsufuji
cs.AI

Abstract

This study introduces a novel and interpretable model, DiffVox, for matching vocal effects in music production. DiffVox, short for "Differentiable Vocal Fx", integrates parametric equalisation, dynamic range control, delay, and reverb with efficient differentiable implementations, enabling gradient-based optimisation for parameter estimation. Vocal presets are retrieved from two datasets, comprising 70 tracks from MedleyDB and 365 tracks from a private collection. Analysis of parameter correlations highlights strong relationships between effects and parameters: the high-pass and low-shelf filters often act together to shape the low end, and the delay time correlates with the intensity of the delayed signals. Principal component analysis reveals connections to McAdams' timbre dimensions, where the most crucial component modulates perceived spaciousness while the secondary components influence spectral brightness. Statistical testing confirms the non-Gaussian nature of the parameter distribution, highlighting the complexity of the vocal effects space. These initial findings on the parameter distributions set the foundation for future research in vocal effects modelling and automatic mixing. Our source code and datasets are accessible at https://github.com/SonyResearch/diffvox.
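To make the idea of gradient-based effect matching concrete, the sketch below is a minimal, hypothetical illustration rather than the actual DiffVox implementation or its API: a toy differentiable chain consisting of a broadband gain and a frequency-domain delay is fitted to a processed reference signal by gradient descent. The function and parameter names (`effect_chain`, `gain_db`, `delay_ms`, `delay_mix`), the sample rate, and the use of a plain MSE loss are all assumptions made for brevity; the model described in the paper additionally includes parametric EQ, dynamic range control, and reverb.

```python
# Minimal, hypothetical sketch of gradient-based effect matching (not the
# official DiffVox code): fit the parameters of a toy differentiable effect
# chain so that its output matches a processed reference signal.

import math
import torch

torch.manual_seed(0)
SR = 16_000   # sample rate assumed for this toy example
N = SR        # one second of audio


def effect_chain(x, gain_db, delay_ms, delay_mix):
    """Toy chain: broadband gain followed by a wet/dry frequency-domain delay.

    The real DiffVox model combines parametric EQ, dynamic range control,
    delay, and reverb; only two simple blocks are sketched here.
    """
    y = x * 10.0 ** (gain_db / 20.0)  # dB -> linear gain

    # Fractional delay applied in the frequency domain so that the delay time
    # itself remains a continuous, differentiable parameter.
    X = torch.fft.rfft(y, n=2 * N)                 # zero-pad to avoid wrap-around
    freqs = torch.fft.rfftfreq(2 * N, d=1.0 / SR)  # bin frequencies in Hz
    tau = delay_ms * 1e-3                          # delay time in seconds
    wet = torch.fft.irfft(X * torch.exp(-2j * math.pi * freqs * tau), n=2 * N)[:N]

    return y + torch.sigmoid(delay_mix) * wet      # wet/dry mix


# Stand-in "dry vocal" and a reference rendered with hidden target parameters.
dry = torch.randn(N)
with torch.no_grad():
    target = effect_chain(dry, torch.tensor(-6.0), torch.tensor(250.0), torch.tensor(1.0))

# Learnable parameters, optimised to match the reference.
params = {
    "gain_db": torch.tensor(0.0, requires_grad=True),
    "delay_ms": torch.tensor(100.0, requires_grad=True),
    "delay_mix": torch.tensor(0.0, requires_grad=True),
}
opt = torch.optim.Adam(params.values(), lr=1.0)

for _ in range(200):
    opt.zero_grad()
    pred = effect_chain(dry, **params)
    loss = torch.mean((pred - target) ** 2)  # plain MSE for brevity; spectral losses are common in practice
    loss.backward()
    opt.step()

print({k: round(float(v), 2) for k, v in params.items()}, "final loss:", float(loss))
```

The frequency-sampling trick above is one common way to keep otherwise discrete quantities such as delay times differentiable; whether DiffVox uses exactly this formulation is not stated in the abstract, and on a noise-like input the MSE objective over delay time can be highly non-convex, which is why more elaborate losses and parameterisations are used in practice.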
