Apollo: Band-sequence Modeling for High-Quality Audio Restoration
September 13, 2024
Authors: Kai Li, Yi Luo
cs.AI
Abstract
Audio restoration has become increasingly significant in modern society, not
only due to the demand for high-quality auditory experiences enabled by
advanced playback devices, but also because the growing capabilities of
generative audio models necessitate high-fidelity audio. Typically, audio
restoration is defined as a task of predicting undistorted audio from damaged
input, often trained using a GAN framework to balance perception and
distortion. Since audio degradation is primarily concentrated in mid- and
high-frequency ranges, especially due to codecs, a key challenge lies in
designing a generator capable of preserving low-frequency information while
accurately reconstructing high-quality mid- and high-frequency content.
Inspired by recent advancements in high-sample-rate music separation, speech
enhancement, and audio codec models, we propose Apollo, a generative model
designed for high-sample-rate audio restoration. Apollo employs an explicit
frequency band split module to model the relationships between different
frequency bands, allowing for more coherent and higher-quality restored audio.
Evaluated on the MUSDB18-HQ and MoisesDB datasets, Apollo consistently
outperforms existing SR-GAN models across various bit rates and music genres,
particularly excelling in complex scenarios involving mixtures of multiple
instruments and vocals. Apollo significantly improves music restoration quality
while maintaining computational efficiency. The source code for Apollo is
publicly available at https://github.com/JusperLee/Apollo.
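As a rough illustration of what an explicit frequency band split can look like, the sketch below cuts a complex spectrogram into consecutive sub-bands along the frequency axis and then reassembles them. This is not Apollo's implementation: the function names `split_bands` and `merge_bands`, the uniform band width of 64 bins, and the random test input are all assumptions made for illustration, and the abstract does not state how Apollo chooses its band boundaries.

```python
import numpy as np


def split_bands(spec, band_bins):
    """Split a (freq_bins, frames) spectrogram into consecutive sub-bands.

    Hypothetical helper: uses a uniform width of `band_bins` frequency bins.
    The last band is narrower when freq_bins is not a multiple of band_bins.
    """
    if band_bins <= 0:
        raise ValueError("band_bins must be positive")
    n_bins = spec.shape[0]
    # Band edges: 0, band_bins, 2 * band_bins, ..., n_bins
    edges = list(range(0, n_bins, band_bins)) + [n_bins]
    return [spec[lo:hi] for lo, hi in zip(edges[:-1], edges[1:])]


def merge_bands(bands):
    """Concatenate sub-bands back into a full (freq_bins, frames) spectrogram."""
    return np.concatenate(bands, axis=0)


# Illustrative input only: a random complex "spectrogram" with 1025 frequency
# bins (what an FFT size of 2048 would give) and 10 time frames.
rng = np.random.default_rng(0)
spec = rng.standard_normal((1025, 10)) + 1j * rng.standard_normal((1025, 10))

bands = split_bands(spec, band_bins=64)
restored = merge_bands(bands)
```

In a model built around this idea, each sub-band would typically be projected to a fixed-size feature vector so that the bands form a sequence whose relationships can be modeled, with low-frequency bands passed through largely intact and higher bands reconstructed. Splitting and merging as above is lossless, so any change in the output comes only from the processing applied between the two steps.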