Versatile Framework for Song Generation with Prompt-based Control

April 27, 2025
Authors: Yu Zhang, Wenxiang Guo, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, Jingyu Lu, Rongjie Huang, Ruiyuan Zhang, Zhiqing Hong, Ziyue Jiang, Zhou Zhao
cs.AI

Abstract

Song generation focuses on producing controllable, high-quality songs from a variety of prompts. However, existing methods struggle to generate vocals and accompaniments that follow prompt-based control and are properly aligned with each other. They also fall short in supporting multiple tasks. To address these challenges, we introduce VersBand, a multi-task song generation framework for synthesizing high-quality, aligned songs with prompt-based control. VersBand comprises the following primary models: 1) VocalBand, a decoupled model that leverages the flow-matching method to generate singing styles, pitches, and mel-spectrograms, allowing fast, high-quality vocal generation with style control. 2) AccompBand, a flow-based transformer model that incorporates Band-MOE, which selects suitable experts to enhance quality, alignment, and control. This model generates controllable, high-quality accompaniments aligned with the vocals. 3) Two generation models, LyricBand for lyrics and MelodyBand for melodies, which complete the multi-task song generation system and allow extensive control based on multiple prompts. Experimental results demonstrate that VersBand outperforms baseline models across multiple song generation tasks on both objective and subjective metrics. Audio samples are available at https://VersBand.github.io.
