AudioBERT: Audio Knowledge Augmented Language Model
September 12, 2024
Authors: Hyunjong Ok, Suho Yoo, Jaeho Lee
cs.AI
Abstract
Recent studies have identified that language models, pretrained on text-only
datasets, often lack elementary visual knowledge, e.g., colors of
everyday objects. Motivated by this observation, we ask whether a similar
shortcoming exists in terms of auditory knowledge. To answer this
question, we construct a new dataset called AuditoryBench, which consists of
two novel tasks for evaluating auditory knowledge. Based on our analysis using
the benchmark, we find that language models also suffer from a severe lack of
auditory knowledge. To address this limitation, we propose AudioBERT, a novel
method to augment the auditory knowledge of BERT through a retrieval-based
approach. First, we detect auditory knowledge spans in prompts to query our
retrieval model efficiently. Then, we inject audio knowledge into BERT and
activate low-rank adaptation when audio knowledge is
required. Our experiments demonstrate that AudioBERT is quite effective,
achieving superior performance on AuditoryBench. The dataset and code are
available at https://github.com/HJ-Ok/AudioBERT.
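The abstract's pipeline — detect an auditory-knowledge span, retrieve a matching audio embedding, and inject it into the language model — can be sketched in miniature. This is a toy illustration, not the authors' implementation: the keyword-based `detect_span`, the hand-written toy embeddings, and the `AUDIO_STORE` lookup all stand in for AudioBERT's trained span detector, its CLAP-style audio-text retriever, and the BERT-plus-LoRA injection step.

```python
# Toy sketch of a retrieval-based audio-knowledge pipeline.
# All names and vectors here are illustrative stand-ins, not AudioBERT's API.
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Step 1: detect the auditory-knowledge span in the prompt.
# A keyword matcher stands in for the trained span-detection model.
AUDITORY_KEYWORDS = {"bark", "meow", "siren", "thunder"}

def detect_span(tokens):
    for i, tok in enumerate(tokens):
        if tok in AUDITORY_KEYWORDS:
            return (i, i + 1)  # [start, end) indices of the detected span
    return None

# Step 2: query a retriever with the span text and fetch the
# best-matching audio embedding (toy 3-d vectors instead of CLAP).
AUDIO_STORE = {
    "dog barking": [0.9, 0.1, 0.0],
    "cat meowing": [0.1, 0.9, 0.0],
    "thunder rumbling": [0.0, 0.2, 0.9],
}

def embed_text(span_text):
    # Stand-in text encoder mapping known words into the same toy space.
    table = {"bark": [1.0, 0.0, 0.0], "meow": [0.0, 1.0, 0.0],
             "thunder": [0.0, 0.0, 1.0]}
    return table.get(span_text, [0.0, 0.0, 0.0])

def retrieve_audio(span_text):
    query = embed_text(span_text)
    return max(AUDIO_STORE, key=lambda k: cosine(query, AUDIO_STORE[k]))

# Step 3: injection point. In the real system the retrieved audio
# embedding is injected into BERT and LoRA adapters are switched on;
# here we simply return the tokens with the retrieved vector.
def augment(tokens):
    span = detect_span(tokens)
    if span is None:
        return tokens, None  # no audio knowledge needed for this prompt
    start, end = span
    audio_key = retrieve_audio(" ".join(tokens[start:end]))
    return tokens, AUDIO_STORE[audio_key]

tokens = "the sound a dog makes is a bark".split()
_, audio_vec = augment(tokens)
print(audio_vec)  # the stored embedding closest to the detected span
```

The key design point the sketch mirrors is conditional augmentation: retrieval and adaptation are engaged only when an auditory span is actually detected, so prompts without audio knowledge pass through unchanged.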