AudioBERT: Audio Knowledge Augmented Language Model
September 12, 2024
Authors: Hyunjong Ok, Suho Yoo, Jaeho Lee
cs.AI
Abstract
Recent studies have identified that language models, pretrained on text-only
datasets, often lack elementary visual knowledge, e.g., colors of
everyday objects. Motivated by this observation, we ask whether a similar
shortcoming exists in terms of auditory knowledge. To answer this
question, we construct a new dataset called AuditoryBench, which consists of
two novel tasks for evaluating auditory knowledge. Based on our analysis using
the benchmark, we find that language models also suffer from a severe lack of
auditory knowledge. To address this limitation, we propose AudioBERT, a novel
method to augment the auditory knowledge of BERT through a retrieval-based
approach. First, we detect auditory knowledge spans in prompts to query our
retrieval model efficiently. Then, we inject audio knowledge into BERT and
activate low-rank adaptation when audio knowledge is
required. Our experiments demonstrate that AudioBERT is quite effective,
achieving superior performance on AuditoryBench. The dataset and code are
available at https://github.com/HJ-Ok/AudioBERT.
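The abstract's pipeline — detect an auditory-knowledge span, retrieve a matching audio embedding, and inject it into the language model — can be sketched in miniature. This is a toy illustration, not the authors' implementation: the keyword-based `detect_span`, the hand-written toy embeddings, and the `AUDIO_STORE` lookup all stand in for AudioBERT's trained span detector, its CLAP-style audio-text retriever, and the BERT-plus-LoRA injection step.

```python
# Toy sketch of a retrieval-based audio-knowledge pipeline.
# All names and vectors here are illustrative stand-ins, not AudioBERT's API.
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Step 1: detect the auditory-knowledge span in the prompt.
# A keyword matcher stands in for the trained span-detection model.
AUDITORY_KEYWORDS = {"bark", "meow", "siren", "thunder"}

def detect_span(tokens):
    for i, tok in enumerate(tokens):
        if tok in AUDITORY_KEYWORDS:
            return (i, i + 1)  # [start, end) indices of the detected span
    return None

# Step 2: query a retriever with the span text and fetch the
# best-matching audio embedding (toy 3-d vectors instead of CLAP).
AUDIO_STORE = {
    "dog barking": [0.9, 0.1, 0.0],
    "cat meowing": [0.1, 0.9, 0.0],
    "thunder rumbling": [0.0, 0.2, 0.9],
}

def embed_text(span_text):
    # Stand-in text encoder mapping known words into the same toy space.
    table = {"bark": [1.0, 0.0, 0.0], "meow": [0.0, 1.0, 0.0],
             "thunder": [0.0, 0.0, 1.0]}
    return table.get(span_text, [0.0, 0.0, 0.0])

def retrieve_audio(span_text):
    query = embed_text(span_text)
    return max(AUDIO_STORE, key=lambda k: cosine(query, AUDIO_STORE[k]))

# Step 3: injection point. In the real system the retrieved audio
# embedding is injected into BERT and LoRA adapters are switched on;
# here we simply return the tokens with the retrieved vector.
def augment(tokens):
    span = detect_span(tokens)
    if span is None:
        return tokens, None  # no audio knowledge needed for this prompt
    start, end = span
    audio_key = retrieve_audio(" ".join(tokens[start:end]))
    return tokens, AUDIO_STORE[audio_key]

tokens = "the sound a dog makes is a bark".split()
_, audio_vec = augment(tokens)
print(audio_vec)  # the stored embedding closest to the detected span
```

The key design point the sketch mirrors is conditional augmentation: retrieval and adaptation are engaged only when an auditory span is actually detected, so prompts without audio knowledge pass through unchanged.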