SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

November 17, 2024
Authors: Hongrui Jia, Chaoya Jiang, Haiyang Xu, Wei Ye, Mengfan Dong, Ming Yan, Ji Zhang, Fei Huang, Shikun Zhang
cs.AI

Abstract

As language models continue to scale, Large Language Models (LLMs) have exhibited emergent In-Context Learning (ICL) capabilities, enabling them to solve language tasks when a few in-context demonstrations (ICDs) are prefixed as context. Inspired by these advances, researchers have extended these techniques to develop Large Multimodal Models (LMMs) with ICL capabilities. However, existing LMMs face a critical issue: they often fail to leverage the visual context in multimodal demonstrations and instead simply follow textual patterns. This indicates that LMMs do not achieve effective alignment between multimodal demonstrations and model outputs. To address this problem, we propose Symbol Demonstration Direct Preference Optimization (SymDPO). Specifically, SymDPO breaks with the traditional paradigm of constructing multimodal demonstrations by replacing the text answers within instances with random symbols. This forces the model to carefully understand the demonstration images and establish a relationship between the images and the symbols in order to answer questions correctly. We validate the effectiveness of this method on multiple benchmarks, demonstrating that with SymDPO, LMMs can more effectively understand the multimodal context within examples and use this knowledge to answer questions better.
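
The symbol-replacement idea described in the abstract can be illustrated with a small sketch. This is only one plausible reading of the abstract, not the authors' data pipeline: the dictionary keys (`image`, `question`, `answer`), the helper names, the five-letter symbol format, and the choice to map each distinct answer to the same symbol are all assumptions made for illustration. Building the preferred and rejected responses for the DPO step is not shown.

```python
import random
import string


def random_symbol(rng, length=5):
    """Return a meaningless lowercase token to stand in for a text answer."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def symbolize_demonstrations(demos, seed=0):
    """Replace each distinct text answer with a random symbol.

    `demos` is a list of dicts with the hypothetical keys
    "image", "question", and "answer". The same answer always maps to
    the same symbol (an assumption, not taken from the paper).
    Returns the rewritten demonstrations and the answer -> symbol mapping.
    """
    rng = random.Random(seed)
    mapping = {}
    used = set()
    symbolized = []
    for demo in demos:
        answer = demo["answer"]
        if answer not in mapping:
            symbol = random_symbol(rng)
            # Redraw on the rare collision so distinct answers stay distinct.
            while symbol in used:
                symbol = random_symbol(rng)
            used.add(symbol)
            mapping[answer] = symbol
        # Copy the demonstration so the input list is left unchanged.
        symbolized.append({**demo, "answer": mapping[answer]})
    return symbolized, mapping


demos = [
    {"image": "cat_1.jpg", "question": "What animal is this?", "answer": "cat"},
    {"image": "dog_1.jpg", "question": "What animal is this?", "answer": "dog"},
    {"image": "cat_2.jpg", "question": "What animal is this?", "answer": "cat"},
]
symbolized, mapping = symbolize_demonstrations(demos)
for demo in symbolized:
    print(demo["image"], "->", demo["answer"])
```

Because the symbols carry no meaning on their own, a model shown these demonstrations can only produce the right symbol for a new query image by relating it to the demonstration images, which is the behavior the abstract says SymDPO is meant to encourage.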

November 21, 2024