Molar: Multimodal LLMs with Collaborative Filtering Alignment for Enhanced Sequential Recommendation

December 24, 2024
Authors: Yucong Luo, Qitao Qin, Hao Zhang, Mingyue Cheng, Ruiran Yan, Kefan Wang, Jie Ouyang
cs.AI

Abstract

Sequential recommendation (SR) systems have evolved significantly over the past decade, moving from traditional collaborative filtering to deep learning approaches and, more recently, to large language models (LLMs). While the adoption of LLMs has driven substantial advances, these models inherently lack collaborative filtering information: they rely primarily on textual content and neglect other modalities, and so fail to achieve optimal recommendation performance. To address this limitation, we propose Molar, a multimodal large language sequential recommendation framework that integrates multiple content modalities with ID information to capture collaborative signals effectively. Molar employs a multimodal LLM (MLLM) to generate unified item representations from both textual and non-textual data, which supports comprehensive multimodal modeling and enriches item embeddings. It also incorporates collaborative filtering signals through a post-alignment mechanism that aligns user representations from content-based and ID-based models, supporting precise personalization and robust performance. By combining multimodal content with collaborative filtering signals, Molar captures both user interests and contextual semantics, which improves recommendation accuracy. Extensive experiments show that Molar significantly outperforms traditional and LLM-based baselines, highlighting its strength in using multimodal data and collaborative signals for sequential recommendation tasks. The source code is available at https://anonymous.4open.science/r/Molar-8B06/.
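The abstract does not specify the objective used for post-alignment, so the following is only an illustrative sketch of one common way to align two sets of user representations: a symmetric InfoNCE-style contrastive loss, in which the content-based and ID-based embeddings of the same user form a positive pair and other users in the batch serve as negatives. The function name `alignment_loss`, the `temperature` parameter, and the assumption that both embeddings have already been projected to the same dimension are hypothetical and are not taken from the paper or its code.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def alignment_loss(content_users, id_users, temperature=0.1):
    """Symmetric InfoNCE-style loss between two views of the same users.

    ASSUMPTION: this is a generic contrastive objective chosen for
    illustration; it is not confirmed to be the loss used by Molar.
    content_users[i] and id_users[i] must describe the same user and
    share the same dimensionality.
    """
    c = [l2_normalize(v) for v in content_users]
    d = [l2_normalize(v) for v in id_users]
    n = len(c)
    # sim[i][j]: temperature-scaled cosine similarity between the
    # content-based embedding of user i and the ID-based embedding of user j.
    sim = [[dot(c[i], d[j]) / temperature for j in range(n)] for i in range(n)]

    def cross_entropy_diag(rows):
        # Mean cross-entropy where the correct "class" for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(s - m) for s in row))
            total += log_z - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*sim)]
    # Average the content-to-ID and ID-to-content directions.
    return 0.5 * (cross_entropy_diag(sim) + cross_entropy_diag(transposed))


if __name__ == "__main__":
    content = [[1.0, 0.0], [0.0, 1.0]]
    ids_matched = [[2.0, 0.0], [0.0, 3.0]]   # same directions per user
    ids_swapped = [[0.0, 3.0], [2.0, 0.0]]   # users paired incorrectly
    print(alignment_loss(content, ids_matched))
    print(alignment_loss(content, ids_swapped))
```

In this toy run the loss is lower when each user's two embeddings point in the same direction than when the pairing is swapped, which is the behavior an alignment objective of this kind is meant to reward.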
