Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions
March 1, 2025
Authors: Jia Chen, Qian Dong, Haitao Li, Xiaohui He, Yan Gao, Shaosheng Cao, Yi Wu, Ping Yang, Chen Xu, Yao Hu, Qingyao Ai, Yiqun Liu
cs.AI
Abstract
User-generated content (UGC) communities, especially those featuring
multimodal content, improve user experiences by integrating visual and textual
information into results (or items). The challenge of improving user
experiences in complex systems with search and recommendation (S&R) services
has drawn significant attention from both academia and industry in recent years.
However, the lack of high-quality datasets has limited research progress on
multimodal S&R. To address the growing need for better S&R
services, we present Qilin, a novel multimodal information retrieval
dataset. The dataset is collected from Xiaohongshu, a popular
social platform with over 300 million monthly active users and an average
search penetration rate of over 70%. In contrast to existing datasets,
Qilin offers a comprehensive collection of user sessions with
heterogeneous results like image-text notes, video notes, commercial notes, and
direct answers, facilitating the development of advanced multimodal neural
retrieval models across diverse task settings. To better model user
satisfaction and support the analysis of heterogeneous user behaviors, we also
collect extensive APP-level contextual signals and genuine user feedback.
Notably, Qilin contains user-favored answers and their referenced results for
search requests that trigger the Deep Query Answering (DQA) module. This enables
not only the training and evaluation of a Retrieval-Augmented Generation (RAG)
pipeline, but also the exploration of how such a module affects users'
search behavior. Through comprehensive analysis and experiments, we provide
interesting findings and insights for further improving S&R systems. We hope
that Qilin will significantly contribute to the advancement of
multimodal content platforms with S&R services in the future.