beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems
September 16, 2024
Authors: Vojtěch Vančura, Pavel Kordík, Milan Straka
cs.AI
Abstract
Recommender systems often use text side information to improve their
predictions, especially in cold-start or zero-shot recommendation scenarios,
where traditional collaborative filtering approaches cannot be used. Many
approaches to text-mining side information for recommender systems have been
proposed over recent years, with sentence Transformers being the most prominent
one. However, these models are trained to predict semantic similarity without
utilizing interaction data with hidden patterns specific to recommender
systems. In this paper, we propose beeFormer, a framework for training sentence
Transformer models with interaction data. We demonstrate that our models
trained with beeFormer can transfer knowledge between datasets while
outperforming not only semantic similarity sentence Transformers but also
traditional collaborative filtering methods. We also show that training on
multiple datasets from different domains accumulates knowledge in a single
model, unlocking the possibility of training universal, domain-agnostic
sentence Transformer models to mine text representations for recommender
systems. We release the source code, trained models, and additional details
allowing replication of our experiments at
https://github.com/recombee/beeformer.
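The core idea in the abstract is to fit a text encoder so that similarities between item-text embeddings match similarities observed in interaction data, rather than semantic-similarity labels. The sketch below is only an illustration of that objective, not the authors' implementation (their training code and models are in the linked repository): it uses randomly generated toy data and a single linear layer over bag-of-words features as a stand-in for a sentence Transformer, fitted by plain gradient descent so that embedding dot products approximate an item-item cosine co-occurrence matrix.

```python
import numpy as np

# Toy sketch (assumed setup, not beeFormer itself): a linear "encoder"
# over bag-of-words item features is trained so that dot products of
# item embeddings reconstruct interaction-based item-item similarity.
rng = np.random.default_rng(0)

n_items, n_users, vocab, dim = 8, 50, 20, 4
X = rng.integers(0, 2, size=(n_items, vocab)).astype(float)    # item "texts"
R = rng.integers(0, 2, size=(n_users, n_items)).astype(float)  # interactions

# Target: cosine-normalized item-item co-occurrence from interaction data.
C = R.T @ R
d = np.sqrt(np.diag(C)) + 1e-8
S = C / np.outer(d, d)                       # symmetric, entries in [0, 1]

W = rng.normal(scale=0.1, size=(vocab, dim))  # the "encoder" parameters

def loss(W):
    E = X @ W                                # item embeddings from text
    return ((E @ E.T - S) ** 2).mean()

l0 = loss(W)
for _ in range(300):                         # gradient descent on the fit
    E = X @ W
    D = E @ E.T - S                          # symmetric residual
    W -= 0.01 * (4 * X.T @ (D @ E) / n_items ** 2)

print(loss(W) < l0)                          # fit to interactions improved
```

In this framing, semantic-similarity training would replace `S` with a matrix of text-similarity labels; beeFormer's point, per the abstract, is that deriving the target from interaction data captures recommendation-specific patterns that semantic labels miss.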