ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation
February 19, 2025
Authors: Yupeng Hou, Jianmo Ni, Zhankui He, Noveen Sachdeva, Wang-Cheng Kang, Ed H. Chi, Julian McAuley, Derek Zhiyuan Cheng
cs.AI
Abstract
Generative recommendation (GR) is an emerging paradigm where user actions are
tokenized into discrete token patterns and autoregressively generated as
predictions. However, existing GR models tokenize each action independently,
assigning the same fixed tokens to identical actions across all sequences
without considering contextual relationships. This lack of context-awareness
can lead to suboptimal performance, as the same action may hold different
meanings depending on its surrounding context. To address this issue, we
propose ActionPiece to explicitly incorporate context when tokenizing action
sequences. In ActionPiece, each action is represented as a set of item
features, which serve as the initial tokens. Given the action sequence corpora,
we construct the vocabulary by merging feature patterns as new tokens, based on
their co-occurrence frequency both within individual sets and across adjacent
sets. Considering the unordered nature of feature sets, we further introduce
set permutation regularization, which produces multiple segmentations of action
sequences with the same semantics. Experiments on public datasets demonstrate
that ActionPiece consistently outperforms existing action tokenization methods,
improving NDCG@10 by 6.00% to 12.82%.
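The two ideas in the abstract, counting feature co-occurrence within and across adjacent sets, and permuting the unordered feature sets, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how co-occurrences are weighted or how merges are applied, so the sketch assumes plain unweighted counts and stops at selecting the most frequent pair. The function names `count_pairs` and `permuted_flatten`, the `"within"`/`"across"` key labels, and the toy corpus are all hypothetical.

```python
import random
from collections import Counter
from itertools import combinations, product


def count_pairs(corpus):
    """Count feature pairs inside each set and across adjacent sets.

    Assumption: every co-occurrence counts as 1. The abstract does not
    state the weighting used by ActionPiece.
    """
    counts = Counter()
    for sequence in corpus:
        for features in sequence:
            # Pairs inside one action's feature set (unordered).
            for a, b in combinations(sorted(features), 2):
                counts[("within", a, b)] += 1
        for left, right in zip(sequence, sequence[1:]):
            # Pairs spanning two adjacent actions (left action first).
            for a, b in product(sorted(left), sorted(right)):
                counts[("across", a, b)] += 1
    return counts


def permuted_flatten(sequence, rng):
    """Flatten a sequence of feature sets, shuffling features inside each set.

    Different shuffles give different token orders for the same actions,
    which is the intuition behind set permutation regularization.
    """
    tokens = []
    for features in sequence:
        order = sorted(features)  # deterministic base order before shuffling
        rng.shuffle(order)
        tokens.extend(order)
    return tokens


# Toy corpus: each sequence is a list of actions, each action a set of features.
corpus = [
    [{"a", "b"}, {"c", "d"}],
    [{"a", "b"}, {"c", "e"}],
    [{"a", "b"}],
]

counts = count_pairs(corpus)
best_pair, best_count = counts.most_common(1)[0]  # candidate for the next merge
flat = permuted_flatten(corpus[0], random.Random(0))
```

In a full tokenizer, the selected pair would be merged into a new vocabulary token and the counts updated repeatedly until the target vocabulary size is reached; that loop is omitted here because the abstract does not describe its details.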