

Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling

September 23, 2024
作者: Benjamin Clavié, Antoine Chaffin, Griffin Adams
cs.AI

Abstract

Over the last few years, multi-vector retrieval methods, spearheaded by ColBERT, have become an increasingly popular approach to Neural IR. By storing representations at the token level rather than at the document level, these methods have demonstrated very strong retrieval performance, especially in out-of-domain settings. However, the storage and memory requirements necessary to store the large number of associated vectors remain an important drawback, hindering practical adoption. In this paper, we introduce a simple clustering-based token pooling approach to aggressively reduce the number of vectors that need to be stored. This method can reduce the space & memory footprint of ColBERT indexes by 50% with virtually no retrieval performance degradation. This method also allows for further reductions, reducing the vector count by 66% to 75%, with degradation remaining below 5% on a vast majority of datasets. Importantly, this approach requires no architectural change nor query-time processing, and can be used as a simple drop-in during indexation with any ColBERT-like model.
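
The abstract describes the method only at a high level, so the following is a minimal sketch of what clustering-based token pooling at indexing time could look like. It assumes hierarchical (Ward) clustering over each document's token embeddings, mean-pooling within each cluster, and a hypothetical `pool_factor` knob (2 keeps roughly 50% of the vectors, 3-4 keeps roughly 25-33%); the function name, parameters, and clustering choice are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
import torch
from scipy.cluster.hierarchy import fcluster, linkage


def pool_document_embeddings(token_embeddings: torch.Tensor,
                             pool_factor: int = 2) -> torch.Tensor:
    """Cluster one document's token embeddings and mean-pool each cluster.

    token_embeddings: (num_tokens, dim) float tensor of ColBERT-style vectors.
    pool_factor: hypothetical compression knob -- 2 keeps ~50% of the vectors,
                 3 keeps ~33%, 4 keeps ~25%.
    """
    num_tokens = token_embeddings.shape[0]
    target_clusters = max(num_tokens // pool_factor, 1)
    if target_clusters >= num_tokens:
        return token_embeddings  # too few tokens to pool

    # Hierarchical (Ward) clustering over the raw token vectors;
    # cut the tree so that at most `target_clusters` groups remain.
    tree = linkage(token_embeddings.cpu().numpy(), method="ward")
    labels = fcluster(tree, t=target_clusters, criterion="maxclust")

    # Mean-pool the vectors inside each cluster, then re-normalize so the
    # pooled vectors stay unit-length for ColBERT-style MaxSim scoring.
    pooled = []
    for cluster_id in np.unique(labels):
        mask = torch.from_numpy(labels == cluster_id)
        pooled.append(token_embeddings[mask].mean(dim=0))
    return torch.nn.functional.normalize(torch.stack(pooled), dim=-1)
```

Because the pooling runs per document at indexing time and leaves query encoding and scoring untouched, a routine like this could be applied as a drop-in step between encoding and index construction.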
