Drowning in Documents: Consequences of Scaling Reranker Inference

Abstract

Rerankers, typically cross-encoders, are often used to re-score the documents retrieved by cheaper initial IR systems. This is because, though expensive, rerankers are assumed to be more effective. We challenge this assumption by measuring reranker performance for full retrieval, not just re-scoring first-stage retrieval. Our experiments reveal a surprising trend: the best existing rerankers provide diminishing returns when scoring progressively more documents and actually degrade quality beyond a certain limit. In fact, in this setting, rerankers can frequently assign high scores to documents with no lexical or semantic overlap with the query. We hope that our findings will spur future research to improve reranking.
