Multivector Reranking in the Era of Strong First-Stage Retrievers

Martinico, Silvio; Nardini, Franco Maria; Rulli, Cosimo; Venturini, Rossano

Computer Science > Information Retrieval

arXiv:2601.05200 (cs)

[Submitted on 8 Jan 2026]

Title: Multivector Reranking in the Era of Strong First-Stage Retrievers

Authors: Silvio Martinico, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini

Abstract: Learned multivector representations power modern search systems with strong retrieval effectiveness, but their real-world use is limited by the high cost of exhaustive token-level retrieval. Therefore, most systems adopt a \emph{gather-and-refine} strategy, where a lightweight gather phase selects candidates for full scoring. However, this approach requires expensive searches over large token-level indexes and often misses the documents that would rank highest under full similarity. In this paper, we reproduce several state-of-the-art multivector retrieval methods on two publicly available datasets, providing a clear picture of the current multivector retrieval field and observing the inefficiency of token-level gathering. Building on top of that, we show that replacing the token-level gather phase with a single-vector document retriever -- specifically, a learned sparse retriever (LSR) -- produces a smaller and more semantically coherent candidate set. This recasts the gather-and-refine pipeline into the well-established two-stage retrieval architecture. As retrieval latency decreases, query encoding with two neural encoders becomes the dominant computational bottleneck. To mitigate this, we integrate recent inference-free LSR methods, demonstrating that they preserve the retrieval effectiveness of the dual-encoder pipeline while substantially reducing query encoding time. Finally, we investigate multiple reranking configurations that balance efficiency, memory, and effectiveness, and we introduce two optimization techniques that prune low-quality candidates early. Empirical results show that these techniques improve retrieval efficiency by up to 1.8$\times$ with no loss in quality. Overall, our two-stage approach achieves over $24\times$ speedup over the state-of-the-art multivector retrieval systems, while maintaining comparable or superior retrieval quality.
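The two-stage architecture described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the first stage scores documents by a sparse dot product over term weights, and that the "full similarity" used for reranking is the MaxSim late-interaction score common in multivector retrieval. All function names, parameters, and the toy data are hypothetical, and the candidate-pruning optimizations mentioned in the abstract are not modelled.

```python
import heapq


def sparse_score(query, doc):
    """Dot product of two sparse vectors stored as {term: weight} dicts."""
    if len(query) > len(doc):
        query, doc = doc, query  # iterate over the smaller dict
    return sum(w * doc.get(t, 0.0) for t, w in query.items())


def maxsim(query_vecs, doc_vecs):
    """Assumed late-interaction score: for each query token vector, take the
    best dot product against any document token vector, then sum."""
    return sum(
        max(sum(q * d for q, d in zip(qv, dv)) for dv in doc_vecs)
        for qv in query_vecs
    )


def two_stage_search(sparse_query, multi_query, sparse_docs, multi_docs,
                     n_candidates, k):
    # Stage 1: cheap document-level retrieval picks a small candidate set.
    candidates = heapq.nlargest(
        n_candidates,
        sparse_docs,
        key=lambda doc_id: sparse_score(sparse_query, sparse_docs[doc_id]),
    )
    # Stage 2: rerank only those candidates with the multivector score.
    reranked = sorted(
        candidates,
        key=lambda doc_id: maxsim(multi_query, multi_docs[doc_id]),
        reverse=True,
    )
    return reranked[:k]


# Toy collection (hypothetical): sparse term weights and 2-d token vectors.
sparse_docs = {
    "d1": {"neural": 1.2, "search": 0.8},
    "d2": {"neural": 0.4, "cooking": 1.5},
    "d3": {"neural": 0.8, "search": 1.5},
}
multi_docs = {
    "d1": [[1.0, 0.0], [0.0, 1.0]],
    "d2": [[0.5, 0.5]],
    "d3": [[0.9, 0.1], [0.2, 0.3]],
}
sparse_query = {"neural": 1.0, "search": 1.0}
multi_query = [[1.0, 0.0], [0.0, 1.0]]

result = two_stage_search(sparse_query, multi_query, sparse_docs, multi_docs,
                          n_candidates=2, k=2)
print(result)  # ['d1', 'd3']
```

In this toy run the first stage ranks d3 above d1, and the multivector reranker reverses that order. The expensive token-level scoring is applied to only two of the three documents.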

Comments:	17 pages, 2 figures, ECIR 2026
Subjects:	Information Retrieval (cs.IR)
Cite as:	arXiv:2601.05200 [cs.IR]
	(or arXiv:2601.05200v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2601.05200

Submission history

From: Silvio Martinico
[v1] Thu, 8 Jan 2026 18:22:18 UTC (69 KB)
